| Why Generics? |
| 31 Jul 2019 |
| Tags: go2, proposals, generics |
| |
| Ian Lance Taylor |
| |
| * Introduction |
| |
| This is the blog post version of my talk last week at Gophercon 2019. |
| |
| .iframe //www.youtube.com/embed/WzgLqE-3IhY?rel=0 309 549 |
| |
| This article is about what it would mean to add generics to Go, and |
| why I think we should do it. |
| I'll also touch on an update to a possible design for |
| adding generics to Go. |
| |
| Go was released on November 10, 2009. |
| Less than 24 hours later we saw the |
| [[https://groups.google.com/d/msg/golang-nuts/70-pdwUUrbI/onMsQspcljcJ][first comment about generics]]. |
| (That comment also mentions exceptions, which we added to the |
| language, in the form of `panic` and `recover`, in early 2010.) |
| |
| In three years of Go surveys, lack of generics has always been listed |
| as one of the top three problems to fix in the language. |
| |
| * Why generics? |
| |
| But what does it mean to add generics, and why would we want it? |
| |
| To paraphrase |
| [[https://www.dagstuhl.de/en/program/calendar/semhp/?semnr=98171][Jazayeri, et al]]: |
| generic programming enables the representation of functions and data |
| structures in a generic form, with types factored out. |
| |
| What does that mean? |
| |
| For a simple example, let's assume we want to reverse the elements in |
| a slice. |
| It's not something that many programs need to do, but it's |
| not all that unusual. |
| |
| Let's say it's a slice of int. |
| |
| func ReverseInts(s []int) { |
| first := 0 |
| last := len(s) |
| for first < last { |
| s[first], s[last] = s[last], s[first] |
| first++ |
| last-- |
| } |
| } |
| |
| Pretty simple, but even for a simple function like that you'd want to |
| write a few test cases. |
| In fact, when I did, I found a bug. |
| I'm sure many readers have spotted it already. |
| |
| func ReverseInts(s []int) { |
| first := 0 |
| last := len(s) - 1 |
| for first < last { |
| s[first], s[last] = s[last], s[first] |
| first++ |
| last-- |
| } |
| } |
| |
| We need to subtract 1 when we set the variable last. |
| |
| Now let's reverse a slice of string. |
| |
| func ReverseStrings(s []string) { |
| first := 0 |
| last := len(s) - 1 |
| for first < last { |
| s[first], s[last] = s[last], s[first] |
| first++ |
| last-- |
| } |
| } |
| |
| If you compare `ReverseInts` and `ReverseStrings`, you'll see that the |
| two functions are exactly the same, except for the type of the parameter. |
| I don't think any reader is surprised by that. |
| |
| What some people new to Go find surprising is that there is no way to |
| write a simple `Reverse` function that works for a slice of any type. |
| |
| Most other languages do let you write that kind of function. |
| |
| In a dynamically typed language like Python or JavaScript you can |
| simply write the function, without bothering to specify the element |
| type. This doesn't work in Go because Go is statically typed, and |
| requires you to write down the exact type of the slice and the type of |
| the slice elements. |
| |
| Most other statically typed languages, like C++ or Java or Rust or |
| Swift, support generics to address exactly this kind of issue. |
| |
| * Go generic programming today |
| |
| So how do people write this kind of code in Go? |
| |
| In Go you can write a single function that works for different slice |
| types by using an interface type, and defining a method on the slice |
| types you want to pass in. |
| That is how the standard library's `sort.Sort` function works. |
| |
| In other words, interface types in Go are a form of generic |
| programming. |
| They let us capture the common aspects of different types and express |
| them as methods. |
| We can then write functions that use those interface types, and those |
| functions will work for any type that implements those methods. |
| |
| But this approach falls short of what we want. |
| With interfaces you have to write the methods yourself. |
| It's awkward to have to define a named type with a couple of methods |
| just to reverse a slice. |
| And the methods you write are exactly the same for each slice type, so |
| in a sense we've just moved and condensed the duplicate code, we |
| haven't eliminated it. |
| Although interfaces are a form of generics, they don’t give us |
| everything we want from generics. |
| |
| A different way of using interfaces for generics, which could get around |
| the need to write the methods yourself, would be to have the language |
| define methods for some kinds of types. |
| That isn't something the language supports today, but, for example, |
| the language could define that every slice type has an Index method |
| that returns an element. |
| But in order to use that method in practice it would have to return an |
| empty interface type, and then we lose all the benefits of static |
| typing. |
| More subtly, there would be no way to define a generic function that |
| takes two different slices with the same element type, or that takes a |
| map of one element type and returns a slice of the same element type. |
| Go is a statically typed language because that makes it easier to |
| write large programs; we don’t want to lose the benefits of static |
| typing in order to gain the benefits of generics. |
| |
| Another approach would be to write a generic `Reverse` function using |
| the reflect package, but that is so awkward to write and slow to run |
| that few people do that. |
| That approach also requires explicit type assertions and has no static |
| type checking. |
| |
| Or, you could write a code generator that takes a type and generates a |
| `Reverse` function for slices of that type. |
| There are several code generators out there that do just that. |
| But this adds another step to every package that needs `Reverse`, |
| it complicates the build because all the different copies have to be |
| compiled, and fixing a bug in the master source requires re-generating |
| all the instances, some of which may be in different projects |
| entirely. |
| |
| All these approaches are awkward enough that I think most |
| people who have to reverse a slice in Go just write the function for |
| the specific slice type that they need. |
| Then they'll need to write test cases for the function, to make sure |
| they didn't make a simple mistake like the one I made initially. |
| And they'll need to run those tests routinely. |
| |
| However we do it, it means a lot of extra work just for a function that |
| looks exactly the same except for the element type. |
| It's not that it can't be done. |
| It clearly can be done, and Go programmers are doing it. |
| It's just that there ought to be a better way. |
| |
| For a statically typed language like Go, that better way is generics. |
| What I wrote earlier is that generic programming enables the |
| representation of functions and data structures in a generic form, |
| with types factored out. |
| That's exactly what we want here. |
| |
| * What generics can bring to Go |
| |
| The first and most important thing we want from generics in Go is to |
| be able to write functions like `Reverse` without caring about the |
| element type of the slice. |
| We want to factor out that element type. |
| Then we can write the function once, write the tests once, put them in |
| a go-gettable package, and call them whenever we want. |
| |
| Even better, since this is an open source world, someone else can |
| write `Reverse` once, and we can use their implementation. |
| |
| At this point I should say that “generics” can mean a lot of different |
| things. |
| In this article, what I mean by “generics” is what I just described. |
| In particular, I don’t mean templates as found in the C++ language, |
| which support quite a bit more than what I’ve written here. |
| |
| I went through `Reverse` in detail, but there are many other functions |
| that we could write generically, such as: |
| |
| - Find smallest/largest element in slice |
| - Find average/standard deviation of slice |
| - Compute union/intersection of maps |
| - Find shortest path in node/edge graph |
| - Apply transformation function to slice/map, returning new slice/map |
| |
| These examples are available in most other languages. |
| In fact, I wrote this list by glancing at the C++ standard template |
| library. |
| |
| There are also examples that are specific to Go with its strong |
| support for concurrency. |
| |
| - Read from a channel with a timeout |
| - Combine two channels into a single channel |
| - Call a list of functions in parallel, returning a slice of results |
| - Call a list of functions, using a Context, return the result of the first function to finish, canceling and cleaning up extra goroutines |
| |
| I've seen all of these functions written out many times with different |
| types. |
| It's not hard to write them in Go. |
| But it would be nice to be able to reuse an efficient and debugged |
| implementation that works for any value type. |
| |
| To be clear, these are just examples. |
| There are many more general purpose functions that could be written |
| more easily and safely using generics. |
| |
| Also, as I wrote earlier, it's not just functions. |
| It's also data structures. |
| |
| Go has two general purpose generic data structures built into the |
| language: slices and maps. |
| Slices and maps can hold values of any data type, with static type |
| checking for values stored and retrieved. |
| The values are stored as themselves, not as interface types. |
| That is, when I have a `[]int`, the slice holds ints directly, not |
| ints converted to an interface type. |
| |
| Slices and maps are the most useful generic data structures, but they |
| aren’t the only ones. |
| Here are some other examples. |
| |
| - Sets |
| - Self-balancing trees, with efficient insertion and traversal in sorted order |
| - Multimaps, with multiple instances of a key |
| - Concurrent hash maps, supporting parallel insertions and lookups with no single lock |
| |
| If we can write generic types, we can define new data structures, like |
| these, that have the same type-checking advantages as slices and maps: |
| the compiler can statically type-check the types of the values that |
| they hold, and the values can be stored as themselves, not as |
| interface types. |
| |
| It should also be possible to take algorithms like the ones mentioned |
| earlier and apply them to generic data structures. |
| |
| These examples should all be just like `Reverse`: generic functions |
| and data structures written once, in a package, and reused whenever |
| they are needed. |
| They should work like slices and maps, in that they shouldn't store |
| values of empty interface type, but should store specific types, and |
| those types should be checked at compile time. |
| |
| So that's what Go can gain from generics. |
| Generics can give us powerful building blocks that let us share code |
| and build programs more easily. |
| |
| I hope I’ve explained why this is worth looking into. |
| |
| * Benefits and costs |
| |
| But generics don't come from the |
| [[https://mainlynorfolk.info/folk/songs/bigrockcandymountain.html][Big Rock Candy Mountain]], |
| the land where the sun shines every day over the |
| [[http://www.lat-long.com/Latitude-Longitude-773297-Montana-Lemonade_Springs.html][lemonade springs]]. |
| Every language change has a cost. |
| There's no doubt that adding generics to Go will make the language |
| more complicated. |
| As with any change to the language, we need to talk about maximizing |
| the benefit and minimizing the cost. |
| |
| In Go, we’ve aimed to reduce complexity through independent, orthogonal |
| language features that can be combined freely. |
| We reduce complexity by making the individual features simple, and we |
| maximize the benefit of the features by permitting their free |
| combination. |
| We want to do the same with generics. |
| |
| To make this more concrete I’m going to list a few guidelines we |
| should follow. |
| |
| *** Minimize new concepts |
| |
| We should add as few new concepts to the language as possible. |
| That means a minimum of new syntax and a minimum of new keywords and |
| other names. |
| |
| *** Complexity falls on the writer of generic code, not the user |
| |
| As much as possible the complexity should fall on the programmer |
| writing the generic package. |
| We don't want the user of the package to have to worry about generics. |
| This means that it should be possible to call generic functions in a |
| natural way, and it means that any errors in using a generic package |
| should be reported in a way that is easy to understand and to fix. |
| It should also be easy to debug calls into generic code. |
| |
| *** Writer and user can work independently |
| |
| Similarly, we should make it easy to separate the concerns of the |
| writer of the generic code and its user, so that they can develop their |
| code independently. |
| They shouldn't have to worry about what the other is doing, any more |
| than the writer and caller of a normal function in different packages |
| have to worry. |
| This sounds obvious, but it's not true of generics in every other |
| programming language. |
| |
| *** Short build times, fast execution times |
| |
| Naturally, as much as possible, we want to keep the short build times |
| and fast execution time that Go gives us today. |
| Generics tend to introduce a tradeoff between fast builds and fast |
| execution. |
| As much as possible, we want both. |
| |
| *** Preserve clarity and simplicity of Go |
| |
| Most importantly, Go today is a simple language. |
| Go programs are usually clear and easy to understand. |
| A major part of our long process of exploring this space has been |
| trying to understand how to add generics while preserving that clarity |
| and simplicity. |
| We need to find mechanisms that fit well into the existing language, |
| without turning it into something quite different. |
| |
| These guidelines should apply to any generics implementation in Go. |
| That’s the most important message I want to leave you with today: |
| *generics*can*bring*a*significant*benefit*to*the*language,*but*they*are*only*worth*doing*if*Go*still*feels*like*Go*. |
| |
| * Draft design |
| |
| Fortunately, I think it can be done. |
| To finish up this article I’m going to shift from discussing why we |
| want generics, and what the requirements on them are, to briefly |
| discuss a design for how we think we can add them to the language. |
| |
| At this year's Gophercon Robert Griesemer and I published |
| [[https://github.com/golang/proposal/blob/master/design/go2draft-contracts.md][a design draft]] |
| for adding generics to Go. |
| See the draft for full details. |
| I'll go over some of the main points here. |
| |
| Here is the generic Reverse function in this design. |
| |
| func Reverse (type Element) (s []Element) { |
| first := 0 |
| last := len(s) - 1 |
| for first < last { |
| s[first], s[last] = s[last], s[first] |
| first++ |
| last-- |
| } |
| } |
| |
| You'll notice that the body of the function is exactly the same. |
| Only the signature has changed. |
| |
| The element type of the slice has been factored out. |
| It's now named `Element` and has become what we call a |
| _type_parameter_. |
| Instead of being part of the type of the slice parameter, it's now a |
| separate, additional, type parameter. |
| |
| To call a function with a type parameter, in the general case you pass |
| a type argument, which is like any other argument except that it's a |
| type. |
| |
| func ReverseAndPrint(s []int) { |
| Reverse(int)(s) |
| fmt.Println(s) |
| } |
| |
| That is the `(int)` seen after `Reverse` in this example. |
| |
| Fortunately, in most cases, including this one, the compiler can |
| deduce the type argument from the types of the regular arguments, and |
| you don't need to mention the type argument at all. |
| |
| Calling a generic function just looks like calling any other function. |
| |
| func ReverseAndPrint(s []int) { |
| Reverse(s) |
| fmt.Println(s) |
| } |
| |
| In other words, although the generic `Reverse` function is slightly |
| more complex than `ReverseInts` and `ReverseStrings`, that complexity |
| falls on the writer of the function, not the caller. |
| |
| ** Contracts |
| |
| Since Go is a statically typed language, we have to talk about the |
| type of a type parameter. |
| This _meta-type_ tells the compiler what sorts of type arguments are |
| permitted when calling a generic function, and what sorts of |
| operations the generic function can do with values of the type |
| parameter. |
| |
| The `Reverse` function can work with slices of any type. |
| The only thing it does with values of type `Element` is assignment, |
| which works with any type in Go. |
| For this kind of generic function, which is a very common case, we |
| don't need to say anything special about the type parameter. |
| |
| Let's take a quick look at a different function. |
| |
| func IndexByte (type T Sequence) (s T, b byte) int { |
| for i := 0; i < len(s); i++ { |
| if s[i] == b { |
| return i |
| } |
| } |
| return -1 |
| } |
| |
| Currently both the bytes package and the strings package in the |
| standard library have an `IndexByte` function. |
| This function returns the index of `b` in the sequence `s`, where `s` |
| is either a `string` or a `[]byte`. |
| We could use this single generic function to replace the two functions |
| in the bytes and strings packages. |
| In practice we may not bother doing that, but this is a useful simple |
| example. |
| |
| Here we need to know that the type parameter `T` acts like a `string` |
| or a `[]byte`. |
| We can call `len` on it, and we can index to it, and we can compare |
| the result of the index operation to a byte value. |
| |
| To let this compile, the type parameter `T` itself needs a type. |
| It's a meta-type, but because we sometimes need to describe multiple |
| related types, and because it describes a relationship between the |
| implementation of the generic function and its callers, we actually |
| call the type of `T` a contract. |
| Here the contract is named `Sequence`. |
| It appears after the list of type parameters. |
| |
| This is how the Sequence contract is defined for this example. |
| |
| contract Sequence(T) { |
| T string, []byte |
| } |
| |
| It's pretty simple, since this is a simple example: the type parameter |
| `T` can be either `string` or `[]byte`. |
| Here `contract` may be a new keyword, or a special identifier |
| recognized in package scope; see the design draft for details. |
| |
| Anybody who remembers [[https://github.com/golang/proposal/blob/4a530dae40977758e47b78fae349d8e5f86a6c0a/design/go2draft-contracts.md][the design we presented at Gophercon 2018]] |
| will see that this way of writing a contract is a lot simpler. |
| We got a lot of feedback on that earlier design that contracts were |
| too complicated, and we've tried to take that into account. |
| The new contracts are much simpler to write, and to read, and to |
| understand. |
| |
| They let you specify the underlying type of a type parameter, and/or |
| list the methods of a type parameter. |
| They also let you describe the relationship between different type |
| parameters. |
| |
| ** Contracts with methods |
| |
| Here is another simple example, of a function that uses the String |
| method to return a `[]string` of the string representation of all the |
| elements in `s`. |
| |
| func ToStrings (type E Stringer) (s []E) []string { |
| r := make([]string, len(s)) |
| for i, v := range s { |
| r[i] = v.String() |
| } |
| return r |
| } |
| |
| It's pretty straightforward: walk through the slice, call the `String` |
| method on each element, and return a slice of the resulting strings. |
| |
| This function requires that the element type implement the `String` |
| method. |
| The Stringer contract ensures that. |
| |
| contract Stringer(T) { |
| T String() string |
| } |
| |
| The contract simply says that `T` has to implement the `String` |
| method. |
| |
| You may notice that this contract looks like the `fmt.Stringer` |
| interface, so it's worth pointing out that the argument of the |
| `ToStrings` function is not a slice of `fmt.Stringer`. |
| It's a slice of some element type, where the element type implements |
| `fmt.Stringer`. |
| The memory representation of a slice of the element type and a slice |
| of `fmt`.Stringer are normally different, and Go does not support |
| direct conversions between them. |
| So this is worth writing, even though `fmt.Stringer` exists. |
| |
| ** Contracts with multiple types |
| |
| Here is an example of a contract with multiple type parameters. |
| |
| type Graph (type Node, Edge G) struct { ... } |
| |
| contract G(Node, Edge) { |
| Node Edges() []Edge |
| Edge Nodes() (from Node, to Node) |
| } |
| |
| func New (type Node, Edge G) (nodes []Node) *Graph(Node, Edge) { |
| ... |
| } |
| |
| func (g *Graph(Node, Edge)) ShortestPath(from, to Node) []Edge { |
| ... |
| } |
| |
| Here we're describing a graph, built from nodes and edges. |
| We're not requiring a particular data structure for the graph. |
| Instead, we're saying that the `Node` type has to have an `Edges` |
| method that returns the list of edges that connect to the `Node`. |
| And the `Edge` type has to have a `Nodes` method that returns the two |
| `Nodes` that the `Edge` connects. |
| |
| I've skipped the implementation, but this shows the signature of a |
| `New` function that returns a `Graph`, and the signature of a |
| `ShortestPath` method on `Graph`. |
| |
| The important takeaway here is that a contract isn't just about a |
| single type. It can describe the relationships between two or more |
| types. |
| |
| ** Ordered types |
| |
| One surprisingly common complaint about Go is that it doesn't have a |
| `Min` function. |
| Or, for that matter, a `Max` function. |
| That's because a useful `Min` function should work for any ordered |
| type, which means that it has to be generic. |
| |
| While `Min` is pretty trivial to write yourself, any useful generics |
| implementation should let us add it to the standard library. |
| This is what it looks like with our design. |
| |
| func Min (type T Ordered) (a, b T) T { |
| if a < b { |
| return a |
| } |
| return b |
| } |
| |
| The `Ordered` contract says that the type T has to be an ordered type, |
| which means that it supports operators like less than, greater than, |
| and so forth. |
| |
| contract Ordered(T) { |
| T int, int8, int16, int32, int64, |
| uint, uint8, uint16, uint32, uint64, uintptr, |
| float32, float64, |
| string |
| } |
| |
| The `Ordered` contract is just a list of all the ordered types that |
| are defined by the language. |
| This contract accepts any of the listed types, or any named type whose |
| underlying type is one of those types. |
| Basically, any type you can use with the less than operator. |
| |
| It turns out that it's much easier to simply enumerate the types that |
| support the less than operator than it is to invent a new notation |
| that works for all operators. |
| After all, in Go, only built-in types support operators. |
| |
| This same approach can be used for any operator, or more generally |
| to write a contract for any generic function intended to work with |
| builtin types. |
| It lets the writer of the generic function specify clearly the set of |
| types the function is expected to be used with. |
| It lets the caller of the generic function clearly see whether the |
| function is applicable for the types being used. |
| |
| In practice this contract would probably go into the standard library. |
| and so really the Min function (which will probably also be in the |
| standard library somewhere) will look like this. Here we're just |
| referring to the contract Ordered defined in the package contracts. |
| |
| func Min (type T contracts.Ordered) (a, b T) T { |
| if a < b { |
| return a |
| } |
| return b |
| } |
| |
| ** Generic data structures |
| |
| Finally, let's look at a simple generic data structure, a binary |
| tree. In this example the tree has a comparison function, so there |
| are no requirements on the element type. |
| |
| type Tree (type E) struct { |
| root *node(E) |
| compare func(E, E) int |
| } |
| |
| type node (type E) struct { |
| val E |
| left, right *node(E) |
| } |
| |
| Here is how to create a new binary tree. |
| The comparison function is passed to the `New` function. |
| |
| func New (type E) (cmp func(E, E) int) *Tree(E) { |
| return &Tree(E){compare: cmp} |
| } |
| |
| An unexported method returns a pointer either to the slot holding v, |
| or to the location in the tree where it should go. |
| |
| func (t *Tree(E)) find(v E) **node(E) { |
| pn := &t.root |
| for *pn != nil { |
| switch cmp := t.compare(v, (*pn).val); { |
| case cmp < 0: |
| pn = &(*pn).left |
| case cmp > 0: |
| pn = &(*pn).right |
| default: |
| return pn |
| } |
| } |
| return pn |
| } |
| |
| |
| The details here don't really matter, especially since I haven't |
| tested this code. |
| I'm just trying to show what it looks like to write a simple generic |
| data structure. |
| |
| This is the code for testing whether the tree contains a value. |
| |
| func (t *Tree(E)) Contains(v E) bool { |
| return *t.find(e) != nil |
| } |
| |
| This is the code for inserting a new value. |
| |
| func (t *Tree(E)) Insert(v E) bool { |
| pn := t.find(v) |
| if *pn != nil { |
| return false |
| } |
| *pn = &node(E){val: v} |
| return true |
| } |
| |
| Notice the type argument `E` to the type `node`. |
| This is what it looks like to write a generic data structure. |
| As you can see, it looks like writing ordinary Go code, except that |
| some type arguments are sprinkled in here and there. |
| |
| Using the tree is pretty simple. |
| |
| var intTree = tree.New(func(a, b int) int { return a - b }) |
| |
| func InsertAndCheck(v int) { |
| intTree.Insert(v) |
| if !intTree.Contains(v) { |
| log.Fatalf("%d not found after insertion", v) |
| } |
| } |
| |
| That's as it should be. |
| It's a bit harder to write a generic data structure, because you often |
| have to explicitly write out type arguments for supporting types, but |
| as much as possible using one is no different from using an ordinary |
| non-generic data structure. |
| |
| ** Next steps |
| |
| We are working on actual implementations to allow us to experiment |
| with this design. |
| It's important to be able to try out the design in practice, to make |
| sure that we can write the kinds of programs we want to write. |
| It hasn't gone as fast as we'd hoped, but we'll send out more detail |
| on these implementations as they become available. |
| |
| Robert Griesemer has written a |
| [[https://golang.org/cl/187317][preliminary CL]] |
| that modifies the go/types package. |
| This permits testing whether code using generics and contracts can |
| type check. |
| It’s incomplete right now, but it mostly works for a single package, |
| and we’ll keep working on it. |
| |
| What we'd like people to do with this and future implementations is to |
| try writing and using generic code and see what happens. |
| We want to make sure that people can write the code they need, and |
| that they can use it as expected. |
| Of course not everything is going to work at first, and as we explore |
| this space we may have to change things. |
| And, to be clear, we're much more interested in feedback on the |
| semantics than on details of the syntax. |
| |
| I’d like to thank everyone who commented on the earlier design, and |
| everyone who has discussed what generics can look like in Go. |
| We’ve read all of the comments, and we greatly appreciate the work |
| that people have put into this. |
| We would not be where we are today without that work. |
| |
| Our goal is to arrive at a design that makes it possible to write the |
| kinds of generic code I’ve discussed today, without making the |
| language too complex to use or making it not feel like Go anymore. |
| We hope that this design is a step toward that goal, and we expect to |
| continue to adjust it as we learn, from our experiences and yours, |
| what works and what doesn’t. |
| If we do reach that goal, then we’ll have something that we can |
| propose for future versions of Go. |