| # Generalized Types |
| |
| This is a proposal for adding generics to Go, written by Ian Lance |
| Taylor in March, 2011. |
| This proposal will not be adopted. |
| It is being presented as an example for what a complete generics |
| proposal must cover. |
| |
| |
| ## Introduction |
| |
| This document describes a possible implementation of generalized types |
| in Go. |
| We introduce a new keyword, `gen`, which declares one or more type |
| parameters: types that are not known at compile time. |
| These type parameters may then be used in other declarations, |
| producing generalized types and functions. |
| |
| Some goals, borrowed from [Garcia et al](https://web.archive.org/web/20170812055356/http://www.crest.iu.edu/publications/prints/2003/comparing_generic_programming03.pdf): |
| |
| * Do not require an explicit relationship between a definition of a generalized function and its use. The function should be callable with any type that fits the required form. |
| * Permit interfaces to express relationships between types of methods, as in a comparison function that takes two parameters of the same unknown type. |
| * Given a generalized type, make it possible to use related types, such as a slice of that type. |
| * Do not require explicit instantiation of generalized functions. |
| * Permit type aliasing of generalized types. |
| |
| The type parameter introduced by a `gen` declaration is a concept that |
| exists at compile time. |
| Any actual value that exists at runtime has a specific concrete type: |
| an ordinary non-generalized type, or a generalized type that has been |
| instantiated as a concrete type. |
| Generalized functions will be compiled to handle values whose types |
| are supplied at runtime. |
| |
| This is what changes in the language: |
| |
| * There is a new syntax for declaring a type parameter (or parameters) for the scope of one or more declarations. |
| * There is a new syntax for specifying the concrete type(s) to use when using something declared with a type parameter. |
| * There is a new syntax for converting values of concrete type, and untyped constants, to generalized types. Also values of generalized type are permitted in type assertions. |
| * Within a function, we define the operations permitted on values with a generalized type. |
| |
| ## Syntax |
| |
| Any package-scope type or function declaration may be preceded with |
| the new keyword `gen` followed by a list of type parameter names in |
| square brackets: |
| |
| ``` |
| gen [T] type Vector []T |
| ``` |
| |
| This defines `T` as a type parameter for the generalized type `Vector`. |
| The scope of `Vector` is the same as it would be if `gen` did not appear. |
| |
| A use of a generalized type will normally provide specific types to |
| use for the type parameters. |
| This is done using square brackets following the generalized type. |
| |
| ``` |
| type VectorInt Vector[int] |
| var v1 Vector[int] |
| var v2 Vector[float32] |
| gen [T1, T2] type Pair struct { first T1; second T2 } |
| var v3 Pair[int, string] |
| ``` |
| |
| Type parameters may also be used with functions. |
| |
| ``` |
| gen [T] func SetHead(v Vector[T], e T) T { |
| v[0] = e |
| return e |
| } |
| ``` |
| |
| For convenience, we permit a modified version of the factoring syntax |
| used with `var`, `type`, and `const` to permit a series of |
| declarations to share the same type parameters. |
| |
| ``` |
| gen [T1, T2] ( |
| type Pair struct { first T1; second T2 } |
| |
| func MakePair(first T1, second T2) Pair { |
| return &Pair{first, second} |
| } |
| ) |
| ``` |
| |
| References to other names declared within the same gen block do not |
| have to specify the type parameters. |
| When the type parameters are omitted, they are assumed to simply be |
| the parameters declared for the block. |
| In the above example, `Pair` when used as the result type of |
| `MakePair` is equivalent to `Pair[T1, T2]`. |
| |
| As with generalized types, we must specify the types when we refer to |
| a generalized function (but see the section on type deduction, below). |
| |
| ``` |
| var MakeIntPair = MakePair[int, int] |
| var IntPairZero = MakeIntPair(0, 0) |
| ``` |
| |
| A generalized type can have methods. |
| |
| |
| ``` |
| gen [T] func (v *Vector[T]) SetHeadMethod(e T) T { |
| v[0] = e |
| return e |
| } |
| ``` |
| |
| Of course a method of a generalized type may itself be a generalized function. |
| |
| ``` |
| gen [T, T2] func (v *Vector[T]) Transform(f func(T) T2) Vector[T2] |
| ``` |
| |
| The `gen` keyword may only be used with a type or function. |
| It may only appear in package scope, not within a function. |
| One `gen` keyword may appear within the scope of another. |
| In that case, any use of the generalized type or function must specify |
| all the type parameters, starting with the outermost ones. |
| A different way of writing the last example would be: |
| |
| ``` |
| gen [T] ( |
| type Vector []T |
| gen [T2] func (v *Vector[T]) Transform(f func(T) T2) Vector[T2] |
| ) |
| |
| var v Vector[int] |
| var v2 = v.Transform[int, string](f) |
| ``` |
| |
| Type deduction, described below, would permit omitting the |
| `[int, string]` in the last line, based on the types of `v` and `f`. |
| |
| ### A note on syntax |
| |
| While the use of the `gen` keyword fits reasonably well into the |
| existing Go language, the use of square brackets to denote the |
| specific types is new to Go. |
| We have considered a number of different approaches: |
| |
| * Use angle brackets, as in `Pair<int, string>`. This has the advantage of being familiar to C++ and Java programmers. Unfortunately, it means that `f<T>(true)` can be parsed as either a call to function `f<T>` or a comparison of `f<T` (an expression that tests whether `f` is less than `T`) with `(true)`. While it may be possible to construct complex resolution rules, the Go syntax avoids that sort of ambiguity for good reason. |
| * Overload the dot operator again, as in `Vector.int` or `Pair.(int, string)`. This becomes confusing when we see `Vector.(int)`, which could be a type assertion. |
| * We considered using dot but putting the type first, as in `int.Vector` or `(int, string).Pair`. It might be possible to make that work without ambiguity, but putting the types first seems to make the code harder to read. |
| * An earlier version of this proposal used parentheses for names after types, as in `Vector(int)`. However, that proposal was flawed because there was no way to specify types for generalized functions, and extending the parentheses syntax led to `MakePair(int, string)(1, "")` which seems less than ideal. |
| * We considered various different characters, such as backslash, dollar sign, at-sign or sharp. The square brackets grouped the parameters nicely and provide an acceptable visual appearance. |
| |
| ## Type Deduction |
| |
| When calling a function, as opposed to referring to it without calling |
| it, the type parameters may be omitted in some cases. |
| A function call may omit the type parameters when every type parameter |
| is used for a regular parameter, or, in other words, there are no type |
| parameters that are used only for results. |
| In that case the compiler will compare the actual type of the |
| argument (`A`) with the type of the generalized parameter (`P`), examining |
| the arguments from left to right. |
| `A` and `P` must be identical. |
| The first time we see a type parameter in `P`, it will be set to the |
| appropriate portion of `A`. |
| If the type parameter appears again, it must be identical to the |
| actual type at that point. |
| |
| Note that at compile time the argument type may itself be a |
| generalized type. |
| The type deduction algorithm is the same. |
| A type parameter of `P` may match a type parameter of `A`. |
| Once this match is made, then every subsequent instance of the `P` type |
| parameter must match the same `A` type parameter. |
| |
| When doing type deduction with an untyped numeric constant, the |
| constant is given the type `int`, `float64`, or `complex128` as usual. |
| Type deduction does not support passing an untyped `nil` constant; |
| `nil` may only be used with an explicit type conversion (or, of course, |
| the type parameters may be written explicitly). |
| |
| For example, these two variables will have the same type and value. |
| |
| ``` |
| var v1 = MakePair[int, int](0, 0) |
| var v2 = MakePair(0, 0) // [int, int] deduced. |
| ``` |
| |
| ## Constraints |
| |
| The only things that a generalized function can do with a value of |
| generalized type are the operations inherent to the type—e.g., the |
| `Vector` type can be indexed or sliced. |
| But sometimes we want to be able to say something about the types that |
| are used as part of a larger type. |
| Specifically, we want to say that they must implement a particular |
| interface. |
| So when listing the identifiers following `gen` we permit an optional |
| interface type following the name. |
| |
| ``` |
| gen [T Stringer] type PrintableVector []T |
| ``` |
| |
| Now `PrintableVector` may only be used with types that implement the |
| `Stringer` interface. |
| |
| The interface may itself be a generalized type. |
| The scope of each type parameter starts at the `[`, and so we permit |
| using the type identifier just named with the generalized interface |
| type. |
| |
| ``` |
| // Types that may be compared with some other type. |
| gen [T] type Comparer interface { |
| Compare(T) int // <0, ==0, >0 |
| } |
| |
| // Vector of elements that may be compared with themselves. |
| gen [T Comparer[T]] type SortableVector []T |
| ``` |
| |
| ## Example |
| |
| ``` |
| package hashmap |
| |
| gen [Keytype, Valtype] ( |
| |
| type bucket struct { |
| next *bucket |
| key Keytype |
| val Valtype |
| } |
| |
| type Hashfn func(Keytype) uint |
| type Eqfn func(Keytype, Keytype) bool |
| |
| type Hashmap struct { |
| hashfn Hashfn |
| eqfn Eqfn |
| buckets []bucket |
| entries int |
| } |
| |
| // This function must be called with explicit type parameters, as |
| // there is no way to deduce the value type. For example, |
| // h := hashmap.New[int, string](hashfn, eqfn) |
| func New(hashfn Hashfn, eqfn Eqfn) *Hashmap { |
| return &Hashmap{hashfn, eqfn, make([]buckets, 16), 0} |
| } |
| |
| func (p *Hashmap) Lookup(key Keytype) (val Valtype, found bool) { |
| h := p.hashfn(key) % len(p.buckets) |
| for b := p.buckets[h]; b != nil; b = b.next { |
| if p.eqfn(key, b.key) { |
| return b.val, true |
| } |
| } |
| return |
| } |
| |
| func (p *Hashmap) Insert(key Keytype, val Valtype) (inserted bool) { |
| // Implementation omitted. |
| } |
| |
| ) // Ends gen. |
| |
| package sample |
| |
| import ( |
| “fmt” |
| “hashmap” |
| “os” |
| ) |
| |
| func hashint(i int) uint { |
| return uint(i) |
| } |
| |
| func eqint(i, j int) bool { |
| return i == j |
| } |
| |
| var v = hashmap.New[int, string](hashint, eqint) |
| |
| func Add(id int, name string) { |
| if !v.Insert(id, name) { |
| fmt.Println(“duplicate id”, id) |
| os.Exit(1) |
| } |
| } |
| |
| func Find(id int) string { |
| val, found = v.Lookup(id) |
| if !found { |
| fmt.Println(“missing id”, id) |
| os.Exit(1) |
| } |
| } |
| ``` |
| |
| ## Language spec changes |
| |
| This is an outline of the changes required to the language spec. |
| |
| ### Types |
| |
| A few paragraphs will be added to discuss generalized types. |
| |
| ### Struct types |
| |
| While a struct may use a type parameter as an anonymous field, within |
| generalized code only the generalized definition is considered when |
| resolving field references. |
| That is, given |
| |
| ``` |
| gen [T] type MyGenStruct struct { T } |
| type MyRealStruct { i int } |
| type MyInstStruct MyGenStruct[MyRealStruct] |
| gen [T] func GetI(p *MyGenStruct[T]) int { |
| return p.i // INVALID |
| } |
| func MyGetI(p *MyInstStruct) int { |
| return GetI(p) |
| } |
| ``` |
| |
| the function `GetI` may not refer to the field `i` even though the |
| field exists when called from `MyGetI`. |
| (This restriction is fairly obvious if you think about it, but is |
| explicitly stated for clarity.) |
| |
| ### Type Identity |
| |
| We define type identity for generalized types. |
| Two generalized types are identical if they have the same name and the |
| type parameters are identical. |
| |
| ### Assignability |
| |
| We define assignability for generalized types. |
| A value `x` of generalized type `T1` is assignable to a variable of |
| type `T2` if `T1` and `T2` are identical. |
| A value `x` of concrete type is never assignable to a variable of |
| generalized type: a generalized type coercion is required (see below). |
| Similarly, a value `x` of generalized type is never assignable to a |
| variable of concrete type: a type assertion is required. |
| For example (more details given below): |
| |
| ``` |
| gen [T] func Zero() (z T) { |
| z = 0 // INVALID: concrete to generalized. |
| z = int(0) // INVALID: concrete to generalized. |
| z = 0.[T] // Valid: generalized type coercion. |
| } |
| gen [T] func ToInt(v T) (r int) { |
| r = v // INVALID: generalized to concrete |
| r = int(v) // INVALID: no conversions for gen types |
| r, ok := v.(int) // Valid: generalized type assertion. |
| if !ok { |
| panic(“not int”) |
| } |
| } |
| ``` |
| |
| ### Declarations and scope |
| |
| A new section Generalized declarations is added, consisting of a few |
| paragraphs that describe generalized declarations and the gen syntax. |
| |
| ### Indexes |
| |
| The new syntax `x[T]` for a generalized type or function is defined, |
| where `T` is a type and `x` is the name of some type or function |
| declared within a `gen` scope. |
| |
| ### Type assertions |
| |
| We define type assertions using generalized types. |
| |
| Given `x.(T)` where `x` is a value with generalized type and `T` is a |
| concrete type, the type assertion succeeds if the concrete type of `x` |
| is identical to `T`, or, if `T` is an interface type, the concrete |
| type implements the interface `T`. |
| In other words, pretty much the same as doing a type assertion of a |
| value of interface type. |
| |
| If `x` and `T` are both generalized types, we do the same test using |
| the concrete types of `x` and `T`. |
| |
| In general these assertions must be checked at runtime. |
| |
| ### Generalized type coercions |
| |
| We introduce a new syntax for coercing a value of concrete type to a |
| generalized type. |
| Where `x` is a value with concrete type and `T` is a generalized type, |
| the expression `x.[T]` coerces `x` to the generalized type `T`. |
| The generalized type coercion may succeed or fail, just as with a type |
| assertion. |
| However, it is not a pure type assertion, as we permit `x` to be an |
| untyped constant. |
| The generalized type coercion succeeds if the concrete type matches |
| the generalized type, where any parameters of the generalized type |
| match the appropriate portion of the concrete type. |
| If the same parameter appears more than once in the generalized type, |
| it must match identical types in the concrete type. |
| If the value is an untyped constant, the coercion succeeds if an |
| assignment of that constant to the concrete type would succeed at |
| compile time. |
| |
| ### Calls |
| |
| This section is extended to describe the type deduction algorithm used |
| to avoid explicit type parameters when possible. |
| |
| An implicit generalized type conversion is applied to convert the |
| arguments to the expected generalized type, even though normally |
| values of concrete type are not assignable to variables of generalized |
| type. |
| Type checking ensures that the arguments must be assignable to the |
| concrete type which is either specified or deduced, and so this |
| implicit generalized type conversion will always succeed. |
| |
| When a result parameter has a generalized type, an implicit type |
| assertion is applied to convert back to the type that the caller |
| expects, which may be a concrete type. |
| The type expected by the caller is determined by the type parameters |
| passed to the function, whether determined via type deduction or not. |
| This implicit type assertion will always succeed. |
| For example, in |
| |
| ``` |
| gen [T] func Identity(v T) T { return v } |
| func Call(i int) { j := Identity(i) } |
| ``` |
| |
| the variable `j` gets the type `int`, and an implicit type assertion |
| converts the return value of `Identity[int]` to `int`. |
| |
| ### Conversions |
| |
| Nothing needs to change in this section. |
| I just want to note explicitly that there are no type conversions for |
| generalized types other than the standard conversions that apply to |
| all types. |
| |
| ### Type switches |
| |
| A type switch may be used on a value of generalized type. |
| Type switch cases may include generalized types. |
| The rules are the same as for type assertions. |
| |
| ### For statements |
| |
| A range clause may be used with a value of generalized type, if the |
| generalized type is known to be a slice, array, map or channel. |
| |
| ## Implementation |
| |
| Any actual value in Go will have a concrete type. |
| The implementation issue that arises is how to compile a function that |
| has parameters with generalized type. |
| |
| ### Representation |
| |
| When calling a function that uses type parameters, the type parameters |
| are passed first, as pointers to a runtime type descriptor. |
| The type parameters are thus literally additional parameters to the |
| functions. |
| |
| ### Types |
| |
| In some cases it will be necessary to create a new type at runtime, |
| which means creating a new runtime type descriptor. |
| It will be necessary to ensure that type descriptor comparisons |
| continue to work correctly. |
| For example, the hashmap example above will require creating a new |
| type for each call to `hashmap.New` for the concrete types that are used |
| in the call. |
| The reflect package already creates new runtime type descriptors in |
| the functions `PtrTo`, `ChanOf`, `FuncOf`, etc. |
| |
| Type reflection on a generalized type will return the appropriate |
| runtime type descriptor, which may have been newly created. |
| Calling `Name()` on such a type descriptor will return a name with the |
| appropriate type parameters: e.g, `“Vector[int]”`. |
| |
| ### Variable declarations |
| |
| A local variable in a function may be declared with a generalized |
| type. |
| In the general case, the size of the variable is unknown, and must be |
| retrieved from the type descriptor. |
| Declaring a local variable of unknown size will dynamically allocate |
| zeroed memory of the appropriate size. |
| As an optimization the memory may be allocated on the stack when there |
| is sufficient room. |
| |
| ### Composite literals |
| |
| A generalized type that is defined to be a struct, array, slice, or |
| map type may be used to create a composite literal. |
| The expression has the same generalized type. |
| The elements of the composite literal must follow the assignability |
| rules. |
| |
| ### Selectors |
| |
| When `x` is a value of generalized type that is a struct, `x.f` can |
| refer to a field of that struct. |
| Whether `f` is a field of `x` is known at compile time. |
| The exact offset of the field in the struct value may not be known. |
| When it is not known, the field offset is retrieved from the type |
| descriptor at runtime. |
| |
| Similarly, `x.f` may refer to a method of the type. |
| In this case the method is always known at compile time. |
| |
| As noted above under struct types, if a generalized struct type uses a |
| type parameter as an anonymous field, the compiler does not attempt to |
| look up a field name in the concrete type of the field at runtime. |
| |
| ### Indexes |
| |
| A value of a generalized type that is an array, slice or map may be indexed. |
| Note that when indexing into a map type, the type of the value must be |
| assignable to the map’s key type; |
| in practice this means that if the map’s key type is generalized, the |
| value must itself have the same generalized type. |
| Indexing into a generalized array or slice may require multiplying by |
| the element size found in the type descriptor. |
| Indexing into a generalized map may require a new runtime function. |
| |
| ### Slices |
| |
| A value of a generalized type that is an array or slice may itself be |
| sliced. |
| This operation is essentially the same as a slice of a value of |
| concrete type. |
| |
| ### Type Assertions |
| |
| A type assertion generally requires a runtime check, and in the |
| general case requires comparing two concrete types at runtime, where |
| one of the types is known to instantiate some generalized type. |
| The complexity of the runtime check is linear in the number of tokens |
| in the generalized type, and requires storage space to store type |
| parameters during the check. |
| This check could be inlined into the code, or it could use a general |
| purpose runtime check that compares the concrete type descriptor to a |
| similar representation of the generalized type. |
| |
| ### Calls |
| |
| Function calls can require converting normal values to generalized |
| values. |
| This operation depends on the representation chosen for the |
| generalized value. |
| In the worst case it will be similar to passing a normal value to a |
| function that takes an interface type. |
| When calling a function with type parameters, the type parameters will |
| be passed first, as a pointer to a runtime type descriptor. |
| |
| Function calls can also require converting generalized return values |
| to normal values. |
| This is done via an implicitly inserted type assertion. |
| Depending on the representation, this may not require any actual code |
| to be generated. |
| |
| ### Communication operators |
| |
| We have to implement sending and receiving generalized values for |
| channels of generalized type. |
| |
| ### Assignments |
| |
| We have to implement assignment of generalized values. |
| This will be based on the runtime type descriptor. |
| |
| ### Type switches |
| |
| We have to implement type switches using generalized types. |
| This will mostly likely devolve into a series of if statements using |
| type assertions. |
| |
| ### For statements |
| |
| We have to implement for statements with range clauses over |
| generalized types. |
| This is similar to the indexing and communication operators. |
| |
| ### Select statements |
| |
| We have to implement select on channels of generalized type. |
| |
| ### Return statements |
| |
| We have to implement returning a value of generalized type. |
| |
| ### Specialization of functions |
| |
| This proposal is intended to support compiling a generalized function |
| into code that operates on generalized values. |
| In fact, it requires that this work. |
| |
| ``` |
| package p1 |
| gen [T] func Call(f func (T) T, T) T { |
| return f(T) |
| } |
| |
| package p2 |
| func SliceIdentity(a []int) []int { |
| return a |
| } |
| |
| package p3 |
| var v = p1.Call(p2.SliceIdentity, make([]int, 10)) |
| ``` |
| |
| Here `Call` has to support calling a generalized function. |
| There is no straightforward specialization process that can implement |
| this case. |
| (It could be done if the full source code of p1 and p2 are available either when compiling p3 or at link time; |
| that is how C++ does it, but it is not an approach that fits well with Go.) |
| |
| However, for many cases, this proposal can be implemented using |
| function specialization. |
| Whenever the compiler can use type deduction for a function call, and |
| the types are known concrete types, and the body of the function is |
| available, the compiler can generate a version of the function |
| specialized for those types. |
| This is, therefore, an optional optimization, in effect a form of |
| cross-package inlining, which costs compilation time but improves |
| runtime. |
| |
| ## Methods on builtin types |
| |
| This is an optional addendum to the proposal described above. |
| |
| The proposal does not provide a convenient way to write a function |
| that works on any numeric type. |
| For example, there is no convenient way to write this: |
| |
| ``` |
| gen [T] func SliceAverage(a []T) T { |
| s := T(0) |
| for _, v = range a { |
| s += v |
| } |
| return s / len(a) |
| } |
| ``` |
| |
| It would be nice if that function worked for any numeric function. |
| However, it is not permitted under the proposal described above, |
| because of the use of `+=` and `/`. |
| These operators are not available for every type and therefore are not |
| available for a generalized type. |
| |
| This approach does work: |
| |
| ``` |
| gen [T] type Number interface { |
| Plus(T) T |
| Divide(T) T |
| } |
| |
| gen [T Number[T]] func SliceAverage(a []T) T { |
| s := 0.[T] |
| for _, v = range a { |
| s = s.Plus(v) |
| } |
| return s.Divide(len(a)) |
| } |
| ``` |
| |
| However, this requires writing explicit `Plus` and `Divide` methods for |
| each type you want to use. |
| These methods are themselves boilerplate: |
| |
| ``` |
| func (i MyNum) Plus(v MyNum) MyNum { return i + v } |
| func (i MyNum) Divide(v MyNum) MyNum { return i / v } |
| ``` |
| |
| This proposal does not help with this kind of boilerplate function, |
| because there is no way to use operators with generalized values. |
| |
| There are a few ways to solve this. |
| One way that seems to fit well with Go as extended by this proposal is |
| to declare that for all types that support some language operator, the |
| type has a corresponding method. |
| That is, we say that if the type can be used with `+`, the language |
| defines a method `Plus` (or `Binary+` or whatever) for the type that |
| implements the operation. |
| This method can then be picked up by an interface such as the above, |
| and the standard library can define convenient aggregate interfaces, |
| such as an interface listing all the methods supported by an integer |
| type. |
| |
| Note that it would not help for the standard library to define a |
| `Plus` method for every integer type, as those methods would not carry |
| over to user defined types. |
| |
| ## Operator methods |
| |
| It is of course a smallish step from those language-defined methods to |
| having operator methods, which would permit writing generalized code |
| using operators rather than method calls. For the purposes of using |
| generalized types, however, this is less important than having |
| language defined methods for operators. |
| |
| ## Summary |
| |
| This proposal will not be adopted. |
| It has significant flaws. |
| |
| The factored `gen` syntax is convenient but looks awkward on the page. |
| You wind up with a trailing close parenthesis after a set of function |
| definitions. |
| Indenting all the function definitions looks silly. |
| |
| This proposal doesn't let me write a trivial generalized `Max` |
| function, unless we include operator methods. |
| Even when we include operator methods, `Max` has to be written in |
| terms of a `Less` method. |
| |
| The handling of untyped constants in generalized functions is |
| extremely awkward. |
| They must always use a generalized type coercion. |
| |
| While this proposal is more or less palatable for data structures, |
| it is much weaker for functions. |
| You basically can't do anything with a generalized type, |
| except assign it and call a method on it. |
| Writing standardized algorithms will require developing a whole |
| vocabulary of quasi-standard methods. |
| |
| The proposal doesn't help write functions that work on either `[]byte` |
| or `string`, unless those types get additional operator methods like |
| `Index` and `Len`. |
| Even operator methods don't help with using `range`. |