Generalized Types

This is a proposal for adding generics to Go, written by Ian Lance Taylor in March, 2011. This proposal will not be adopted. It is being presented as an example for what a complete generics proposal must cover.

Introduction

This document describes a possible implementation of generalized types in Go. We introduce a new keyword, gen, which declares one or more type parameters: types that are not known at compile time. These type parameters may then be used in other declarations, producing generalized types and functions.

Some goals, borrowed from Garcia et al:

  • Do not require an explicit relationship between a definition of a generalized function and its use. The function should be callable with any type that fits the required form.
  • Permit interfaces to express relationships between types of methods, as in a comparison function that takes two parameters of the same unknown type.
  • Given a generalized type, make it possible to use related types, such as a slice of that type.
  • Do not require explicit instantiation of generalized functions.
  • Permit type aliasing of generalized types.

The type parameter introduced by a gen declaration is a concept that exists at compile time. Any actual value that exists at runtime has a specific concrete type: an ordinary non-generalized type, or a generalized type that has been instantiated as a concrete type. Generalized functions will be compiled to handle values whose types are supplied at runtime.

This is what changes in the language:

  • There is a new syntax for declaring a type parameter (or parameters) for the scope of one or more declarations.
  • There is a new syntax for specifying the concrete type(s) to use when using something declared with a type parameter.
  • There is a new syntax for converting values of concrete type, and untyped constants, to generalized types. Also values of generalized type are permitted in type assertions.
  • Within a function, we define the operations permitted on values with a generalized type.

Syntax

Any package-scope type or function declaration may be preceded with the new keyword gen followed by a list of type parameter names in square brackets:

gen [T] type Vector []T

This defines T as a type parameter for the generalized type Vector. The scope of Vector is the same as it would be if gen did not appear.

A use of a generalized type will normally provide specific types to use for the type parameters. This is done using square brackets following the generalized type.

type VectorInt Vector[int]
var v1 Vector[int]
var v2 Vector[float32]
gen [T1, T2] type Pair struct { first T1; second T2 }
var v3 Pair[int, string]

Type parameters may also be used with functions.

gen [T] func SetHead(v Vector[T], e T) T {
	v[0] = e
	return e
}

For convenience, we permit a modified version of the factoring syntax used with var, type, and const to permit a series of declarations to share the same type parameters.

gen [T1, T2] (
type Pair struct { first T1; second T2 }

func MakePair(first T1, second T2) Pair {
	return &Pair{first, second}
}
)

References to other names declared within the same gen block do not have to specify the type parameters. When the type parameters are omitted, they are assumed to simply be the parameters declared for the block. In the above example, Pair when used as the result type of MakePair is equivalent to Pair[T1, T2].

As with generalized types, we must specify the types when we refer to a generalized function (but see the section on type deduction, below).

var MakeIntPair = MakePair[int, int]
var IntPairZero = MakeIntPair(0, 0)

A generalized type can have methods.

gen [T] func (v *Vector[T]) SetHeadMethod(e T) T {
	v[0] = e
	return e
}

Of course a method of a generalized type may itself be a generalized function.

gen [T, T2] func (v *Vector[T]) Transform(f func(T) T2) Vector[T2]

The gen keyword may only be used with a type or function. It may only appear in package scope, not within a function. One gen keyword may appear within the scope of another. In that case, any use of the generalized type or function must specify all the type parameters, starting with the outermost ones. A different way of writing the last example would be:

gen [T] (
type Vector []T
gen [T2] func (v *Vector[T]) Transform(f func(T) T2) Vector[T2]
)

var v Vector[int]
var v2 = v.Transform[int, string](f)

Type deduction, described below, would permit omitting the [int, string] in the last line, based on the types of v and f.

A note on syntax

While the use of the gen keyword fits reasonably well into the existing Go language, the use of square brackets to denote the specific types is new to Go. We have considered a number of different approaches:

  • Use angle brackets, as in Pair<int, string>. This has the advantage of being familiar to C++ and Java programmers. Unfortunately, it means that f<T>(true) can be parsed as either a call to function f<T> or a comparison of f<T (an expression that tests whether f is less than T) with (true). While it may be possible to construct complex resolution rules, the Go syntax avoids that sort of ambiguity for good reason.
  • Overload the dot operator again, as in Vector.int or Pair.(int, string). This becomes confusing when we see Vector.(int), which could be a type assertion.
  • We considered using dot but putting the type first, as in int.Vector or (int, string).Pair. It might be possible to make that work without ambiguity, but putting the types first seems to make the code harder to read.
  • An earlier version of this proposal used parentheses for names after types, as in Vector(int). However, that proposal was flawed because there was no way to specify types for generalized functions, and extending the parentheses syntax led to MakePair(int, string)(1, "") which seems less than ideal.
  • We considered various different characters, such as backslash, dollar sign, at-sign or sharp. The square brackets grouped the parameters nicely and provide an acceptable visual appearance.

Type Deduction

When calling a function, as opposed to referring to it without calling it, the type parameters may be omitted in some cases. A function call may omit the type parameters when every type parameter is used for a regular parameter, or, in other words, there are no type parameters that are used only for results. In that case the compiler will compare the actual type of the argument (A) with the type of the generalized parameter (P), examining the arguments from left to right. A and P must be identical. The first time we see a type parameter in P, it will be set to the appropriate portion of A. If the type parameter appears again, it must be identical to the actual type at that point.

Note that at compile time the argument type may itself be a generalized type. The type deduction algorithm is the same. A type parameter of P may match a type parameter of A. Once this match is made, then every subsequent instance of the P type parameter must match the same A type parameter.

When doing type deduction with an untyped numeric constant, the constant is given the type int, float64, or complex128 as usual. Type deduction does not support passing an untyped nil constant; nil may only be used with an explicit type conversion (or, of course, the type parameters may be written explicitly).

For example, these two variables will have the same type and value.

var v1 = MakePair[int, int](0, 0)
var v2 = MakePair(0, 0)			// [int, int] deduced.

Constraints

The only things that a generalized function can do with a value of generalized type are the operations inherent to the type—e.g., the Vector type can be indexed or sliced. But sometimes we want to be able to say something about the types that are used as part of a larger type. Specifically, we want to say that they must implement a particular interface. So when listing the identifiers following gen we permit an optional interface type following the name.

gen [T Stringer] type PrintableVector []T

Now PrintableVector may only be used with types that implement the Stringer interface.

The interface may itself be a generalized type. The scope of each type parameter starts at the [, and so we permit using the type identifier just named with the generalized interface type.

// Types that may be compared with some other type.
gen [T] type Comparer interface {
	Compare(T) int	// <0, ==0, >0
}

// Vector of elements that may be compared with themselves.
gen [T Comparer[T]] type SortableVector []T

Example

package hashmap

gen [Keytype, Valtype] (

type bucket struct {
	next *bucket
	key Keytype
	val Valtype
}

type Hashfn func(Keytype) uint
type Eqfn func(Keytype, Keytype) bool

type Hashmap struct {
	hashfn Hashfn
	eqfn Eqfn
	buckets []bucket
	entries int
}

// This function must be called with explicit type parameters, as
// there is no way to deduce the value type. For example,
// h := hashmap.New[int, string](hashfn, eqfn)
func New(hashfn Hashfn, eqfn Eqfn) *Hashmap {
	return &Hashmap{hashfn, eqfn, make([]buckets, 16), 0}
}

func (p *Hashmap) Lookup(key Keytype) (val Valtype, found bool) {
	h := p.hashfn(key) % len(p.buckets)
	for b := p.buckets[h]; b != nil; b = b.next {
		if p.eqfn(key, b.key) {
			return b.val, true
		}
	}
	return
}

func (p *Hashmap) Insert(key Keytype, val Valtype) (inserted bool) {
	// Implementation omitted.
}

) // Ends gen.

package sample

import (
“fmt”
“hashmap”
“os”
)

func hashint(i int) uint {
	return uint(i)
}

func eqint(i, j int) bool {
	return i == j
}

var v = hashmap.New[int, string](hashint, eqint)

func Add(id int, name string) {
	if !v.Insert(id, name) {
		fmt.Println(“duplicate id”, id)
		os.Exit(1)
	}
}

func Find(id int) string {
	val, found = v.Lookup(id)
	if !found {
		fmt.Println(“missing id”, id)
		os.Exit(1)
	}
}

Language spec changes

This is an outline of the changes required to the language spec.

Types

A few paragraphs will be added to discuss generalized types.

Struct types

While a struct may use a type parameter as an anonymous field, within generalized code only the generalized definition is considered when resolving field references. That is, given

gen [T] type MyGenStruct struct { T }
type MyRealStruct { i int }
type MyInstStruct MyGenStruct[MyRealStruct]
gen [T] func GetI(p *MyGenStruct[T]) int {
	return p.i	// INVALID
}
func MyGetI(p *MyInstStruct) int {
	return GetI(p)
}

the function GetI may not refer to the field i even though the field exists when called from MyGetI. (This restriction is fairly obvious if you think about it, but is explicitly stated for clarity.)

Type Identity

We define type identity for generalized types. Two generalized types are identical if they have the same name and the type parameters are identical.

Assignability

We define assignability for generalized types. A value x of generalized type T1 is assignable to a variable of type T2 if T1 and T2 are identical. A value x of concrete type is never assignable to a variable of generalized type: a generalized type coercion is required (see below). Similarly, a value x of generalized type is never assignable to a variable of concrete type: a type assertion is required. For example (more details given below):

gen [T] func Zero() (z T) {
	z = 0			 // INVALID: concrete to generalized.
	z = int(0)		 // INVALID: concrete to generalized.
	z = 0.[T]		 // Valid: generalized type coercion.
}
gen [T] func ToInt(v T) (r int) {
	r = v			 // INVALID: generalized to concrete
	r = int(v)		 // INVALID: no conversions for gen types
	r, ok := v.(int) // Valid: generalized type assertion.
	if !ok {
		panic(“not int”)
	}
}

Declarations and scope

A new section Generalized declarations is added, consisting of a few paragraphs that describe generalized declarations and the gen syntax.

Indexes

The new syntax x[T] for a generalized type or function is defined, where T is a type and x is the name of some type or function declared within a gen scope.

Type assertions

We define type assertions using generalized types.

Given x.(T) where x is a value with generalized type and T is a concrete type, the type assertion succeeds if the concrete type of x is identical to T, or, if T is an interface type, the concrete type implements the interface T. In other words, pretty much the same as doing a type assertion of a value of interface type.

If x and T are both generalized types, we do the same test using the concrete types of x and T.

In general these assertions must be checked at runtime.

Generalized type coercions

We introduce a new syntax for coercing a value of concrete type to a generalized type. Where x is a value with concrete type and T is a generalized type, the expression x.[T] coerces x to the generalized type T. The generalized type coercion may succeed or fail, just as with a type assertion. However, it is not a pure type assertion, as we permit x to be an untyped constant. The generalized type coercion succeeds if the concrete type matches the generalized type, where any parameters of the generalized type match the appropriate portion of the concrete type. If the same parameter appears more than once in the generalized type, it must match identical types in the concrete type. If the value is an untyped constant, the coercion succeeds if an assignment of that constant to the concrete type would succeed at compile time.

Calls

This section is extended to describe the type deduction algorithm used to avoid explicit type parameters when possible.

An implicit generalized type conversion is applied to convert the arguments to the expected generalized type, even though normally values of concrete type are not assignable to variables of generalized type. Type checking ensures that the arguments must be assignable to the concrete type which is either specified or deduced, and so this implicit generalized type conversion will always succeed.

When a result parameter has a generalized type, an implicit type assertion is applied to convert back to the type that the caller expects, which may be a concrete type. The type expected by the caller is determined by the type parameters passed to the function, whether determined via type deduction or not. This implicit type assertion will always succeed. For example, in

gen [T] func Identity(v T) T { return v }
func Call(i int) { j := Identity(i) }

the variable j gets the type int, and an implicit type assertion converts the return value of Identity[int] to int.

Conversions

Nothing needs to change in this section. I just want to note explicitly that there are no type conversions for generalized types other than the standard conversions that apply to all types.

Type switches

A type switch may be used on a value of generalized type. Type switch cases may include generalized types. The rules are the same as for type assertions.

For statements

A range clause may be used with a value of generalized type, if the generalized type is known to be a slice, array, map or channel.

Implementation

Any actual value in Go will have a concrete type. The implementation issue that arises is how to compile a function that has parameters with generalized type.

Representation

When calling a function that uses type parameters, the type parameters are passed first, as pointers to a runtime type descriptor. The type parameters are thus literally additional parameters to the functions.

Types

In some cases it will be necessary to create a new type at runtime, which means creating a new runtime type descriptor. It will be necessary to ensure that type descriptor comparisons continue to work correctly. For example, the hashmap example above will require creating a new type for each call to hashmap.New for the concrete types that are used in the call. The reflect package already creates new runtime type descriptors in the functions PtrTo, ChanOf, FuncOf, etc.

Type reflection on a generalized type will return the appropriate runtime type descriptor, which may have been newly created. Calling Name() on such a type descriptor will return a name with the appropriate type parameters: e.g, “Vector[int]”.

Variable declarations

A local variable in a function may be declared with a generalized type. In the general case, the size of the variable is unknown, and must be retrieved from the type descriptor. Declaring a local variable of unknown size will dynamically allocate zeroed memory of the appropriate size. As an optimization the memory may be allocated on the stack when there is sufficient room.

Composite literals

A generalized type that is defined to be a struct, array, slice, or map type may be used to create a composite literal. The expression has the same generalized type. The elements of the composite literal must follow the assignability rules.

Selectors

When x is a value of generalized type that is a struct, x.f can refer to a field of that struct. Whether f is a field of x is known at compile time. The exact offset of the field in the struct value may not be known. When it is not known, the field offset is retrieved from the type descriptor at runtime.

Similarly, x.f may refer to a method of the type. In this case the method is always known at compile time.

As noted above under struct types, if a generalized struct type uses a type parameter as an anonymous field, the compiler does not attempt to look up a field name in the concrete type of the field at runtime.

Indexes

A value of a generalized type that is an array, slice or map may be indexed. Note that when indexing into a map type, the type of the value must be assignable to the map’s key type; in practice this means that if the map’s key type is generalized, the value must itself have the same generalized type. Indexing into a generalized array or slice may require multiplying by the element size found in the type descriptor. Indexing into a generalized map may require a new runtime function.

Slices

A value of a generalized type that is an array or slice may itself be sliced. This operation is essentially the same as a slice of a value of concrete type.

Type Assertions

A type assertion generally requires a runtime check, and in the general case requires comparing two concrete types at runtime, where one of the types is known to instantiate some generalized type. The complexity of the runtime check is linear in the number of tokens in the generalized type, and requires storage space to store type parameters during the check. This check could be inlined into the code, or it could use a general purpose runtime check that compares the concrete type descriptor to a similar representation of the generalized type.

Calls

Function calls can require converting normal values to generalized values. This operation depends on the representation chosen for the generalized value. In the worst case it will be similar to passing a normal value to a function that takes an interface type. When calling a function with type parameters, the type parameters will be passed first, as a pointer to a runtime type descriptor.

Function calls can also require converting generalized return values to normal values. This is done via an implicitly inserted type assertion. Depending on the representation, this may not require any actual code to be generated.

Communication operators

We have to implement sending and receiving generalized values for channels of generalized type.

Assignments

We have to implement assignment of generalized values. This will be based on the runtime type descriptor.

Type switches

We have to implement type switches using generalized types. This will mostly likely devolve into a series of if statements using type assertions.

For statements

We have to implement for statements with range clauses over generalized types. This is similar to the indexing and communication operators.

Select statements

We have to implement select on channels of generalized type.

Return statements

We have to implement returning a value of generalized type.

Specialization of functions

This proposal is intended to support compiling a generalized function into code that operates on generalized values. In fact, it requires that this work.

package p1
gen [T] func Call(f func (T) T, T) T {
	return f(T)
}

package p2
func SliceIdentity(a []int) []int {
	return a
}

package p3
var v = p1.Call(p2.SliceIdentity, make([]int, 10))

Here Call has to support calling a generalized function. There is no straightforward specialization process that can implement this case. (It could be done if the full source code of p1 and p2 are available either when compiling p3 or at link time; that is how C++ does it, but it is not an approach that fits well with Go.)

However, for many cases, this proposal can be implemented using function specialization. Whenever the compiler can use type deduction for a function call, and the types are known concrete types, and the body of the function is available, the compiler can generate a version of the function specialized for those types. This is, therefore, an optional optimization, in effect a form of cross-package inlining, which costs compilation time but improves runtime.

Methods on builtin types

This is an optional addendum to the proposal described above.

The proposal does not provide a convenient way to write a function that works on any numeric type. For example, there is no convenient way to write this:

gen [T] func SliceAverage(a []T) T {
	s := T(0)
	for _, v = range a {
		s += v
	}
	return s / len(a)
}

It would be nice if that function worked for any numeric function. However, it is not permitted under the proposal described above, because of the use of += and /. These operators are not available for every type and therefore are not available for a generalized type.

This approach does work:

gen [T] type Number interface {
	Plus(T) T
	Divide(T) T
}

gen [T Number[T]] func SliceAverage(a []T) T {
	s := 0.[T]
	for _, v = range a {
		s = s.Plus(v)
	}
	return s.Divide(len(a))
}

However, this requires writing explicit Plus and Divide methods for each type you want to use. These methods are themselves boilerplate:

func (i MyNum) Plus(v MyNum) MyNum { return i + v }
func (i MyNum) Divide(v MyNum) MyNum { return i / v }

This proposal does not help with this kind of boilerplate function, because there is no way to use operators with generalized values.

There are a few ways to solve this. One way that seems to fit well with Go as extended by this proposal is to declare that for all types that support some language operator, the type has a corresponding method. That is, we say that if the type can be used with +, the language defines a method Plus (or Binary+ or whatever) for the type that implements the operation. This method can then be picked up by an interface such as the above, and the standard library can define convenient aggregate interfaces, such as an interface listing all the methods supported by an integer type.

Note that it would not help for the standard library to define a Plus method for every integer type, as those methods would not carry over to user defined types.

Operator methods

It is of course a smallish step from those language-defined methods to having operator methods, which would permit writing generalized code using operators rather than method calls. For the purposes of using generalized types, however, this is less important than having language defined methods for operators.

Summary

This proposal will not be adopted. It has significant flaws.

The factored gen syntax is convenient but looks awkward on the page. You wind up with a trailing close parenthesis after a set of function definitions. Indenting all the function definitions looks silly.

This proposal doesn't let me write a trivial generalized Max function, unless we include operator methods. Even when we include operator methods, Max has to be written in terms of a Less method.

The handling of untyped constants in generalized functions is extremely awkward. They must always use a generalized type coercion.

While this proposal is more or less palatable for data structures, it is much weaker for functions. You basically can't do anything with a generalized type, except assign it and call a method on it. Writing standardized algorithms will require developing a whole vocabulary of quasi-standard methods.

The proposal doesn‘t help write functions that work on either []byte or string, unless those types get additional operator methods like Index and Len. Even operator methods don’t help with using range.