design: add proposal for parameterized generic array sizes

Detailed proposal for golang/go#44253.

Change-Id: I9e37363a4d5c41de100def1222d50ff6ad09d078
GitHub-Last-Rev: 8e3bc882aca345c889bc9d810b63cc40129de7cd
GitHub-Pull-Request: golang/proposal#33
Reviewed-on: https://go-review.googlesource.com/c/proposal/+/301290
Reviewed-by: Ian Lance Taylor <iant@golang.org>
diff --git a/design/44253-generic-array-sizes.md b/design/44253-generic-array-sizes.md
new file mode 100644
index 0000000..9ea4e25
--- /dev/null
+++ b/design/44253-generic-array-sizes.md
@@ -0,0 +1,258 @@
+# Proposal: Generic parameterization of array sizes
+
+Author(s): Andrew Werner
+
+Last updated: March 16th, 2021
+
+## Abstract
+
+With the type parameters generics proposal has been accepted, though not yet
+fully specified or implemented, we can begin to talk about extension. [That
+proposal][type parameters] lists the following omission:
+
+> No parameterization on non-type values such as constants. This arises most
+obviously for arrays, where it might sometimes be convenient to write type
+`Matrix[n int] [n][n]float64`. It might also sometimes be useful to specify
+significant values for a container type, such as a default value for elements.
+
+This proposal seeks to resolve this limitation by (a) specifying when `len` can
+be used as a compile-time constant and (b) adding syntax to specify constraints
+for all arrays of a given type in type lists.
+
+## Background
+
+An important property of the generics proposal is that it enables the creation
+of libraries of specialized container data structures. The existence of such
+libraries will help developers write more efficient code as these data
+structures will be able to allocate fewer object and provide greater access
+locality. [This Google blog post][block based data structures] about block-based
+C++ data drives home the point.
+
+The justification is laid out in the omission of the type parameter proposal.
+The motivation that I've stumbled upon is in trying to implement a B-Tree
+and allowing the client to dictate the degree of the node.
+
+One initial idea would be to allow the client to provide the actual array
+which will be backing the data inside the node as a type parameter. This might
+actually be okay in some data structure user cases but in a B-Tree it's bad
+because we still would like to instantiate an array for the pointers and that
+array needs to have a size that is a function of the data array.
+
+The proposal here seeks to make it possible for clients to provide default
+values for array sizes of generic data structures in a way that is minimally
+invasive to the concepts which go already has. The shorthand comment stated
+in the Omission of the Type Parameter Proposal waves its hand at what feels
+like a number of new and complex concepts for the language.
+
+## Proposal
+
+This proposals attempts to side-step questions of how one might provide a
+scalar value in a type constraint by not ever providing a scalar directly.
+This proposal recognizes that constants can be used to specify array lengths.
+It also notes that the value of `len()` can be computed as a compile-time
+constant in some cases. Lastly, it observes that type lists could be extended
+minimally to indicate a constraint that a type is an array of a given type
+without constraining the length of the array.
+
+### The vanilla generic B-Tree
+
+Let's explore the example of a generic B-Tree with a fixed-size buffer. Find
+such an example [here][vanilla btree].
+
+```go
+// These constants are the wart.
+const (
+	degree   = 16
+	maxItems = 2*degree - 1
+	minItems = degree - 1
+)
+
+func NewBTree[K, V any](cmp LessFn[K]) OrderedMap[K, V] {
+	return &btree[K, V]{cmp: cmp}
+}
+
+type btree[K, V any] struct {
+	cmp  LessFn[K]
+	root *node[K, V]
+}
+
+// ...
+
+type node[K, V any] struct {
+	count    int16
+	leaf     bool
+	keys     [maxItems]K
+	vals     [maxItems]V
+	children [maxItems + 1]*node[K, V]
+}
+```
+
+### Parameterized nodes
+
+Then we allow parameterization on the node type within the btree implementation
+so that different node concrete types with different memory layouts may be
+used. Find an example of this generalization
+[here][parameterized node btree].
+
+```go
+type nodeI[K, V, N any] interface {
+	type *N
+	find(K, LessFn[K]) (idx int, found bool)
+	insert(K, V, LessFn[K]) (replaced bool)
+	remove(K, LessFn[K]) (K, V, bool)
+	len() int16
+	at(idx int16) (K, V)
+	child(idx int16) *N
+	isLeaf() bool
+}
+
+func NewBTree[K, V any](cmp LessFn[K]) OrderedMap[K, V] {
+	type N = node[K, V]
+	return &btree[K, V, N, *N]{
+		cmp: cmp,
+		newNode: func(isLeaf bool) *N {
+			return &N{leaf: isLeaf}
+		},
+	}
+}
+
+type btree[K, V, N any, NP nodeI[K, V, N]] struct {
+	len     int
+	cmp     LessFn[K]
+	root    NP
+	newNode func(isLeaf bool) NP
+}
+
+type node[K, V any] struct {
+	count    int16
+	leaf     bool
+	keys     [maxItems]K
+	vals     [maxItems]V
+	children [maxItems + 1]*node[K, V]
+}
+```
+
+This still ends up using constants and there's no really easy
+way around that. You might want to parameterize the arrays into the node like
+in [this example][bad parameterization btree]. This still
+doesn't tell a story about how to relate the children array to the items.
+
+### The proposal to parameterize the arrays
+
+Instead, we'd like to find a way to express the idea that there's a size
+constant which can be used in the type definitions. The proposal would
+result in an implementation that looked like
+[this][proposal btree].
+
+```go
+
+// StructArr is a constraint that says that a type is an array of empty
+// structs of any length.
+type StructArr interface {
+	type [...]struct{}
+}
+
+type btree[K, V, N any, NP nodeI[K, V, N]] struct {
+	len     int
+	cmp     LessFn[K]
+	root    NP
+	newNode func(isLeaf bool) NP
+}
+
+// NewBTree constructs a generic BTree-backed map with degree 16.
+func NewBTree[K, V any](cmp LessFn[K]) OrderedMap[K, V] {
+	const defaultDegree = 16
+	return NewBTreeWithDegree[K, V, [defaultDegree]struct{}](cmp)
+}
+
+// NewBTreeWithDegree constructs a generic BTree-backed map with degree equal
+// to the length of the array used as type parameter A.
+func NewBTreeWithDegree[K, V any, A StructArr](cmp LessFn[K]) OrderedMap[K, V] {
+	type N = node[K, V, A]
+	return &btree[K, V, N, *N]{
+		cmp: cmp,
+		newNode: func(isLeaf bool) *N {
+			return &N{leaf: isLeaf}
+		},
+	}
+}
+
+type node[K, V any, A StructArr] struct {
+	count    int16
+	leaf     bool
+	keys     [2*len(A) - 1]K
+	values   [2*len(A) - 1]V
+	children [2 * len(A)]*node[K, V, A]
+}
+```
+### The Matrix example
+
+The example of the omission in type parameter proposal could be achieved in
+the following way:
+
+```go
+type Dim interface {
+    type [...]struct{}
+}
+
+type SquareFloatMatrix2D[D Dim] [len(D)][len(D)]float64
+```
+
+### Summary
+
+1) Support type list constraints to express that a type is an array
+
+
+```go
+// Array expresses a constraint that a type is an array of T of any
+// size.
+type Array[T any] interface {
+    type [...]T
+}
+```
+
+2)  Support a compile-time constant expression for `len([...]T)`
+
+This handy syntax would permit parameterization of arrays relative to other
+array types. Note that the constant expression `len` function on array types
+could actually be implemented today using `unsafe.Sizeof` by a parameterization
+over an array whose members have non-zero size. For example `len` could be
+written as `unsafe.Sizeof([...]T)/unsafe.Sizeof(T)` so long as
+`unsafe.Sizeof(T) > 0`.
+
+## Rationale
+
+This approach is simpler than generally providing a constant scalar expression
+parameterization of generic types. Of the two elements of the proposal, neither
+feels particularly out of line with the design of the language or its concepts.
+The `[...]T` syntax exists in the language to imply length inference for array
+literals and is not a hard to imagine concept. It is the deeper requirement to
+make this proposal work.
+
+One potential downside of this proposal is that we're not really using the
+array for anything other than its size which may feel awkward. For that reason
+I've opted to use a constraint which forces the array to use `struct{}` values
+to indicate that the structure of the elements isn't relevant. This awkwardness
+feels justified to side-step introduces scalars to type parameters.
+
+## Compatibility
+
+This proposal is fully backwards compatible with all of the language and also
+the now accepted type parameters proposal.
+
+## Implementation
+
+Neither of the two features of this proposal feel particularly onerous to
+implement. My guess is that the `[...]T` type list constraint would be extremely
+straightforward given an implementation of type parameters. The `len`
+implementation is also likely to be straightforward given the existence of
+both compile-time evaluation of `len` expressions on array types which exist
+in the language and the constant nature of `unsafe.Sizeof`. Maybe there'd be
+some pain in deferring the expression evaluation until after type checking.
+
+[type parameters]: https://go.googlesource.com/proposal/+/refs/heads/master/design/go2draft-type-parameters.md
+[block based data structures]: https://opensource.googleblog.com/2013/01/c-containers-that-save-memory-and-time.html
+[vanilla btree]: https://go2goplay.golang.org/p/A5auAIdW2ZR
+[parameterized node btree]: https://go2goplay.golang.org/p/TFn9BujIlc3
+[bad parameterization btree]: https://go2goplay.golang.org/p/JGgyabtu_9F
+[proposal btree]: https://go2goplay.golang.org/p/4o36RLxF73C
\ No newline at end of file