http2/hpack: lazily build huffman table on first use

Previously, the Huffman decoding table was built at init time and
allocated about 120 kB on the heap, regardless of whether anybody
used http2. Worse, because we vendored the package into std, users
would have two copies, for about 256 kB of memory. After CL 127235
that went down to 60 kB per copy, so 120 kB for a binary using
golang.org/x/net/http2 explicitly.

With this change, each copy in the binary allocates nothing for the
table until that copy performs its first Huffman decode.
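
For readers unfamiliar with the pattern, here is a minimal standalone
sketch of lazy initialization with sync.Once. It is not the hpack code:
the table type, getTable, buildTable, and buildCount are hypothetical
names used only for illustration. The real change applies the same idea
to the Huffman decoder tree (see the diff).

```go
package main

import (
	"fmt"
	"sync"
)

// table stands in for an expensive-to-build lookup structure.
// The type and its contents are hypothetical.
type table struct {
	entries [256]int
}

var (
	buildOnce  sync.Once
	lazyTable  *table
	buildCount int // number of times buildTable ran; illustration only
)

// getTable returns the shared table, building it on the first call.
// sync.Once guarantees buildTable runs at most once, even when
// several goroutines call getTable concurrently.
func getTable() *table {
	buildOnce.Do(buildTable)
	return lazyTable
}

// buildTable does the allocation that would otherwise happen in init.
func buildTable() {
	buildCount++
	t := new(table)
	for i := range t.entries {
		t.entries[i] = i * 2
	}
	lazyTable = t
}

func main() {
	fmt.Println("builds before first use:", buildCount)

	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = getTable()
		}()
	}
	wg.Wait()

	fmt.Println("builds after concurrent use:", buildCount)
}
```

The cost of this approach is one Once.Do call on every lookup, which is
why the decode path is benchmarked below.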

I wasn't able to measure any difference with the Once.Do in the decode
path:

name             old time/op    new time/op    delta
HuffmanDecode-4     732ns ± 8%     707ns ± 3%   ~            (p=0.268 n=10+9)

(admittedly noisy)

Change-Id: I6c1065abc0c3458f3cb69e0f678978267ff35ea2
Reviewed-on: https://go-review.googlesource.com/127275
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
diff --git a/http2/hpack/huffman.go b/http2/hpack/huffman.go
index 87ec2aa..b412a96 100644
--- a/http2/hpack/huffman.go
+++ b/http2/hpack/huffman.go
@@ -47,6 +47,7 @@
 // If maxLen is greater than 0, attempts to write more to buf than
 // maxLen bytes will return ErrStringLength.
 func huffmanDecode(buf *bytes.Buffer, maxLen int, v []byte) error {
+	rootHuffmanNode := getRootHuffmanNode()
 	n := rootHuffmanNode
 	// cur is the bit buffer that has not been fed into n.
 	// cbits is the number of low order bits in cur that are valid.
@@ -117,19 +118,28 @@
 	return &node{children: new([256]*node)}
 }
 
-var rootHuffmanNode = newInternalNode()
+var (
+	buildRootOnce       sync.Once
+	lazyRootHuffmanNode *node
+)
 
-func init() {
+func getRootHuffmanNode() *node {
+	buildRootOnce.Do(buildRootHuffmanNode)
+	return lazyRootHuffmanNode
+}
+
+func buildRootHuffmanNode() {
 	if len(huffmanCodes) != 256 {
 		panic("unexpected size")
 	}
+	lazyRootHuffmanNode = newInternalNode()
 	for i, code := range huffmanCodes {
 		addDecoderNode(byte(i), code, huffmanCodeLen[i])
 	}
 }
 
 func addDecoderNode(sym byte, code uint32, codeLen uint8) {
-	cur := rootHuffmanNode
+	cur := lazyRootHuffmanNode
 	for codeLen > 8 {
 		codeLen -= 8
 		i := uint8(code >> codeLen)