0f34b776818f5479bc7fa4b760b095b5b5193458 go.net/publicsuffix: tighten the encoding from 8 bytes per node to 4.
On the full list (running gen.go with -subset=false):
Before, there were 6086 nodes (at 8 bytes per node). After, there are
6086 nodes (at 4 bytes per node) plus 354 children entries (at 4 bytes
per entry). The difference is a saving of 22928 bytes.
In comparison, the (crushed) text is 21082 bytes, and for the curious,
the longest label is 36 bytes: "xn--correios-e-telecomunicaes-ghc29a".
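
The size figures above can be checked with a little arithmetic. This is only a sketch: encodedSize is a hypothetical helper written for illustration and is not part of gen.go; the counts are the ones quoted in this message.

```go
package main

import "fmt"

// Counts quoted in the commit message for the full list.
const (
	numNodes    = 6086 // nodes, unchanged by this commit
	numChildren = 354  // children entries introduced by the new encoding
)

// encodedSize is a hypothetical helper: it returns the table size in bytes
// for the given entry counts and per-entry widths.
func encodedSize(nodes, bytesPerNode, children, bytesPerChild int) int {
	return nodes*bytesPerNode + children*bytesPerChild
}

func main() {
	before := encodedSize(numNodes, 8, 0, 0)          // old: 8 bytes per node
	after := encodedSize(numNodes, 4, numChildren, 4) // new: 4 bytes per node and per children entry
	fmt.Println(before, after, before-after)          // 48688 25760 22928
}
```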
All 32 bits in the nodes table are used, but there's wiggle room to
accommodate future changes to effective_tld_names.dat:
The largest children index is 353 (in 9 bits, so max is 511).
The largest node type is 2 (in 2 bits, so max is 3).
The largest text offset is 21080 (in 15 bits, so max is 32767).
The largest text length is 36 (in 6 bits, so max is 63).
benchmark                old ns/op    new ns/op    delta
BenchmarkPublicSuffix        19948        19744   -1.02%
R=dr.volker.dobler
CC=golang-dev
https://golang.org/cl/6999045
4 files changed