cases: prepare for Unicode 8.0.0

There is not enough space in the exceptions table to accommodate the
Cherokee code points (all of which have large XOR masks).
In the old encoding:
There are 79*2=158 affected Cherokee letters. Each requires a 16-bit
XOR mask, which does not fit in the available 10 bits. This means
that each symbol needs 3+1 bytes of exception data (more when folding
is added), for a total of 632 bytes, which puts it over the allocated
11 bits of addressable space.  We have one bit left in the trie entry
that could be used for this, but then each Cherokee letter would get
a different entry in the trie, making the blocks incompressible and
further increasing the cost. Overall, Cherokee alone would add about
7% to the table size.
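
The numbers above can be checked with a small sketch. It is not part
of this change; maskBits is a hypothetical helper, and the pair
U+13A0 / U+AB70 (CHEROKEE LETTER A and its Unicode 8.0.0 lowercase
form) is picked only as a representative example:

```go
package main

import (
	"fmt"
	"math/bits"
)

// maskBits reports how many bits are needed to store the rune-level
// XOR mask between two case variants. Hypothetical helper, for
// illustration only.
func maskBits(upper, lower rune) int {
	return bits.Len32(uint32(upper ^ lower))
}

// exceptionBytes is the arithmetic from the text: letters times
// bytes of exception data per letter.
func exceptionBytes(letters, bytesPerLetter int) int {
	return letters * bytesPerLetter
}

func main() {
	// A Latin pair differs in a single low bit position.
	fmt.Println("A/a mask bits:", maskBits('A', 'a'))

	// A Cherokee pair differs in high bits and needs a 16-bit mask,
	// which exceeds the 10 bits available in the old encoding.
	fmt.Println("Cherokee mask bits:", maskBits(0x13A0, 0xAB70))

	// 79*2 letters at 3+1 bytes each.
	fmt.Println("exception bytes:", exceptionBytes(79*2, 3+1))
}
```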

We opt for the following alternative:
We now have two types of XOR patterns: inline (6 bits; the bulk of
the mappings), and indexed (an index referring to a XOR pattern of
any length). This allows for more flexibility, including XOR-ing
bits 6 and 7 of a rune and XOR-ing 3- and 4-byte wide UTF-8
sequences, neither of which was previously possible. As a result,
Cherokee can be XOR-ed, as can a number of other runes for which
this previously wasn't possible. For Unicode 7.0.0, the table size
is reduced by over 5%. The algorithm is also conceptually simpler
(no more bit fiddling to map a rune-based XOR pattern onto UTF-8
bytes). The extra indirection may cost some performance, but the
common path is not affected; only rare runes may become slower.
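
As a rough illustration of XOR-ing UTF-8 bytes directly, consider the
sketch below. It is not the code in this change: xorPattern and
applyXOR are hypothetical names, and the real tables store patterns
in their own encoding. It assumes a pattern is applied to the
trailing bytes of the encoded rune:

```go
package main

import "fmt"

// xorPattern derives a byte-wise XOR pattern from two UTF-8 strings
// of equal length. It returns nil if the lengths differ.
// Hypothetical helper, for illustration only.
func xorPattern(a, b string) []byte {
	if len(a) != len(b) {
		return nil
	}
	p := make([]byte, len(a))
	for i := range p {
		p[i] = a[i] ^ b[i]
	}
	return p
}

// applyXOR XORs pattern p onto the last len(p) bytes of s. No
// rune-to-UTF-8 bit shuffling is needed because the pattern is
// already expressed in UTF-8 bytes. It assumes len(p) <= len(s).
func applyXOR(s string, p []byte) string {
	b := []byte(s)
	off := len(b) - len(p)
	for i, x := range p {
		b[off+i] ^= x
	}
	return string(b)
}

func main() {
	// One-byte pattern with only low bits set: the "inline" kind.
	fmt.Printf("% x\n", xorPattern("A", "a"))

	// Three-byte pattern for a Cherokee pair (U+13A0 / U+AB70):
	// the kind that would be looked up through an index.
	p := xorPattern("\u13a0", "\uab70")
	fmt.Printf("% x\n", p)
	fmt.Println(applyXOR("\u13a0", p) == "\uab70")
}
```

Because XOR is its own inverse, the same pattern maps in both
directions between the two case variants.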

This change does not yet include the bump to Unicode 8.0.0.
The resulting Unicode 8.0.0 table is smaller than the
original Unicode 7.0.0 table (see follow-up CLs).

Change-Id: I017906b6fb1275ea81d2f528fa42f064cbf9f761
Reviewed-on: https://go-review.googlesource.com/11370
Reviewed-by: Nigel Tao <nigeltao@golang.org>
5 files changed