go.crypto/sha3: change keccakF to stateless function

The new keccakF is taken from my implementation: https://bitbucket.org/ede/sha3
The performance gain comes from using less memory and more registers.
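
A minimal sketch of the general idea, not the code in this CL: it contrasts a
step that keeps scratch values in struct fields with the same step written as
a stateless function using only local variables. Only the theta step of the
permutation is shown. The names statefulTheta, apply and thetaStateless are
hypothetical, and whether the locals actually end up in registers is up to the
compiler.

```go
package main

import (
	"fmt"
	"math/bits"
)

// statefulTheta is a hypothetical "before" shape: the scratch arrays are
// struct fields, so intermediate values are stored to and loaded from memory.
type statefulTheta struct {
	c [5]uint64 // column parities
	d [5]uint64 // per-column correction terms
}

// apply runs the theta step on a, using the receiver's scratch fields.
func (s *statefulTheta) apply(a *[25]uint64) {
	for x := 0; x < 5; x++ {
		s.c[x] = a[x] ^ a[x+5] ^ a[x+10] ^ a[x+15] ^ a[x+20]
	}
	for x := 0; x < 5; x++ {
		s.d[x] = s.c[(x+4)%5] ^ bits.RotateLeft64(s.c[(x+1)%5], 1)
	}
	for i := range a {
		a[i] ^= s.d[i%5]
	}
}

// thetaStateless is a hypothetical "after" shape: it is a plain function and
// every intermediate value is a local variable.
func thetaStateless(a *[25]uint64) {
	c0 := a[0] ^ a[5] ^ a[10] ^ a[15] ^ a[20]
	c1 := a[1] ^ a[6] ^ a[11] ^ a[16] ^ a[21]
	c2 := a[2] ^ a[7] ^ a[12] ^ a[17] ^ a[22]
	c3 := a[3] ^ a[8] ^ a[13] ^ a[18] ^ a[23]
	c4 := a[4] ^ a[9] ^ a[14] ^ a[19] ^ a[24]

	d0 := c4 ^ bits.RotateLeft64(c1, 1)
	d1 := c0 ^ bits.RotateLeft64(c2, 1)
	d2 := c1 ^ bits.RotateLeft64(c3, 1)
	d3 := c2 ^ bits.RotateLeft64(c4, 1)
	d4 := c3 ^ bits.RotateLeft64(c0, 1)

	for i := 0; i < 25; i += 5 {
		a[i] ^= d0
		a[i+1] ^= d1
		a[i+2] ^= d2
		a[i+3] ^= d3
		a[i+4] ^= d4
	}
}

func main() {
	a := [25]uint64{0: 1}
	b := a

	var s statefulTheta
	s.apply(&a)
	thetaStateless(&b)

	// Both shapes compute the same step; only where the scratch lives differs.
	fmt.Println(a == b)
}
```

Both versions produce the same state; the stateless one needs no scratch
struct to be allocated or passed around.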

benchmark                       old ns/op    new ns/op    delta
BenchmarkPermutationFunction         1484         1118  -24.66%
BenchmarkBulkKeccak512             374993       295178  -21.28%
BenchmarkBulkKeccak256             215496       172335  -20.03%

benchmark                        old MB/s     new MB/s  speedup
BenchmarkPermutationFunction       134.76       178.80    1.33x
BenchmarkBulkKeccak512              43.69        55.51    1.27x
BenchmarkBulkKeccak256              76.03        95.07    1.25x

R=jcb, agl
CC=golang-dev, nigeltao
https://golang.org/cl/8088044
3 files changed