shiny/driver/internal/swizzle: detect SSSE3-capable CPUs

Benchmark numbers for the various implementations, on my amd64 machine:
BenchmarkBGRA-8             3000        498914 ns/op // bgra16
BenchmarkBGRA-8             1000       1702449 ns/op // bgra4
BenchmarkBGRA-8              500       3396861 ns/op // pure Go
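
As an illustration of how the fast path can be selected, here is a minimal
sketch of the dispatch, not the code in this change. It assumes BGRA swaps
the first and third byte of each 4-byte pixel in place. The names haveSSSE3
and accelerated are hypothetical stand-ins: the real feature check needs
CPUID (from assembly, or a package such as golang.org/x/sys/cpu), so the
sketch hard-codes false and always takes the pure Go path.

```go
package main

// haveSSSE3 is a hypothetical stand-in for the CPU feature check.
// It is hard-coded to false here because detecting SSSE3 requires
// CPUID, which plain Go without extra packages cannot issue.
var haveSSSE3 = func() bool { return false }

// accelerated is a hypothetical hook for an assembly implementation.
// It is nil in this sketch, so the pure Go loop is always used.
var accelerated func(p []byte)

// bgraPureGo swaps bytes 0 and 2 of every 4-byte pixel in place.
func bgraPureGo(p []byte) {
	for i := 0; i+3 < len(p); i += 4 {
		p[i+0], p[i+2] = p[i+2], p[i+0]
	}
}

// BGRA converts a pixel buffer between RGBA and BGRA byte order.
// It panics if len(p) is not a multiple of 4.
func BGRA(p []byte) {
	if len(p)%4 != 0 {
		panic("swizzle: input length is not a multiple of 4")
	}
	// Use the accelerated path only when the CPU supports it and
	// an implementation has been plugged in.
	if haveSSSE3() && accelerated != nil {
		accelerated(p)
		return
	}
	bgraPureGo(p)
}
```

Calling BGRA on the bytes 1 2 3 4 5 6 7 8 leaves 3 2 1 4 7 6 5 8 in the
slice. Keeping the check behind a function value makes the fallback path
easy to exercise on any machine.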

Fixes golang/go#12714

Change-Id: I5570b0daeaae431c6beecd6e0ab832e7bc2c11ec
Reviewed-on: https://go-review.googlesource.com/14931
Reviewed-by: Aaron Jacobs <jacobsa@google.com>
Reviewed-by: Nigel Tao <nigeltao@golang.org>
4 files changed