runtime: use duff zero and copy to initialize memory

benchmark                 old ns/op     new ns/op     delta
BenchmarkCopyFat512       1307          329           -74.83%
BenchmarkCopyFat256       666           169           -74.62%
BenchmarkCopyFat1024      2617          671           -74.36%
BenchmarkCopyFat128       343           89.0          -74.05%
BenchmarkCopyFat64        182           48.9          -73.13%
BenchmarkCopyFat32        103           28.8          -72.04%
BenchmarkClearFat128      102           46.6          -54.31%
BenchmarkClearFat512      344           167           -51.45%
BenchmarkClearFat64       50.5          26.5          -47.52%
BenchmarkClearFat256      147           87.2          -40.68%
BenchmarkClearFat32       22.7          16.4          -27.75%
BenchmarkClearFat1024     511           662           +29.55%

Fixes #7624

LGTM=rsc
R=golang-codereviews, khr, bradfitz, josharian, dave, rsc
CC=golang-codereviews
https://golang.org/cl/92760044
10 files changed