cmd/compile: intrinsics for math/bits.OnesCount
Popcount instructions on amd64 are not guaranteed to be
present, so we must guard their call. Rewrite rules can't
generate control flow at the moment, so the intrinsifier
needs to generate that code.
name old time/op new time/op delta
OnesCount-8 2.47ns ± 5% 1.04ns ± 2% -57.70% (p=0.000 n=10+10)
OnesCount16-8 1.05ns ± 1% 0.78ns ± 0% -25.56% (p=0.000 n=9+8)
OnesCount32-8 1.63ns ± 5% 1.04ns ± 2% -35.96% (p=0.000 n=10+10)
OnesCount64-8 2.45ns ± 0% 1.04ns ± 1% -57.55% (p=0.000 n=6+10)
Update #18616
Change-Id: I4aff2cc9aa93787898d7b22055fe272a7cf95673
Reviewed-on: https://go-review.googlesource.com/38320
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
diff --git a/src/cmd/compile/internal/ssa/gen/genericOps.go b/src/cmd/compile/internal/ssa/gen/genericOps.go
index 7991f32..300a545 100644
--- a/src/cmd/compile/internal/ssa/gen/genericOps.go
+++ b/src/cmd/compile/internal/ssa/gen/genericOps.go
@@ -250,6 +250,11 @@
{name: "BitRev32", argLength: 1}, // Reverse the bits in arg[0]
{name: "BitRev64", argLength: 1}, // Reverse the bits in arg[0]
+ {name: "PopCount8", argLength: 1}, // Count bits in arg[0]
+ {name: "PopCount16", argLength: 1}, // Count bits in arg[0]
+ {name: "PopCount32", argLength: 1}, // Count bits in arg[0]
+ {name: "PopCount64", argLength: 1}, // Count bits in arg[0]
+
{name: "Sqrt", argLength: 1}, // sqrt(arg0), float64 only
// Data movement, max argument length for Phi is indefinite so just pick