[dev.ssa] cmd/compile: strength-reduce 64-bit constant divides

The frontend does this for 32 bits and below, but SSA needs
to do it for 64 bits.  The algorithms are all copied from
cgen.go:cgen_div.

Speeds up TimeFormat substantially: ~40% slower to ~10% slower.

Change-Id: I023ea2eb6040df98ccd9105e15ca6ea695610a7a
Reviewed-on: https://go-review.googlesource.com/19302
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Todd Neal <todd@tneal.org>
diff --git a/src/cmd/compile/internal/ssa/gen/genericOps.go b/src/cmd/compile/internal/ssa/gen/genericOps.go
index 3c7aa84..ec74859 100644
--- a/src/cmd/compile/internal/ssa/gen/genericOps.go
+++ b/src/cmd/compile/internal/ssa/gen/genericOps.go
@@ -41,7 +41,11 @@
 	{name: "Hmul16u"},
 	{name: "Hmul32"},
 	{name: "Hmul32u"},
-	// frontend currently doesn't generate a 64 bit hmul
+	{name: "Hmul64"},
+	{name: "Hmul64u"},
+
+	// Weird special instruction for strength reduction of divides.
+	{name: "Avg64u"}, // (uint64(arg0) + uint64(arg1)) / 2, correct to all 64 bits.
 
 	{name: "Div8"}, // arg0 / arg1
 	{name: "Div8u"},