cmd/compile: compiler support for buffered write barrier

This CL implements the compiler support for calling the buffered write
barrier added by the previous CL.

Since the buffered write barrier is only implemented on amd64 right
now, this CL still supports the old, eager write barrier. There's
little overhead to supporting both, and this way a few tests in
test/fixedbugs that expect liveness maps at write barrier calls can
easily opt in to the old, eager barrier.

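As a rough illustration of the buffering idea (a toy model only, not the
runtime's or the compiler's actual code), the sketch below records each
pointer write in a small fixed-capacity buffer and does the expensive
work only when the buffer fills. Every name in it (wbBuffer, wbEntry,
record, flush, writePointer, barrierEnabled, bufEntries) is hypothetical
and chosen for this example; an eager barrier would instead do the
expensive work on every write.

```go
package main

import "fmt"

// bufEntries is a hypothetical, deliberately tiny capacity so that the
// flush path is easy to observe in this toy model.
const bufEntries = 4

// wbEntry holds the pointer being written and the pointer it replaces.
type wbEntry struct{ newVal, oldVal *int }

// wbBuffer is a toy stand-in for a write barrier buffer.
type wbBuffer struct {
	entries []wbEntry
	flushes int // number of times the slow path ran
}

// record is the fast path: append an entry and flush only when full.
func (b *wbBuffer) record(newVal, oldVal *int) {
	b.entries = append(b.entries, wbEntry{newVal, oldVal})
	if len(b.entries) == bufEntries {
		b.flush()
	}
}

// flush is the slow path. Here it only counts and resets; in this model
// it stands in for handing the recorded pointers to the collector.
func (b *wbBuffer) flush() {
	b.flushes++
	b.entries = b.entries[:0]
}

var (
	barrierEnabled bool
	buf            wbBuffer
)

// writePointer models a pointer store "*slot = val" guarded by a check
// of whether the write barrier is currently enabled.
func writePointer(slot **int, val *int) {
	if barrierEnabled {
		buf.record(val, *slot)
	}
	*slot = val
}

func main() {
	barrierEnabled = true
	var slot *int
	for i := 0; i < 10; i++ {
		v := i
		writePointer(&slot, &v)
	}
	// Ten writes with a capacity of four: two flushes, two entries pending.
	fmt.Println(buf.flushes, len(buf.entries)) // prints "2 2"
}
```

In this model the cost of the slow path is paid once per bufEntries
writes rather than once per write, which is the intuition behind the
benchmark improvement reported in this message.
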
This significantly improves the performance of the write barrier:

name             old time/op  new time/op  delta
WriteBarrier-12  73.5ns ±20%  19.2ns ±27%  -73.90%  (p=0.000 n=19+18)

It also reduces the size of binaries because the write barrier call is
more compact:

name        old object-bytes  new object-bytes  delta
Template           398k ± 0%         393k ± 0%  -1.14%  (p=0.008 n=5+5)
Unicode            208k ± 0%         206k ± 0%  -1.00%  (p=0.008 n=5+5)
GoTypes           1.18M ± 0%        1.15M ± 0%  -2.00%  (p=0.008 n=5+5)
Compiler          4.05M ± 0%        3.88M ± 0%  -4.26%  (p=0.008 n=5+5)
SSA               8.25M ± 0%        8.11M ± 0%  -1.59%  (p=0.008 n=5+5)
Flate              228k ± 0%         224k ± 0%  -1.83%  (p=0.008 n=5+5)
GoParser           295k ± 0%         284k ± 0%  -3.62%  (p=0.008 n=5+5)
Reflect           1.00M ± 0%        0.99M ± 0%  -0.70%  (p=0.008 n=5+5)
Tar                339k ± 0%         333k ± 0%  -1.67%  (p=0.008 n=5+5)
XML                404k ± 0%         395k ± 0%  -2.10%  (p=0.008 n=5+5)
[Geo mean]         704k              690k       -2.00%

name        old exe-bytes     new exe-bytes     delta
HelloSize         1.05M ± 0%        1.04M ± 0%  -1.55%  (p=0.008 n=5+5)

https://perf.golang.org/search?q=upload:20171027.1

(Amusingly, this also reduces compiler allocations by 0.75%, which,
combined with the better write barrier, speeds up the compiler overall
by 2.10%. See the perf link.)

It slightly improves the performance of most of the go1 benchmarks and
improves the performance of the x/benchmarks:

name                      old time/op  new time/op  delta
BinaryTree17-12            2.40s ± 1%   2.47s ± 1%  +2.69%  (p=0.000 n=19+19)
Fannkuch11-12              2.95s ± 0%   2.95s ± 0%  +0.21%  (p=0.000 n=20+19)
FmtFprintfEmpty-12        41.8ns ± 4%  41.4ns ± 2%  -1.03%  (p=0.014 n=20+20)
FmtFprintfString-12       68.7ns ± 2%  67.5ns ± 1%  -1.75%  (p=0.000 n=20+17)
FmtFprintfInt-12          79.0ns ± 3%  77.1ns ± 1%  -2.40%  (p=0.000 n=19+17)
FmtFprintfIntInt-12        127ns ± 1%   123ns ± 3%  -3.42%  (p=0.000 n=20+20)
FmtFprintfPrefixedInt-12   152ns ± 1%   150ns ± 1%  -1.02%  (p=0.000 n=18+17)
FmtFprintfFloat-12         211ns ± 1%   209ns ± 0%  -0.99%  (p=0.000 n=20+16)
FmtManyArgs-12             500ns ± 0%   496ns ± 0%  -0.73%  (p=0.000 n=17+20)
GobDecode-12              6.44ms ± 1%  6.53ms ± 0%  +1.28%  (p=0.000 n=20+19)
GobEncode-12              5.46ms ± 0%  5.46ms ± 1%    ~     (p=0.550 n=19+20)
Gzip-12                    220ms ± 1%   216ms ± 0%  -1.75%  (p=0.000 n=19+19)
Gunzip-12                 38.8ms ± 0%  38.6ms ± 0%  -0.30%  (p=0.000 n=18+19)
HTTPClientServer-12       79.0µs ± 1%  78.2µs ± 1%  -1.01%  (p=0.000 n=20+20)
JSONEncode-12             11.9ms ± 0%  11.9ms ± 0%  -0.29%  (p=0.000 n=20+19)
JSONDecode-12             52.6ms ± 0%  52.2ms ± 0%  -0.68%  (p=0.000 n=19+20)
Mandelbrot200-12          3.69ms ± 0%  3.68ms ± 0%  -0.36%  (p=0.000 n=20+20)
GoParse-12                3.13ms ± 1%  3.18ms ± 1%  +1.67%  (p=0.000 n=19+20)
RegexpMatchEasy0_32-12    73.2ns ± 1%  72.3ns ± 1%  -1.19%  (p=0.000 n=19+18)
RegexpMatchEasy0_1K-12     241ns ± 0%   239ns ± 0%  -0.83%  (p=0.000 n=17+16)
RegexpMatchEasy1_32-12    68.6ns ± 1%  69.0ns ± 1%  +0.47%  (p=0.015 n=18+16)
RegexpMatchEasy1_1K-12     364ns ± 0%   361ns ± 0%  -0.67%  (p=0.000 n=16+17)
RegexpMatchMedium_32-12    104ns ± 1%   103ns ± 1%  -0.79%  (p=0.001 n=20+15)
RegexpMatchMedium_1K-12   33.8µs ± 3%  34.0µs ± 2%    ~     (p=0.267 n=20+19)
RegexpMatchHard_32-12     1.64µs ± 1%  1.62µs ± 2%  -1.25%  (p=0.000 n=19+18)
RegexpMatchHard_1K-12     49.2µs ± 0%  48.7µs ± 1%  -0.93%  (p=0.000 n=19+18)
Revcomp-12                 391ms ± 5%   396ms ± 7%    ~     (p=0.154 n=19+19)
Template-12               63.1ms ± 0%  59.5ms ± 0%  -5.76%  (p=0.000 n=18+19)
TimeParse-12               307ns ± 0%   306ns ± 0%  -0.39%  (p=0.000 n=19+17)
TimeFormat-12              325ns ± 0%   323ns ± 0%  -0.50%  (p=0.000 n=19+19)
[Geo mean]                47.3µs       46.9µs       -0.67%

https://perf.golang.org/search?q=upload:20171026.1

name                       old time/op  new time/op  delta
Garbage/benchmem-MB=64-12  2.25ms ± 1%  2.20ms ± 1%  -2.31%  (p=0.000 n=18+18)
HTTP-12                    12.6µs ± 0%  12.6µs ± 0%  -0.72%  (p=0.000 n=18+17)
JSON-12                    11.0ms ± 0%  11.0ms ± 1%  -0.68%  (p=0.000 n=17+19)

https://perf.golang.org/search?q=upload:20171026.2

Updates #14951.
Updates #22460.

Change-Id: Id4c0932890a1d41020071bec73b8522b1367d3e7
Reviewed-on: https://go-review.googlesource.com/73712
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
diff --git a/src/cmd/compile/internal/ssa/gen/genericOps.go b/src/cmd/compile/internal/ssa/gen/genericOps.go
index 5ed0ce2..0ad582b 100644
--- a/src/cmd/compile/internal/ssa/gen/genericOps.go
+++ b/src/cmd/compile/internal/ssa/gen/genericOps.go
@@ -331,6 +331,12 @@
{name: "MoveWB", argLength: 3, typ: "Mem", aux: "TypSize"}, // arg0=destptr, arg1=srcptr, arg2=mem, auxint=size, aux=type. Returns memory.
{name: "ZeroWB", argLength: 2, typ: "Mem", aux: "TypSize"}, // arg0=destptr, arg1=mem, auxint=size, aux=type. Returns memory.
+ // WB invokes runtime.gcWriteBarrier. This is not a normal
+ // call: it takes arguments in registers, doesn't clobber
+ // general-purpose registers (the exact clobber set is
+ // arch-dependent), and is not a safe-point.
+ {name: "WB", argLength: 3, typ: "Mem", aux: "Sym", symEffect: "None"}, // arg0=destptr, arg1=srcptr, arg2=mem, aux=runtime.gcWriteBarrier
+
// Function calls. Arguments to the call have already been written to the stack.
// Return values appear on the stack. The method receiver, if any, is treated
// as a phantom first argument.