cmd/compile: use a sync.Pool and string interning when printing types

CL 214239 improved type printing, but introduced performance regressions:
3-5% memory increase, 1-2% CPU time increase.

There were two primary sources of the memory regression:

* allocating a bytes.Buffer for every type to print
* always printing to that buffer, even when we could return a constant string
This change addresses both of those sources.
The sync.Pool allows buffer reuse.
String interning avoids allocation for strings that are printed repeatedly.
This change recovers some, but not all, of the CPU time regression.
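The two techniques combine roughly as in the sketch below. This is a minimal standalone illustration, not the compiler's code: bufPool, intern, internTab, and describe are hypothetical names, and intern is only a stand-in for types.InternString, whose actual implementation is not shown in this change.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool hands out reusable buffers so that each call does not
// have to allocate a fresh bytes.Buffer. (Hypothetical name.)
var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

// internTab maps string contents to one canonical string value.
// (Hypothetical stand-in for the table behind types.InternString.)
var (
	internMu  sync.Mutex
	internTab = make(map[string]string)
)

// intern returns a canonical string with the same contents as b.
// A string is allocated only the first time given contents are seen.
func intern(b []byte) string {
	internMu.Lock()
	defer internMu.Unlock()
	// The gc compiler is expected to perform a map lookup keyed by
	// string(b) without allocating, so a hit should cost no allocation.
	if s, ok := internTab[string(b)]; ok {
		return s
	}
	s := string(b) // first occurrence: allocate once and remember it
	internTab[s] = s
	return s
}

// describe formats an array type such as "[4]int" using a pooled
// buffer and returns an interned string. (Hypothetical example.)
func describe(elem string, n int) string {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset() // a pooled buffer may still hold earlier contents
	defer bufPool.Put(buf)

	fmt.Fprintf(buf, "[%d]%s", n, elem)
	// intern copies the bytes on a miss, so it is safe to return
	// buf to the pool afterwards.
	return intern(buf.Bytes())
}

func main() {
	fmt.Println(describe("int", 4)) // prints "[4]int"
	fmt.Println(describe("int", 4)) // same contents, served from internTab
}
```

The buffer is reset right after Get rather than before Put, so a buffer that returns to the pool in any state is still safe to reuse.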
Memory performance impact vs master:
name        old alloc/op   new alloc/op   delta
Template    37.6MB ± 0%    36.3MB ± 0%    -3.30%  (p=0.008 n=5+5)
Unicode     28.7MB ± 0%    28.3MB ± 0%    -1.55%  (p=0.008 n=5+5)
GoTypes     127MB ± 0%     122MB ± 0%     -4.38%  (p=0.008 n=5+5)
Compiler    584MB ± 0%     568MB ± 0%     -2.72%  (p=0.008 n=5+5)
SSA         1.99GB ± 0%    1.95GB ± 0%    -1.97%  (p=0.008 n=5+5)
Flate       23.5MB ± 0%    22.8MB ± 0%    -2.84%  (p=0.008 n=5+5)
GoParser    29.2MB ± 0%    28.0MB ± 0%    -4.17%  (p=0.008 n=5+5)
Reflect     81.9MB ± 0%    78.6MB ± 0%    -4.09%  (p=0.008 n=5+5)
Tar         35.3MB ± 0%    34.1MB ± 0%    -3.29%  (p=0.008 n=5+5)
XML         45.5MB ± 0%    44.3MB ± 0%    -2.61%  (p=0.008 n=5+5)
[Geo mean]  82.4MB         79.9MB         -3.09%

name        old allocs/op  new allocs/op  delta
Template    394k ± 0%      363k ± 0%      -7.73%  (p=0.008 n=5+5)
Unicode     340k ± 0%      329k ± 0%      -3.25%  (p=0.008 n=5+5)
GoTypes     1.41M ± 0%     1.28M ± 0%     -9.54%  (p=0.008 n=5+5)
Compiler    5.77M ± 0%     5.39M ± 0%     -6.58%  (p=0.008 n=5+5)
SSA         19.1M ± 0%     18.1M ± 0%     -5.13%  (p=0.008 n=5+5)
Flate       247k ± 0%      228k ± 0%      -7.50%  (p=0.008 n=5+5)
GoParser    325k ± 0%      295k ± 0%      -9.24%  (p=0.008 n=5+5)
Reflect     1.04M ± 0%     0.95M ± 0%     -8.48%  (p=0.008 n=5+5)
Tar         365k ± 0%      336k ± 0%      -7.93%  (p=0.008 n=5+5)
XML         449k ± 0%      417k ± 0%      -7.10%  (p=0.008 n=5+5)
[Geo mean]  882k           818k           -7.26%
Memory performance going from 52c4488471ed52085a29e173226b3cbd2bf22b20,
which is the commit preceding CL 214239, to this change:
name        old alloc/op   new alloc/op   delta
Template    36.5MB ± 0%    36.3MB ± 0%    -0.37%  (p=0.008 n=5+5)
Unicode     28.3MB ± 0%    28.3MB ± 0%    -0.06%  (p=0.008 n=5+5)
GoTypes     123MB ± 0%     122MB ± 0%     -0.64%  (p=0.008 n=5+5)
Compiler    571MB ± 0%     568MB ± 0%     -0.51%  (p=0.008 n=5+5)
SSA         1.96GB ± 0%    1.95GB ± 0%    -0.13%  (p=0.008 n=5+5)
Flate       22.8MB ± 0%    22.8MB ± 0%    ~       (p=0.421 n=5+5)
GoParser    28.1MB ± 0%    28.0MB ± 0%    -0.37%  (p=0.008 n=5+5)
Reflect     78.8MB ± 0%    78.6MB ± 0%    -0.32%  (p=0.008 n=5+5)
Tar         34.3MB ± 0%    34.1MB ± 0%    -0.35%  (p=0.008 n=5+5)
XML         44.3MB ± 0%    44.3MB ± 0%    +0.05%  (p=0.032 n=5+5)
[Geo mean]  80.1MB         79.9MB         -0.27%

name        old allocs/op  new allocs/op  delta
Template    372k ± 0%      363k ± 0%      -2.46%  (p=0.008 n=5+5)
Unicode     333k ± 0%      329k ± 0%      -0.97%  (p=0.008 n=5+5)
GoTypes     1.33M ± 0%     1.28M ± 0%     -3.71%  (p=0.008 n=5+5)
Compiler    5.53M ± 0%     5.39M ± 0%     -2.50%  (p=0.008 n=5+5)
SSA         18.3M ± 0%     18.1M ± 0%     -1.22%  (p=0.008 n=5+5)
Flate       234k ± 0%      228k ± 0%      -2.44%  (p=0.008 n=5+5)
GoParser    305k ± 0%      295k ± 0%      -3.23%  (p=0.008 n=5+5)
Reflect     980k ± 0%      949k ± 0%      -3.12%  (p=0.008 n=5+5)
Tar         345k ± 0%      336k ± 0%      -2.69%  (p=0.008 n=5+5)
XML         425k ± 0%      417k ± 0%      -1.72%  (p=0.008 n=5+5)
[Geo mean]  838k           818k           -2.41%
Remaining CPU time regression, that is,
the change from before CL 214239 to this change:
name        old time/op       new time/op       delta
Template    208ms ± 2%        209ms ± 1%        ~       (p=0.181 n=47+46)
Unicode     82.9ms ± 2%       81.9ms ± 2%       -1.25%  (p=0.000 n=50+48)
GoTypes     709ms ± 3%        714ms ± 3%        +0.77%  (p=0.003 n=48+49)
Compiler    3.31s ± 2%        3.32s ± 2%        ~       (p=0.271 n=48+48)
SSA         10.8s ± 1%        10.9s ± 1%        +0.61%  (p=0.000 n=46+47)
Flate       134ms ± 2%        134ms ± 1%        +0.41%  (p=0.002 n=48+46)
GoParser    166ms ± 2%        167ms ± 2%        +0.41%  (p=0.010 n=46+48)
Reflect     440ms ± 4%        444ms ± 4%        +1.05%  (p=0.002 n=50+49)
Tar         183ms ± 2%        184ms ± 2%        ~       (p=0.074 n=45+45)
XML         247ms ± 2%        248ms ± 2%        +0.67%  (p=0.001 n=49+48)
[Geo mean]  425ms             427ms             +0.34%

name        old user-time/op  new user-time/op  delta
Template    271ms ± 2%        271ms ± 2%        ~       (p=0.654 n=48+48)
Unicode     117ms ± 2%        116ms ± 3%        ~       (p=0.458 n=47+45)
GoTypes     952ms ± 3%        963ms ± 2%        +1.11%  (p=0.000 n=48+49)
Compiler    4.50s ± 5%        4.49s ± 7%        ~       (p=0.894 n=50+50)
SSA         15.0s ± 2%        15.1s ± 2%        +0.46%  (p=0.015 n=50+49)
Flate       166ms ± 2%        167ms ± 2%        +0.40%  (p=0.005 n=49+48)
GoParser    202ms ± 2%        203ms ± 2%        +0.60%  (p=0.002 n=49+47)
Reflect     583ms ± 3%        588ms ± 3%        +0.82%  (p=0.001 n=49+46)
Tar         223ms ± 2%        224ms ± 2%        +0.37%  (p=0.046 n=48+46)
XML         310ms ± 2%        311ms ± 2%        +0.46%  (p=0.009 n=50+49)
[Geo mean]  554ms             556ms             +0.36%
Change-Id: I85951a6538373ef4309a2cc366cc1ebaf1f4582d
Reviewed-on: https://go-review.googlesource.com/c/go/+/214818
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
diff --git a/src/cmd/compile/internal/gc/fmt.go b/src/cmd/compile/internal/gc/fmt.go
index 54886a9..d7fc541 100644
--- a/src/cmd/compile/internal/gc/fmt.go
+++ b/src/cmd/compile/internal/gc/fmt.go
@@ -12,6 +12,7 @@
 	"io"
 	"strconv"
 	"strings"
+	"sync"
 	"unicode/utf8"
 )
@@ -651,10 +652,19 @@
 	TBLANK: "blank",
 }
 
+var tconvBufferPool = sync.Pool{
+	New: func() interface{} {
+		return new(bytes.Buffer)
+	},
+}
+
 func tconv(t *types.Type, flag FmtFlag, mode fmtMode) string {
-	b := bytes.NewBuffer(make([]byte, 0, 64))
-	tconv2(b, t, flag, mode, nil)
-	return b.String()
+	buf := tconvBufferPool.Get().(*bytes.Buffer)
+	buf.Reset()
+	defer tconvBufferPool.Put(buf)
+
+	tconv2(buf, t, flag, mode, nil)
+	return types.InternString(buf.Bytes())
 }
 
 // tconv2 writes a string representation of t to b.