sync: add fast paths to WaitGroup
benchmark                                        old ns/op    new ns/op    delta
BenchmarkWaitGroupUncontended                        93.50        33.60  -64.06%
BenchmarkWaitGroupUncontended-2                      44.30        16.90  -61.85%
BenchmarkWaitGroupUncontended-4                      21.80         8.47  -61.15%
BenchmarkWaitGroupUncontended-8                      12.10         4.86  -59.83%
BenchmarkWaitGroupUncontended-16                      7.38         3.35  -54.61%
BenchmarkWaitGroupAddDone                            58.40        33.70  -42.29%
BenchmarkWaitGroupAddDone-2                         293.00        85.80  -70.72%
BenchmarkWaitGroupAddDone-4                         243.00        51.10  -78.97%
BenchmarkWaitGroupAddDone-8                         236.00        52.20  -77.88%
BenchmarkWaitGroupAddDone-16                        215.00        43.30  -79.86%
BenchmarkWaitGroupAddDoneWork                       826.00       794.00   -3.87%
BenchmarkWaitGroupAddDoneWork-2                     450.00       424.00   -5.78%
BenchmarkWaitGroupAddDoneWork-4                     277.00       220.00  -20.58%
BenchmarkWaitGroupAddDoneWork-8                     440.00       116.00  -73.64%
BenchmarkWaitGroupAddDoneWork-16                    569.00        66.50  -88.31%
BenchmarkWaitGroupWait                               29.00         8.04  -72.28%
BenchmarkWaitGroupWait-2                             74.10         4.15  -94.40%
BenchmarkWaitGroupWait-4                            117.00         2.30  -98.03%
BenchmarkWaitGroupWait-8                            111.00         1.31  -98.82%
BenchmarkWaitGroupWait-16                           104.00         1.27  -98.78%
BenchmarkWaitGroupWaitWork                          802.00       792.00   -1.25%
BenchmarkWaitGroupWaitWork-2                        411.00       401.00   -2.43%
BenchmarkWaitGroupWaitWork-4                        210.00       199.00   -5.24%
BenchmarkWaitGroupWaitWork-8                        206.00       105.00  -49.03%
BenchmarkWaitGroupWaitWork-16                       334.00        54.40  -83.71%

R=rsc
CC=golang-dev
https://golang.org/cl/4672050
7 files changed