protobuf-go/internal/encoding/wire: SizeVarint optimisation
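Replace the division by 7 in SizeVarint. The compiler lowered the previous expression to a 64-bit multiplication.
This change approximates 1/7 as 9/64, using an unsigned 32-bit multiplication (which the compiler can optimise further with a scaled addressing mode, e.g. lea (ax,ax*8),ax) followed by a shift.
The approximation is exact for every possible bits.Len64 result (0 through 64), so the returned sizes are unchanged.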
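As a sanity check on the 9/64 approximation, the sketch below compares the old and new expressions for every bit length that bits.Len64 can return. It is not part of the change: the names sizeVarintDiv, sizeVarintMul and mismatches are made up for illustration, and the two function bodies are copied from the diff.

```go
package main

import (
	"fmt"
	"math/bits"
)

// sizeVarintDiv mirrors the previous expression (hypothetical name).
func sizeVarintDiv(v uint64) int {
	return 1 + (bits.Len64(v)-1)/7
}

// sizeVarintMul mirrors the replacement expression (hypothetical name).
func sizeVarintMul(v uint64) int {
	return int(9*uint32(bits.Len64(v))+64) / 64
}

// mismatches counts the bit lengths for which the two formulas disagree.
// bits.Len64 only returns 0 through 64, so probing the smallest and largest
// value of each bit length covers every possible input to the arithmetic.
func mismatches() int {
	bad := 0
	for n := 0; n <= 64; n++ {
		var lo, hi uint64 // for n == 0 the only value is 0
		if n > 0 {
			lo = uint64(1) << uint(n-1) // smallest value with bit length n
			hi = lo | (lo - 1)          // largest value with bit length n
		}
		if sizeVarintDiv(lo) != sizeVarintMul(lo) || sizeVarintDiv(hi) != sizeVarintMul(hi) {
			bad++
		}
	}
	return bad
}

func main() {
	fmt.Println("mismatching bit lengths:", mismatches())
}
```

The check works because both expressions depend on v only through bits.Len64(v), so 65 cases are enough to cover the whole uint64 range. Running it should report zero mismatches.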
Results from the protobuf-go/internal/benchmarks/micro benchmark:
name                            old time/op  new time/op  delta
EmptyMessage/Wire/Marshal-4     40.0ns ± 1%  39.9ns ± 5%    ~     (p=0.683 n=5+5)
EmptyMessage/Wire/Unmarshal-4   20.5ns ± 2%  20.3ns ± 2%    ~     (p=0.317 n=5+5)
EmptyMessage/Wire/Validate-4    21.5ns ± 0%  21.5ns ± 1%    ~     (p=0.825 n=4+5)
EmptyMessage/Clone-4             135ns ± 2%   136ns ± 1%    ~     (p=0.365 n=5+5)
RepeatedInt32/Wire/Marshal-4    4.06µs ± 1%  3.69µs ± 1%  -9.05%  (p=0.008 n=5+5)
RepeatedInt32/Wire/Unmarshal-4  4.72µs ± 0%  4.55µs ± 2%  -3.74%  (p=0.008 n=5+5)
RepeatedInt32/Wire/Validate-4   3.08µs ± 2%  2.94µs ± 0%  -4.69%  (p=0.008 n=5+5)
RepeatedInt32/Clone-4           1.09µs ± 1%  1.09µs ± 0%    ~     (p=0.810 n=5+5)
Required/Wire/Marshal-4          296ns ± 1%   293ns ± 0%  -0.95%  (p=0.000 n=5+4)
Required/Wire/Unmarshal-4        147ns ± 1%   135ns ± 1%  -8.17%  (p=0.008 n=5+5)
Required/Wire/Validate-4         127ns ± 2%   123ns ± 0%  -3.15%  (p=0.000 n=5+4)
Required/Clone-4                 393ns ± 1%   391ns ± 2%    ~     (p=0.238 n=5+5)
Change-Id: Idfe75a9cd80b2bddaf13a8e879403c0c94ebc419
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/221803
Reviewed-by: Damien Neil <dneil@google.com>
diff --git a/internal/encoding/wire/wire.go b/internal/encoding/wire/wire.go
index d7baa7f..e624ff8 100644
--- a/internal/encoding/wire/wire.go
+++ b/internal/encoding/wire/wire.go
@@ -362,7 +362,9 @@
// SizeVarint returns the encoded size of a varint.
// The size is guaranteed to be within 1 and 10, inclusive.
func SizeVarint(v uint64) int {
- return 1 + (bits.Len64(v)-1)/7
+ // This computes 1 + (bits.Len64(v)-1)/7.
+ // 9/64 is a good enough approximation of 1/7
+ return int(9*uint32(bits.Len64(v))+64) / 64
}
// AppendFixed32 appends v to b as a little-endian uint32.