cmd/compile: intrinsify slicebytetostringtmp when not instrumenting

when not instrumenting:
- Intrinsify uses of slicebytetostringtmp within the runtime package
  in the ssa backend.
- Pass OARRAYBYTESTRTMP nodes to the compiler backends for lowering
  instead of generating calls to slicebytetostringtmp.

name                    old time/op  new time/op  delta
ConcatStringAndBytes-4  27.9ns ± 2%  24.7ns ± 2%  -11.52%  (p=0.000 n=43+43)

Fixes #17044

Change-Id: I51ce9c3b93284ce526edd0234f094e98580faf2d
Reviewed-on: https://go-review.googlesource.com/29017
Run-TryBot: Martin Möhrmann <martisch@uos.de>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
diff --git a/src/runtime/string_test.go b/src/runtime/string_test.go
index 6aab0ed..4ee32ea 100644
--- a/src/runtime/string_test.go
+++ b/src/runtime/string_test.go
@@ -82,6 +82,13 @@
 	b.SetBytes(int64(len(s1)))
 }
 
+func BenchmarkConcatStringAndBytes(b *testing.B) {
+	s1 := []byte("Gophers!")
+	for i := 0; i < b.N; i++ {
+		_ = "Hello " + string(s1)
+	}
+}
+
 var stringdata = []struct{ name, data string }{
 	{"ASCII", "01234567890"},
 	{"Japanese", "日本語日本語日本語"},