cmd/internal/gc: improve flow of input params to output params

This includes the following information in the per-function summary:

outK = paramJ   encoded in outK bits for paramJ
outK = *paramJ  encoded in outK bits for paramJ
heap = paramJ   EscHeap
heap = *paramJ  EscContentEscapes

Note that (currently) if the address of a parameter is taken and
returned, necessarily a heap allocation occurred to contain that
reference, and the heap can never refer to stack, therefore the
parameter and everything downstream from it escapes to the heap.

The per-function summary information now has a tuneable number of bits
(2 is probably noticeably better than 1, 3 is likely overkill, but it
is now easy to check and the -m debugging output includes information
that allows you to figure out if more would be better.)

A new test was  added to check pointer flow through struct-typed and
*struct-typed parameters and returns; some of these are sensitive to
the number of summary bits, and ought to yield better results with a
more competent escape analysis algorithm.  Another new test checks
(some) correctness with array parameters, results, and operations.

The old analysis inferred a piece of plan9 runtime was non-escaping by
counteracting overconservative analysis with buggy analysis; with the
bug fixed, the result was too conservative (and it's not easy to fix
in this framework) so the source code was tweaked to get the desired
result.  A test was added against the discovered bug.

The escape analysis was further improved splitting the "level" into
3 parts, one tracking the conventional "level" and the other two
computing the highest-level-suffix-from-copy, which is used to
generally model the cancelling effect of indirection applied to
address-of.

With the improved escape analysis enabled, it was necessary to
modify one of the runtime tests because it now attempts to allocate
too much on the (small, fixed-size) G0 (system) stack and this
failed the test.

Compiling src/std after touching src/runtime/*.go with -m logging
turned on shows 420 fewer heap allocation sites (10538 vs 10968).

Profiling allocations in src/html/template with
for i in {1..5} ;
  do go tool 6g -memprofile=mastx.${i}.prof  -memprofilerate=1 *.go;
  go tool pprof -alloc_objects -text  mastx.${i}.prof ;
done

showed a 15% reduction in allocations performed by the compiler.

Update #3753
Update #4720
Fixes #10466

Change-Id: I0fd97d5f5ac527b45f49e2218d158a6e89951432
Reviewed-on: https://go-review.googlesource.com/8202
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
diff --git a/src/cmd/internal/gc/gen.go b/src/cmd/internal/gc/gen.go
index 4c03915..e6af897 100644
--- a/src/cmd/internal/gc/gen.go
+++ b/src/cmd/internal/gc/gen.go
@@ -23,11 +23,10 @@
 	return n
 }
 
-/*
- * the address of n has been taken and might be used after
- * the current function returns.  mark any local vars
- * as needing to move to the heap.
- */
+// addrescapes tags node n as having had its address taken
+// by "increasing" the "value" of n.Esc to EscHeap.
+// Storage is allocated as necessary to allow the address
+// to be taken.
 func addrescapes(n *Node) {
 	switch n.Op {
 	// probably a type error already.
@@ -50,7 +49,7 @@
 		case PPARAMREF:
 			addrescapes(n.Defn)
 
-			// if func param, need separate temporary
+		// if func param, need separate temporary
 		// to hold heap pointer.
 		// the function type has already been checked
 		// (we're in the function body)
@@ -93,12 +92,12 @@
 	case OIND, ODOTPTR:
 		break
 
-		// ODOTPTR has already been introduced,
+	// ODOTPTR has already been introduced,
 	// so these are the non-pointer ODOT and OINDEX.
 	// In &x[0], if x is a slice, then x does not
 	// escape--the pointer inside x does, but that
 	// is always a heap pointer anyway.
-	case ODOT, OINDEX:
+	case ODOT, OINDEX, OPAREN, OCONVNOP:
 		if !Isslice(n.Left.Type) {
 			addrescapes(n.Left)
 		}