libgo: reduce overhead for memory/block/mutex profiling

Revise the gccgo version of memory/block/mutex profiling to reduce
runtime overhead. The main change is to collect raw stack traces while
the profile is online, then post-process the stacks just before the
profile data is consumed. Memory profiling (at a very low sampling
rate) is enabled by default, and the overhead of the symbolization /
DWARF-reading from backtrace_full was slowing things down relative to
the main Go runtime.
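
As a rough illustration of the idea (not the libgo code itself), the
sketch below records only raw program counters at sample time and
defers symbolization until the result is needed. It uses the public
runtime.Callers and runtime.CallersFrames APIs; the rawSample type and
the capture and symbolize helpers are hypothetical names chosen for
this example.

```go
package main

import (
	"fmt"
	"runtime"
)

// rawSample is a hypothetical record holding only program counters.
// Nothing is symbolized at capture time.
type rawSample struct {
	pcs []uintptr
}

// capture records the raw stack cheaply: no symbol or DWARF lookup.
func capture() rawSample {
	buf := make([]uintptr, 32)
	// Skip 1 frame so the trace starts at capture rather than at
	// runtime.Callers itself.
	n := runtime.Callers(1, buf)
	return rawSample{pcs: buf[:n]}
}

// symbolize does the expensive PC-to-function translation, and is
// meant to run only when the profile is actually read.
func symbolize(s rawSample) []string {
	var names []string
	if len(s.pcs) == 0 {
		return names
	}
	frames := runtime.CallersFrames(s.pcs)
	for {
		f, more := frames.Next()
		if f.Function != "" {
			names = append(names, f.Function)
		}
		if !more {
			break
		}
	}
	return names
}

func main() {
	s := capture() // hot path: store PCs only
	for _, name := range symbolize(s) { // cold path: resolve names
		fmt.Println(name)
	}
}
```

The point of the split is that the cost of symbolization is paid once
per profile read rather than once per sample.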

Change-Id: I40be7071fe6468dc9d2a2e8e84897de916ff67c7
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/171497
Reviewed-by: Ian Lance Taylor <iant@golang.org>
10 files changed