runtime: avoid potential deadlock when tracing memory code

In reclaimChunk, the runtime is calling traceGCSweepDone() while holding the mheap
lock. traceGCSweepDone() can call traceEvent() and traceFlush(). These functions
not only can get various trace locks, but they may also do memory allocations
(runtime.newobject) that may end up getting the mheap lock. So, there may be
either a self-deadlock or a possible deadlock between multiple threads.

It seems better to release the mheap lock before calling traceGCSweepDone(). It is
fine to release the lock, since the operations to get the index of the chunk of
work to do are atomic. We already release the lock to call sweep, so there is no
new behavior for any of the callers of reclaimChunk.

With this change, mheap is a leaf lock (no other lock is ever acquired while it
is held).

Testing: besides normal all.bash, also ran all.bash with --long enabled, since
it does longer tests of runtime/trace.

Change-Id: I4f8cb66c24bb8d424f24d6c2305b4b8387409248
Reviewed-on: https://go-review.googlesource.com/c/go/+/207846
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
diff --git a/src/runtime/mheap.go b/src/runtime/mheap.go
index 114c97b..d8bba1f 100644
--- a/src/runtime/mheap.go
+++ b/src/runtime/mheap.go
@@ -786,7 +786,9 @@
 // reclaimChunk sweeps unmarked spans that start at page indexes [pageIdx, pageIdx+n).
 // It returns the number of pages returned to the heap.
 //
-// h.lock must be held and the caller must be non-preemptible.
+// h.lock must be held and the caller must be non-preemptible. Note: h.lock may be
+// temporarily unlocked and re-locked in order to do sweeping or if tracing is
+// enabled.
 func (h *mheap) reclaimChunk(arenas []arenaIdx, pageIdx, n uintptr) uintptr {
 	// The heap lock must be held because this accesses the
 	// heapArena.spans arrays using potentially non-live pointers.
@@ -842,8 +844,10 @@
 		n -= uintptr(len(inUse) * 8)
 	}
 	if trace.enabled {
+		unlock(&h.lock)
 		// Account for pages scanned but not reclaimed.
 		traceGCSweepSpan((n0 - nFreed) * pageSize)
+		lock(&h.lock)
 	}
 	return nFreed
 }