runtime: move mem profile sampling into m-acquired section

It was not safe to do mcache profiling updates outside the critical
section, but we got lucky because the runtime was not preemptible.
Adding chunked memory clearing (CL 270943) created preemption
opportunities, which led to corruption of runtime data structures.

Fixes #47304.
Fixes #47302.

Change-Id: I461615470d62328a83ccbac537fbdc6dcde81c85
Reviewed-on: https://go-review.googlesource.com/c/go/+/336449
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
diff --git a/src/runtime/malloc.go b/src/runtime/malloc.go
index 2759bbd..cc22b0f 100644
--- a/src/runtime/malloc.go
+++ b/src/runtime/malloc.go
@@ -1135,13 +1135,21 @@
 		msanmalloc(x, size)
 	}
 
+	if rate := MemProfileRate; rate > 0 {
+		// Note cache c only valid while m acquired; see #47302
+		if rate != 1 && size < c.nextSample {
+			c.nextSample -= size
+		} else {
+			profilealloc(mp, x, size)
+		}
+	}
 	mp.mallocing = 0
 	releasem(mp)
 
 	// Pointerfree data can be zeroed late in a context where preemption can occur.
 	// x will keep the memory alive.
 	if !isZeroed && needzero {
-		memclrNoHeapPointersChunked(size, x)
+		memclrNoHeapPointersChunked(size, x) // This is a possible preemption point: see #47302
 	}
 
 	if debug.malloc {
@@ -1155,16 +1163,6 @@
 		}
 	}
 
-	if rate := MemProfileRate; rate > 0 {
-		if rate != 1 && size < c.nextSample {
-			c.nextSample -= size
-		} else {
-			mp := acquirem()
-			profilealloc(mp, x, size)
-			releasem(mp)
-		}
-	}
-
 	if assistG != nil {
 		// Account for internal fragmentation in the assist
 		// debt now that we know it.