[release-branch.go1.14] runtime: make the scavenger's pacing logic more defensive

This change adds two bits of logic to the scavenger's pacing. Firstly,
it checks to make sure we scavenged at least one physical page, if we
released a non-zero amount of memory. If we try to release less than one
physical page, most systems will release the whole page, which could
lead to memory corruption down the road, and this is a signal we're in
this situation.

Secondly, the scavenger's pacing logic now checks to see if the time a
scavenging operation takes is measured to be exactly zero or negative.
The exact zero case can happen if time update granularity is too large
to effectively capture the time the scavenging operation took, like on
Windows where the OS timer frequency is generally 1ms. The negative case
should not happen, but we're being defensive (against kernel bugs, bugs
in the runtime, etc.). If either of these cases happen, we fall back to
Go 1.13 behavior: assume the scavenge operation took around 10µs per
physical page. We ignore huge pages in this case because we're in
unknown territory, so we choose to be conservative about pacing (huge
pages could only increase the rate of scavenging).

Currently, the scavenger is broken on Windows because the granularity of
time measurement is around 1 ms, which is too coarse to measure how fast
we're scavenging, so we often end up with a scavenging time of zero,
followed by NaNs and garbage values in the pacing logic, which usually
leads to the scavenger sleeping forever.

For #38617.
Fixes #38856.

Change-Id: Iaaa2a4cbb21338e1258d010f7362ed58b7db1af7
Reviewed-on: https://go-review.googlesource.com/c/go/+/229997
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Austin Clements <austin@google.com>
(cherry picked from commit c7915376ce3cdd172bf71ca4127c67f196b8e43e)
Reviewed-on: https://go-review.googlesource.com/c/go/+/232743
diff --git a/src/runtime/mgcscavenge.go b/src/runtime/mgcscavenge.go
index c262509..7a217dd 100644
--- a/src/runtime/mgcscavenge.go
+++ b/src/runtime/mgcscavenge.go
@@ -286,6 +286,28 @@
 			continue
 		}
 
+		if released < physPageSize {
+			// If this happens, it means that we may have attempted to release part
+			// of a physical page, but the likely effect of that is that it released
+			// the whole physical page, some of which may have still been in-use.
+			// This could lead to memory corruption. Throw.
+			throw("released less than one physical page of memory")
+		}
+
+		// On some platforms we may see crit as zero if the time it takes to scavenge
+		// memory is less than the minimum granularity of its clock (e.g. Windows).
+		// In this case, just assume scavenging takes 10 µs per regular physical page
+		// (determined empirically), and conservatively ignore the impact of huge pages
+		// on timing.
+		//
+		// We shouldn't ever see a crit value less than zero unless there's a bug of
+		// some kind, either on our side or in the platform we're running on, but be
+		// defensive in that case as well.
+		const approxCritNSPerPhysicalPage = 10e3
+		if crit <= 0 {
+			crit = approxCritNSPerPhysicalPage * float64(released/physPageSize)
+		}
+
 		// Multiply the critical time by 1 + the ratio of the costs of using
 		// scavenged memory vs. scavenging memory. This forces us to pay down
 		// the cost of reusing this memory eagerly by sleeping for a longer period