runtime: force workers out before checking mark roots
Currently we check that all roots are marked as soon as gcMarkDone
decides to transition from mark 1 to mark 2. However, issue #16083
indicates that there may be a race where we try to complete mark 1
while a worker is still scanning a stack, causing the root mark check
to fail.
We don't yet understand this race, but as a simple mitigation, move
the root check to after gcMarkDone performs a ragged barrier, which
will force any remaining workers to finish their current job.
Updates #16083. This may "fix" it, but it would be better to
understand and fix the underlying race.
Change-Id: I1af9ce67bd87ade7bc2a067295d79c28cd11abd2
Reviewed-on: https://go-review.googlesource.com/35353
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
diff --git a/src/runtime/mgc.go b/src/runtime/mgc.go
index 64a2f3a..0b996d8 100644
--- a/src/runtime/mgc.go
+++ b/src/runtime/mgc.go
@@ -1129,8 +1129,6 @@
// sitting in the per-P work caches.
// Flush and disable work caches.
- gcMarkRootCheck()
-
// Disallow caching workbufs and indicate that we're in mark 2.
gcBlackenPromptly = true
@@ -1153,6 +1151,16 @@
})
})
+ // Check that roots are marked. We should be able to
+ // do this before the forEachP, but based on issue
+ // #16083 there may be a (harmless) race where we can
+ // enter mark 2 while some workers are still scanning
+ // stacks. The forEachP ensures these scans are done.
+ //
+ // TODO(austin): Figure out the race and fix this
+ // properly.
+ gcMarkRootCheck()
+
// Now we can start up mark 2 workers.
atomic.Xaddint64(&gcController.dedicatedMarkWorkersNeeded, 0xffffffff)
atomic.Xaddint64(&gcController.fractionalMarkWorkersNeeded, 0xffffffff)