runtime: fix mcall unwinding on Windows

The Windows native stack unwinder incorrectly classifies the next
instruction after the mcall callback call as being part of the function
epilogue, producing a wrong call stack.

Add a NOP after the callback call to work around this issue.

Fixes #67007.

Change-Id: I6017635da895b272b1852391db9a255ca69e335d
Reviewed-on: https://go-review.googlesource.com/c/go/+/581335
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Quim Muntal <quimmuntal@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
diff --git a/src/runtime/asm_amd64.s b/src/runtime/asm_amd64.s
index 1071d27..cb21629 100644
--- a/src/runtime/asm_amd64.s
+++ b/src/runtime/asm_amd64.s
@@ -456,6 +456,10 @@
 	PUSHQ	AX	// open up space for fn's arg spill slot
 	MOVQ	0(DX), R12
 	CALL	R12		// fn(g)
+	// The Windows native stack unwinder incorrectly classifies the next instruction
+	// as part of the function epilogue, producing a wrong call stack.
+	// Add a NOP to work around this issue. See go.dev/issue/67007.
+	BYTE	$0x90
 	POPQ	AX
 	JMP	runtime·badmcall2(SB)
 	RET