runtime, cmd/gc, cmd/ld: ignore method wrappers in recover

Bug #1:

Issue 5406 identified an interesting case:
        defer iface.M()
may end up calling a wrapper that copies an indirect receiver
from the iface value and then calls the real M method. That's
two calls down, not just one, and so recover() == nil always
in the real M method, even during a panic.

[For the purposes of this entire discussion, a wrapper's
implementation is a function containing an ordinary call, not
the optimized tail call form that is somtimes possible. The
tail call does not create a second frame, so it is already
handled correctly.]

Fix this bug by introducing g->panicwrap, which counts the
number of bytes on current stack segment that are due to
wrapper calls that should not count against the recover
check. All wrapper functions must now adjust g->panicwrap up
on entry and back down on exit. This adds slightly to their
expense; on the x86 it is a single instruction at entry and
exit; on the ARM it is three. However, the alternative is to
make a call to recover depend on being able to walk the stack,
which I very much want to avoid. We have enough problems
walking the stack for garbage collection and profiling.
Also, if performance is critical in a specific case, it is already
faster to use a pointer receiver and avoid this kind of wrapper
entirely.

Bug #2:

The old code, which did not consider the possibility of two
calls, already contained a check to see if the call had split
its stack and so the panic-created segment was one behind the
current segment. In the wrapper case, both of the two calls
might split their stacks, so the panic-created segment can be
two behind the current segment.

Fix this by propagating the Stktop.panic flag forward during
stack splits instead of looking backward during recover.

Fixes #5406.

R=golang-dev, iant
CC=golang-dev
https://golang.org/cl/13367052
diff --git a/src/pkg/runtime/asm_386.s b/src/pkg/runtime/asm_386.s
index 79f2e79..5c642c0 100644
--- a/src/pkg/runtime/asm_386.s
+++ b/src/pkg/runtime/asm_386.s
@@ -340,7 +340,7 @@
 	JMP	AX
 
 #define CALLFN(NAME,MAXSIZE)			\
-TEXT runtime·NAME(SB), 0, $MAXSIZE-12;		\
+TEXT runtime·NAME(SB), WRAPPER, $MAXSIZE-12;		\
 	/* copy arguments to stack */		\
 	MOVL	argptr+4(FP), SI;		\
 	MOVL	argsize+8(FP), CX;		\
diff --git a/src/pkg/runtime/asm_amd64.s b/src/pkg/runtime/asm_amd64.s
index a85056c..2c2ffed 100644
--- a/src/pkg/runtime/asm_amd64.s
+++ b/src/pkg/runtime/asm_amd64.s
@@ -319,7 +319,7 @@
 	JMP	AX
 
 #define CALLFN(NAME,MAXSIZE)			\
-TEXT runtime·NAME(SB), 0, $MAXSIZE-20;		\
+TEXT runtime·NAME(SB), WRAPPER, $MAXSIZE-20;		\
 	/* copy arguments to stack */		\
 	MOVQ	argptr+8(FP), SI;		\
 	MOVLQZX argsize+16(FP), CX;		\
diff --git a/src/pkg/runtime/asm_arm.s b/src/pkg/runtime/asm_arm.s
index b66f80e..f483e6f 100644
--- a/src/pkg/runtime/asm_arm.s
+++ b/src/pkg/runtime/asm_arm.s
@@ -300,7 +300,7 @@
 	B	(R1)
 
 #define CALLFN(NAME,MAXSIZE)			\
-TEXT runtime·NAME(SB), 0, $MAXSIZE-12;		\
+TEXT runtime·NAME(SB), WRAPPER, $MAXSIZE-12;		\
 	/* copy arguments to stack */		\
 	MOVW	argptr+4(FP), R0;		\
 	MOVW	argsize+8(FP), R2;		\
diff --git a/src/pkg/runtime/panic.c b/src/pkg/runtime/panic.c
index c14d520..a1e91d3 100644
--- a/src/pkg/runtime/panic.c
+++ b/src/pkg/runtime/panic.c
@@ -339,69 +339,27 @@
 void
 runtime·recover(byte *argp, Eface ret)
 {
-	Stktop *top, *oldtop;
 	Panic *p;
+	Stktop *top;
 
-	// Must be a panic going on.
-	if((p = g->panic) == nil || p->recovered)
-		goto nomatch;
-
-	// Frame must be at the top of the stack segment,
-	// because each deferred call starts a new stack
-	// segment as a side effect of using reflect.call.
-	// (There has to be some way to remember the
-	// variable argument frame size, and the segment
-	// code already takes care of that for us, so we
-	// reuse it.)
-	//
-	// As usual closures complicate things: the fp that
-	// the closure implementation function claims to have
-	// is where the explicit arguments start, after the
-	// implicit pointer arguments and PC slot.
-	// If we're on the first new segment for a closure,
-	// then fp == top - top->args is correct, but if
-	// the closure has its own big argument frame and
-	// allocated a second segment (see below),
-	// the fp is slightly above top - top->args.
-	// That condition can't happen normally though
-	// (stack pointers go down, not up), so we can accept
-	// any fp between top and top - top->args as
-	// indicating the top of the segment.
+	// Must be an unrecovered panic in progress.
+	// Must be on a stack segment created for a deferred call during a panic.
+	// Must be at the top of that segment, meaning the deferred call itself
+	// and not something it called. The top frame in the segment will have
+	// argument pointer argp == top - top->argsize.
+	// The subtraction of g->panicwrap allows wrapper functions that
+	// do not count as official calls to adjust what we consider the top frame
+	// while they are active on the stack. The linker emits adjustments of
+	// g->panicwrap in the prologue and epilogue of functions marked as wrappers.
 	top = (Stktop*)g->stackbase;
-	if(argp < (byte*)top - top->argsize || (byte*)top < argp)
-		goto nomatch;
-
-	// The deferred call makes a new segment big enough
-	// for the argument frame but not necessarily big
-	// enough for the function's local frame (size unknown
-	// at the time of the call), so the function might have
-	// made its own segment immediately.  If that's the
-	// case, back top up to the older one, the one that
-	// reflect.call would have made for the panic.
-	//
-	// The fp comparison here checks that the argument
-	// frame that was copied during the split (the top->args
-	// bytes above top->fp) abuts the old top of stack.
-	// This is a correct test for both closure and non-closure code.
-	oldtop = (Stktop*)top->stackbase;
-	if(oldtop != nil && top->argp == (byte*)oldtop - top->argsize)
-		top = oldtop;
-
-	// Now we have the segment that was created to
-	// run this call.  It must have been marked as a panic segment.
-	if(!top->panic)
-		goto nomatch;
-
-	// Okay, this is the top frame of a deferred call
-	// in response to a panic.  It can see the panic argument.
-	p->recovered = 1;
-	ret = p->arg;
-	FLUSH(&ret);
-	return;
-
-nomatch:
-	ret.type = nil;
-	ret.data = nil;
+	p = g->panic;
+	if(p != nil && !p->recovered && top->panic && argp == (byte*)top - top->argsize - g->panicwrap) {
+		p->recovered = 1;
+		ret = p->arg;
+	} else {
+		ret.type = nil;
+		ret.data = nil;
+	}
 	FLUSH(&ret);
 }
 
diff --git a/src/pkg/runtime/proc.c b/src/pkg/runtime/proc.c
index d37014f..0edd7e0 100644
--- a/src/pkg/runtime/proc.c
+++ b/src/pkg/runtime/proc.c
@@ -1773,6 +1773,7 @@
 	newg->gopc = (uintptr)callerpc;
 	newg->status = Grunnable;
 	newg->goid = runtime·xadd64(&runtime·sched.goidgen, 1);
+	newg->panicwrap = 0;
 	if(raceenabled)
 		newg->racectx = runtime·racegostart((void*)callerpc);
 	runqput(m->p, newg);
diff --git a/src/pkg/runtime/runtime.h b/src/pkg/runtime/runtime.h
index 151804f..9974fa3 100644
--- a/src/pkg/runtime/runtime.h
+++ b/src/pkg/runtime/runtime.h
@@ -250,6 +250,8 @@
 	// stackguard0 can be set to StackPreempt as opposed to stackguard
 	uintptr	stackguard0;	// cannot move - also known to linker, libmach, runtime/cgo
 	uintptr	stackbase;	// cannot move - also known to libmach, runtime/cgo
+	uint32	panicwrap;	// cannot move - also known to linker
+	uint32	selgen;		// valid sudog pointer
 	Defer*	defer;
 	Panic*	panic;
 	Gobuf	sched;
@@ -264,7 +266,6 @@
 	void*	param;		// passed parameter on wakeup
 	int16	status;
 	int64	goid;
-	uint32	selgen;		// valid sudog pointer
 	int8*	waitreason;	// if status==Gwaiting
 	G*	schedlink;
 	bool	ispanic;
@@ -403,6 +404,7 @@
 	uintptr	stackbase;
 	Gobuf	gobuf;
 	uint32	argsize;
+	uint32	panicwrap;
 
 	uint8*	argp;	// pointer to arguments in old frame
 	uintptr	free;	// if free>0, call stackfree using free as size
diff --git a/src/pkg/runtime/stack.c b/src/pkg/runtime/stack.c
index 6b34f09..011c616 100644
--- a/src/pkg/runtime/stack.c
+++ b/src/pkg/runtime/stack.c
@@ -174,6 +174,7 @@
 	gp->stackbase = top->stackbase;
 	gp->stackguard = top->stackguard;
 	gp->stackguard0 = gp->stackguard;
+	gp->panicwrap = top->panicwrap;
 
 	if(top->free != 0) {
 		gp->stacksize -= top->free;
@@ -195,7 +196,7 @@
 runtime·newstack(void)
 {
 	int32 framesize, argsize, oldstatus;
-	Stktop *top;
+	Stktop *top, *oldtop;
 	byte *stk;
 	uintptr sp;
 	uintptr *src, *dst, *dstend;
@@ -316,6 +317,16 @@
 	// copy flag from panic
 	top->panic = gp->ispanic;
 	gp->ispanic = false;
+	
+	// if this isn't a panic, maybe we're splitting the stack for a panic.
+	// if we're splitting in the top frame, propagate the panic flag
+	// forward so that recover will know we're in a panic.
+	oldtop = (Stktop*)top->stackbase;
+	if(oldtop != nil && oldtop->panic && top->argp == (byte*)oldtop - oldtop->argsize - gp->panicwrap)
+		top->panic = true;
+
+	top->panicwrap = gp->panicwrap;
+	gp->panicwrap = 0;
 
 	gp->stackbase = (uintptr)top;
 	gp->stackguard = (uintptr)stk + StackGuard;