runtime: provide room for first 4 syscall parameters in windows usleep2

Windows amd64 requires all syscall callers to provide room for first
4 parameters on stack. We do that for all our syscalls, except inside
of usleep2. In https://codereview.appspot.com/7563043#msg3 rsc says:

"We don't need the stack alignment and first 4 parameters on amd64
because it's just a system call, not an ordinary function call."

He seems to be wrong on both counts. But alignment is already fixed.
Fix parameter space now too.

Fixes #12444

Change-Id: I66a2a18d2f2c3846e3aa556cc3acc8ec6240bea0
Reviewed-on: https://go-review.googlesource.com/14282
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
diff --git a/src/runtime/sys_windows_amd64.s b/src/runtime/sys_windows_amd64.s
index ea4f3e0..b15eacb 100644
--- a/src/runtime/sys_windows_amd64.s
+++ b/src/runtime/sys_windows_amd64.s
@@ -428,19 +428,21 @@
 	RET
 
 // Runs on OS stack. duration (in 100ns units) is in BX.
-TEXT runtime·usleep2(SB),NOSPLIT,$16
+// The function leaves room for 4 syscall parameters
+// (as per windows amd64 calling convention).
+TEXT runtime·usleep2(SB),NOSPLIT,$48
 	MOVQ	SP, AX
 	ANDQ	$~15, SP	// alignment as per Windows requirement
-	MOVQ	AX, 8(SP)
+	MOVQ	AX, 40(SP)
 	// Want negative 100ns units.
 	NEGQ	BX
-	MOVQ	SP, R8 // ptime
+	LEAQ	32(SP), R8  // ptime
 	MOVQ	BX, (R8)
 	MOVQ	$-1, CX // handle
 	MOVQ	$0, DX // alertable
 	MOVQ	runtime·_NtWaitForSingleObject(SB), AX
 	CALL	AX
-	MOVQ	8(SP), SP
+	MOVQ	40(SP), SP
 	RET
 
 // func now() (sec int64, nsec int32)