all: remove 'extern register M *m' from runtime
The runtime has historically held two dedicated values g (current goroutine)
and m (current thread) in 'extern register' slots (TLS on x86, real registers
backed by TLS on ARM).
This CL removes the extern register m; code now uses g->m.
On ARM, this frees up the register that formerly held m (R9).
This is important for NaCl, because NaCl ARM code cannot use R9 at all.
The Go 1 macrobenchmarks (those with per-op times >= 10 µs) are unaffected:
BenchmarkBinaryTree17 5491374955 5471024381 -0.37%
BenchmarkFannkuch11 4357101311 4275174828 -1.88%
BenchmarkGobDecode 11029957 11364184 +3.03%
BenchmarkGobEncode 6852205 6784822 -0.98%
BenchmarkGzip 650795967 650152275 -0.10%
BenchmarkGunzip 140962363 141041670 +0.06%
BenchmarkHTTPClientServer 71581 73081 +2.10%
BenchmarkJSONEncode 31928079 31913356 -0.05%
BenchmarkJSONDecode 117470065 113689916 -3.22%
BenchmarkMandelbrot200 6008923 5998712 -0.17%
BenchmarkGoParse 6310917 6327487 +0.26%
BenchmarkRegexpMatchMedium_1K 114568 114763 +0.17%
BenchmarkRegexpMatchHard_1K 168977 169244 +0.16%
BenchmarkRevcomp 935294971 914060918 -2.27%
BenchmarkTemplate 145917123 148186096 +1.55%
Minux previous reported larger variations, but these were caused by
run-to-run noise, not repeatable slowdowns.
Actual code changes by Minux.
I only did the docs and the benchmarking.
LGTM=dvyukov, iant, minux
R=minux, josharian, iant, dave, bradfitz, dvyukov
CC=golang-codereviews
https://golang.org/cl/109050043
diff --git a/doc/asm.html b/doc/asm.html
index d44cb79..f4ef1e6 100644
--- a/doc/asm.html
+++ b/doc/asm.html
@@ -149,7 +149,7 @@
<p>
Instructions, registers, and assembler directives are always in UPPER CASE to remind you
that assembly programming is a fraught endeavor.
-(Exceptions: the <code>m</code> and <code>g</code> register renamings on ARM.)
+(Exception: the <code>g</code> register renaming on ARM.)
</p>
<p>
@@ -344,7 +344,7 @@
<h3 id="x86">32-bit Intel 386</h3>
<p>
-The runtime pointers to the <code>m</code> and <code>g</code> structures are maintained
+The runtime pointer to the <code>g</code> structure is maintained
through the value of an otherwise unused (as far as Go is concerned) register in the MMU.
A OS-dependent macro <code>get_tls</code> is defined for the assembler if the source includes
an architecture-dependent header file, like this:
@@ -356,14 +356,15 @@
<p>
Within the runtime, the <code>get_tls</code> macro loads its argument register
-with a pointer to a pair of words representing the <code>g</code> and <code>m</code> pointers.
+with a pointer to the <code>g</code> pointer, and the <code>g</code> struct
+contains the <code>m</code> pointer.
The sequence to load <code>g</code> and <code>m</code> using <code>CX</code> looks like this:
</p>
<pre>
get_tls(CX)
-MOVL g(CX), AX // Move g into AX.
-MOVL m(CX), BX // Move m into BX.
+MOVL g(CX), AX // Move g into AX.
+MOVL g_m(AX), BX // Move g->m into BX.
</pre>
<h3 id="amd64">64-bit Intel 386 (a.k.a. amd64)</h3>
@@ -376,22 +377,21 @@
<pre>
get_tls(CX)
-MOVQ g(CX), AX // Move g into AX.
-MOVQ m(CX), BX // Move m into BX.
+MOVQ g(CX), AX // Move g into AX.
+MOVQ g_m(AX), BX // Move g->m into BX.
</pre>
<h3 id="arm">ARM</h3>
<p>
-The registers <code>R9</code>, <code>R10</code>, and <code>R11</code>
+The registers <code>R10</code> and <code>R11</code>
are reserved by the compiler and linker.
</p>
<p>
-<code>R9</code> and <code>R10</code> point to the <code>m</code> (machine) and <code>g</code>
-(goroutine) structures, respectively.
-Within assembler source code, these pointers must be referred to as <code>m</code> and <code>g</code>;
-the names <code>R9</code> and <code>R10</code> are not recognized.
+<code>R10</code> points to the <code>g</code> (goroutine) structure.
+Within assembler source code, this pointer must be referred to as <code>g</code>;
+the name <code>R10</code> is not recognized.
</p>
<p>