Blame - doc/asm.html - go

blob: 771c493cc2f593d0b0dd9b4625b7da1718d41ba4

Rob Pike	2fbcb08	2013-11-12 20:04:22 -0800	[diff] [blame]	1	<!--{
				2	"Title": "A Quick Guide to Go's Assembler",
Russ Cox	a664b49	2013-11-13 21:29:34 -0500	[diff] [blame]	3	"Path": "/doc/asm"
Rob Pike	2fbcb08	2013-11-12 20:04:22 -0800	[diff] [blame]	4	}-->
				5
				6	<h2 id="introduction">A Quick Guide to Go's Assembler</h2>
				7
				8	<p>
				9	This document is a quick outline of the unusual form of assembly language used by the <code>gc</code>
				10	suite of Go compilers (<code>6g</code>, <code>8g</code>, etc.).
Rob Pike	edebe10	2014-04-15 16:27:48 -0700	[diff] [blame]	11	The document is not comprehensive.
				12	</p>
				13
				14	<p>
				15	The assembler is based on the input to the Plan 9 assemblers, which is documented in detail
Rob Pike	2fbcb08	2013-11-12 20:04:22 -0800	[diff] [blame]	16	<a href="http://plan9.bell-labs.com/sys/doc/asm.html">on the Plan 9 site</a>.
				17	If you plan to write assembly language, you should read that document, although much of it is Plan 9-specific.
				18	This document provides a summary of the syntax and
				19	describes the peculiarities that apply when writing assembly code to interact with Go.
				20	</p>
				21
				22	<p>
				23	The most important thing to know about Go's assembler is that it is not a direct representation of the underlying machine.
				24	Some of the details map precisely to the machine, but some do not.
				25	This is because the compiler suite (see
				26	<a href="http://plan9.bell-labs.com/sys/doc/compiler.html">this description</a>)
				27	needs no assembler pass in the usual pipeline.
				28	Instead, the compiler emits a kind of incompletely defined instruction set, in binary form, which the linker
				29	then completes.
				30	In particular, the linker does instruction selection, so when you see an instruction like <code>MOV</code>,
				31	what the linker actually generates for that operation might not be a move instruction at all, but perhaps a clear or a load.
				32	Or it might correspond exactly to the machine instruction with that name.
				33	In general, machine-specific operations tend to appear as themselves, while more general concepts like
				34	memory move and subroutine call and return are more abstract.
				35	The details vary with architecture, and we apologize for the imprecision; the situation is not well-defined.
				36	</p>
				37
				38	<p>
				39	The assembler program is a way to generate that intermediate, incompletely defined instruction sequence
				40	as input for the linker.
				41	If you want to see what the instructions look like in assembly for a given architecture, say amd64, there
				42	are many examples in the sources of the standard library, in packages such as
				43	<a href="/pkg/runtime/"><code>runtime</code></a> and
				44	<a href="/pkg/math/big/"><code>math/big</code></a>.
				45	You can also examine what the compiler emits as assembly code:
				46	</p>
				47
				48	<pre>
				49	$ cat x.go
				50	package main
				51
				52	func main() {
				53		println(3)
				54	}
				55	$ go tool 6g -S x.go # or: go build -gcflags -S x.go
				56
				57	--- prog list "main" ---
				58	0000 (x.go:3) TEXT main+0(SB),$8-0
				59	0001 (x.go:3) FUNCDATA $0,gcargs·0+0(SB)
				60	0002 (x.go:3) FUNCDATA $1,gclocals·0+0(SB)
				61	0003 (x.go:4) MOVQ $3,(SP)
				62	0004 (x.go:4) PCDATA $0,$8
				63	0005 (x.go:4) CALL ,runtime.printint+0(SB)
				64	0006 (x.go:4) PCDATA $0,$-1
				65	0007 (x.go:4) PCDATA $0,$0
				66	0008 (x.go:4) CALL ,runtime.printnl+0(SB)
				67	0009 (x.go:4) PCDATA $0,$-1
				68	0010 (x.go:5) RET ,
				69	...
				70	</pre>
				71
				72	<p>
				73	The <code>FUNCDATA</code> and <code>PCDATA</code> directives contain information
				74	for use by the garbage collector; they are introduced by the compiler.
				75	</p>
				76
Rob Pike	edebe10	2014-04-15 16:27:48 -0700	[diff] [blame]	77	<!-- Commenting out because the feature is gone but it's popular and may come back.
				78
Rob Pike	2fbcb08	2013-11-12 20:04:22 -0800	[diff] [blame]	79	<p>
				80	To see what gets put in the binary after linking, add the <code>-a</code> flag to the linker:
				81	</p>
				82
				83	<pre>
				84	$ go tool 6l -a x.6 # or: go build -ldflags -a x.go
				85	codeblk [0x2000,0x1d059) at offset 0x1000
				86	002000 main.main | (3) TEXT main.main+0(SB),$8
				87	002000 65488b0c25a0080000 | (3) MOVQ 2208(GS),CX
				88	002009 483b21 | (3) CMPQ SP,(CX)
				89	00200c 7707 | (3) JHI ,2015
				90	00200e e83da20100 | (3) CALL ,1c250+runtime.morestack00
				91	002013 ebeb | (3) JMP ,2000
				92	002015 4883ec08 | (3) SUBQ $8,SP
				93	002019 | (3) FUNCDATA $0,main.gcargs·0+0(SB)
				94	002019 | (3) FUNCDATA $1,main.gclocals·0+0(SB)
				95	002019 48c7042403000000 | (4) MOVQ $3,(SP)
				96	002021 | (4) PCDATA $0,$8
				97	002021 e8aad20000 | (4) CALL ,f2d0+runtime.printint
				98	002026 | (4) PCDATA $0,$-1
				99	002026 | (4) PCDATA $0,$0
				100	002026 e865d40000 | (4) CALL ,f490+runtime.printnl
				101	00202b | (4) PCDATA $0,$-1
				102	00202b 4883c408 | (5) ADDQ $8,SP
				103	00202f c3 | (5) RET ,
				104	...
				105	</pre>
				106
Rob Pike	edebe10	2014-04-15 16:27:48 -0700	[diff] [blame]	107	-->
Rob Pike	2fbcb08	2013-11-12 20:04:22 -0800	[diff] [blame]	108
				109	<h3 id="symbols">Symbols</h3>
				110
				111	<p>
				112	Some symbols, such as <code>PC</code>, <code>R0</code> and <code>SP</code>, are predeclared and refer to registers.
				113	There are two other predeclared symbols, <code>SB</code> (static base) and <code>FP</code> (frame pointer).
				114	All user-defined symbols other than jump labels are written as offsets to these pseudo-registers.
				115	</p>
				116
				117	<p>
				118	The <code>SB</code> pseudo-register can be thought of as the origin of memory, so the symbol <code>foo(SB)</code>
				119	is the name <code>foo</code> as an address in memory.
Russ Cox	202bf8d	2014-10-28 15:51:06 -0400	[diff] [blame]	120	This form is used to name global functions and data.
				121	Adding <code>&lt;&gt;</code> to the name, as in <code>foo&lt;&gt;(SB)</code>, makes the name
				122	visible only in the current source file, like a top-level <code>static</code> declaration in a C file.
Rob Pike	2fbcb08	2013-11-12 20:04:22 -0800	[diff] [blame]	123	</p>
				124
				125	<p>
Russ Cox	a664b49	2013-11-13 21:29:34 -0500	[diff] [blame]	126	The <code>FP</code> pseudo-register is a virtual frame pointer
				127	used to refer to function arguments.
Rob Pike	2fbcb08	2013-11-12 20:04:22 -0800	[diff] [blame]	128	The compilers maintain a virtual frame pointer and refer to the arguments on the stack as offsets from that pseudo-register.
				129	Thus <code>0(FP)</code> is the first argument to the function,
				130	<code>8(FP)</code> is the second (on a 64-bit machine), and so on.
Russ Cox	a664b49	2013-11-13 21:29:34 -0500	[diff] [blame]	131	When referring to a function argument this way, it is conventional to place the name
				132	at the beginning, as in <code>first_arg+0(FP)</code> and <code>second_arg+8(FP)</code>.
				133	Some of the assemblers enforce this convention, rejecting plain <code>0(FP)</code> and <code>8(FP)</code>.
Russ Cox	202bf8d	2014-10-28 15:51:06 -0400	[diff] [blame]	134	For assembly functions with Go prototypes, <code>go</code> <code>vet</code> will check that the argument names
Russ Cox	a664b49	2013-11-13 21:29:34 -0500	[diff] [blame]	135	and offsets match.
Russ Cox	202bf8d	2014-10-28 15:51:06 -0400	[diff] [blame]	136	On 32-bit systems, the low and high 32 bits of a 64-bit value are distinguished by adding
				137	a <code>_lo</code> or <code>_hi</code> suffix to the name, as in <code>arg_lo+0(FP)</code> or <code>arg_hi+4(FP)</code>.
				138	If a Go prototype does not name its result, the expected assembly name is <code>ret</code>.
Russ Cox	a664b49	2013-11-13 21:29:34 -0500	[diff] [blame]	139	</p>
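
<p>
As a rough illustration of the naming convention, the following Go sketch computes the
<code>name+offset(FP)</code> operands for a list of arguments.
The <code>arg</code> type and the <code>fpRefs</code> function are hypothetical helpers invented for this example,
and the layout rule (each argument aligned to its own size) is a simplifying assumption that
holds for plain integers and pointers but not necessarily for every Go type.
</p>

<pre>
package main

import "fmt"

// arg describes one function argument: its assembly name and its size in bytes.
// The size must be positive.
type arg struct {
	name string
	size int
}

// fpRefs returns the conventional name+offset(FP) operand for each argument.
// Simplifying assumption: every argument is aligned to its own size.
func fpRefs(args []arg) []string {
	refs := make([]string, 0, len(args))
	off := 0
	for _, a := range args {
		// Round the offset up to the next multiple of the argument size.
		off = (off + a.size - 1) / a.size * a.size
		refs = append(refs, fmt.Sprintf("%s+%d(FP)", a.name, off))
		off += a.size
	}
	return refs
}

func main() {
	// Two 8-byte arguments, as on a 64-bit machine.
	for _, r := range fpRefs([]arg{{"first_arg", 8}, {"second_arg", 8}}) {
		fmt.Println(r)
	}
}
</pre>

<p>
For the two 8-byte arguments above, the sketch prints <code>first_arg+0(FP)</code> and <code>second_arg+8(FP)</code>.
For real functions, <code>go</code> <code>vet</code> remains the authority on names and offsets.
</p>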
				140
				141	<p>
				142	The <code>SP</code> pseudo-register is a virtual stack pointer
				143	used to refer to frame-local variables and the arguments being
				144	prepared for function calls.
				145	It points to the top of the local stack frame, so references should use negative offsets
				146	in the range [−framesize, 0):
				147	<code>x-8(SP)</code>, <code>y-4(SP)</code>, and so on.
				148	On architectures with a real register named <code>SP</code>, the name prefix distinguishes
				149	references to the virtual stack pointer from references to the architectural <code>SP</code> register.
				150	That is, <code>x-8(SP)</code> and <code>-8(SP)</code> are different memory locations:
				151	the first refers to the virtual stack pointer pseudo-register, while the second refers to the
				152	hardware's <code>SP</code> register.
Rob Pike	2fbcb08	2013-11-12 20:04:22 -0800	[diff] [blame]	153	</p>
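
<p>
The offset rule for the virtual stack pointer can be sketched in Go as a small check.
The <code>localRef</code> helper below is hypothetical and exists only to make the
range [−framesize, 0) concrete; it is not part of any Go tool.
</p>

<pre>
package main

import "fmt"

// localRef formats a reference to a frame-local variable through the
// virtual SP, rejecting offsets outside the range [-framesize, 0).
func localRef(name string, off, framesize int) (string, error) {
	if off >= 0 {
		return "", fmt.Errorf("offset %d is not negative", off)
	}
	if -framesize > off {
		return "", fmt.Errorf("offset %d is below the %d-byte frame", off, framesize)
	}
	// The name prefix is what selects the virtual stack pointer.
	return fmt.Sprintf("%s%d(SP)", name, off), nil
}

func main() {
	fmt.Println(localRef("x", -8, 16))
	fmt.Println(localRef("y", -4, 16))
	fmt.Println(localRef("z", 0, 16)) // rejected: offset 0 is outside the frame
}
</pre>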
				154
				155	<p>
				156	Instructions, registers, and assembler directives are always in UPPER CASE to remind you
				157	that assembly programming is a fraught endeavor.
Russ Cox	89f185f	2014-06-26 11:54:39 -0400	[diff] [blame]	158	(Exception: the <code>g</code> register renaming on ARM.)
Rob Pike	2fbcb08	2013-11-12 20:04:22 -0800	[diff] [blame]	159	</p>
				160
				161	<p>
				162	In Go object files and binaries, the full name of a symbol is the
				163	package path followed by a period and the symbol name:
				164	<code>fmt.Printf</code> or <code>math/rand.Int</code>.
				165	Because the assembler's parser treats period and slash as punctuation,
				166	those strings cannot be used directly as identifier names.
				167	Instead, the assembler allows the middle dot character U+00B7
				168	and the division slash U+2215 in identifiers and rewrites them to
				169	plain period and slash.
				170	Within an assembler source file, the symbols above are written as
				171	<code>fmt·Printf</code> and <code>math∕rand·Int</code>.
				172	The assembly listings generated by the compilers when using the <code>-S</code> flag
				173	show the period and slash directly instead of the Unicode replacements
				174	required by the assemblers.
				175	</p>
				176
				177	<p>
				178	Most hand-written assembly files do not include the full package path
				179	in symbol names, because the linker inserts the package path of the current
				180	object file at the beginning of any name starting with a period:
				181	in an assembly source file within the math/rand package implementation,
				182	the package's Int function can be referred to as <code>·Int</code>.
				183	This convention avoids the need to hard-code a package's import path in its
				184	own source code, making it easier to move the code from one location to another.
				185	</p>
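
<p>
The two naming rules above can be modeled in a few lines of Go.
The <code>linkName</code> function is a hypothetical helper for illustration only:
in the real tool chain the character rewriting and the package-path insertion happen in different tools,
and this sketch merges them into one step.
</p>

<pre>
package main

import (
	"fmt"
	"strings"
)

// linkName models how a symbol written in assembly source becomes the
// full name stored in the object file. pkgPath is the import path of
// the package being assembled.
func linkName(pkgPath, sym string) string {
	// Middle dot (U+00B7) becomes a period, division slash (U+2215) a slash.
	name := strings.NewReplacer("\u00b7", ".", "\u2215", "/").Replace(sym)
	// A name starting with a period gets the current package path prepended.
	if strings.HasPrefix(name, ".") {
		name = pkgPath + name
	}
	return name
}

func main() {
	fmt.Println(linkName("math/rand", "·Int"))
	fmt.Println(linkName("math/rand", "math∕rand·Int"))
	fmt.Println(linkName("math/rand", "fmt·Printf"))
}
</pre>

<p>
Under these assumptions the first two calls both yield <code>math/rand.Int</code>,
and the third yields <code>fmt.Printf</code>.
</p>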
				186
				187	<h3 id="directives">Directives</h3>
				188
				189	<p>
				190	The assembler uses various directives to bind text and data to symbol names.
				191	For example, here is a simple complete function definition. The <code>TEXT</code>
				192	directive declares the symbol <code>runtime·profileloop</code> and the instructions
				193	that follow form the body of the function.
				194	The last instruction in a <code>TEXT</code> block must be some sort of jump, usually a <code>RET</code> (pseudo-)instruction.
				195	(If it's not, the linker will append a jump-to-itself instruction; there is no fallthrough in <code>TEXTs</code>.)
				196	After the symbol, the arguments are flags (see below)
				197	and the frame size, a constant (but see below):
				198	</p>
				199
				200	<pre>
				201	TEXT runtime·profileloop(SB),NOSPLIT,$8
				202		MOVQ $runtime·profileloop1(SB), CX
				203		MOVQ CX, 0(SP)
				204		CALL runtime·externalthreadhandler(SB)
				205		RET
				206	</pre>
				207
				208	<p>
				209	In the general case, the frame size is followed by an argument size, separated by a minus sign.
Brad Fitzpatrick	6607534	2014-04-27 07:40:48 -0700	[diff] [blame]	210	(It's not a subtraction, just idiosyncratic syntax.)
Rob Pike	2fbcb08	2013-11-12 20:04:22 -0800	[diff] [blame]	211	The frame size <code>$24-8</code> states that the function has a 24-byte frame
				212	and is called with 8 bytes of argument, which live on the caller's frame.
				213	If <code>NOSPLIT</code> is not specified for the <code>TEXT</code>,
				214	the argument size must be provided.
Russ Cox	202bf8d	2014-10-28 15:51:06 -0400	[diff] [blame]	215	For assembly functions with Go prototypes, <code>go</code> <code>vet</code> will check that the
				216	argument size is correct.
Rob Pike	2fbcb08	2013-11-12 20:04:22 -0800	[diff] [blame]	217	</p>
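
<p>
To underline that the minus sign is a separator and not a subtraction, here is a small Go sketch
that splits such an operand into its two numbers.
The <code>parseFrame</code> function is a hypothetical helper, not the assembler's parser,
and it deliberately does not handle negative frame sizes.
</p>

<pre>
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseFrame splits a TEXT size operand such as "$24-8" into the frame
// size and the argument size. The argument size is -1 when it is omitted.
func parseFrame(spec string) (frame, args int, err error) {
	parts := strings.SplitN(strings.TrimPrefix(spec, "$"), "-", 2)
	frame, err = strconv.Atoi(parts[0])
	if err != nil {
		return 0, 0, err
	}
	if len(parts) == 1 {
		return frame, -1, nil // no argument size given
	}
	args, err = strconv.Atoi(parts[1])
	if err != nil {
		return 0, 0, err
	}
	return frame, args, nil
}

func main() {
	frame, args, err := parseFrame("$24-8")
	fmt.Println(frame, args, err) // frame is 24 and args is 8, not 16
}
</pre>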
				218
				219	<p>
				220	Note that the symbol name uses a middle dot to separate the components and is specified as an offset from the
				221	static base pseudo-register <code>SB</code>.
				222	This function would be called from Go source for package <code>runtime</code> using the
				223	simple name <code>profileloop</code>.
				224	</p>
				225
				226	<p>
Russ Cox	202bf8d	2014-10-28 15:51:06 -0400	[diff] [blame]	227	Global data symbols are defined by a sequence of initializing
				228	<code>DATA</code> directives followed by a <code>GLOBL</code> directive.
				229	Each <code>DATA</code> directive initializes a section of the
				230	corresponding memory.
				231	The memory not explicitly initialized is zeroed.
				232	The general form of the <code>DATA</code> directive is
Rob Pike	2fbcb08	2013-11-12 20:04:22 -0800	[diff] [blame]	233
				234	<pre>
Russ Cox	202bf8d	2014-10-28 15:51:06 -0400	[diff] [blame]	235	DATA symbol+offset(SB)/width, value
Rob Pike	2fbcb08	2013-11-12 20:04:22 -0800	[diff] [blame]	236	</pre>
				237
				238	<p>
Russ Cox	202bf8d	2014-10-28 15:51:06 -0400	[diff] [blame]	239	which initializes the symbol memory at the given offset and width with the given value.
				240	The <code>DATA</code> directives for a given symbol must be written with increasing offsets.
Rob Pike	2fbcb08	2013-11-12 20:04:22 -0800	[diff] [blame]	241	</p>
				242
				243	<p>
				244	The <code>GLOBL</code> directive declares a symbol to be global.
				245	The arguments are optional flags and the size of the data being declared as a global,
				246	which will have initial value all zeros unless a <code>DATA</code> directive
				247	has initialized it.
				248	The <code>GLOBL</code> directive must follow any corresponding <code>DATA</code> directives.
Russ Cox	202bf8d	2014-10-28 15:51:06 -0400	[diff] [blame]	249	</p>
				250
				251	<p>
				252	For example,
Rob Pike	2fbcb08	2013-11-12 20:04:22 -0800	[diff] [blame]	253	</p>
				254
				255	<pre>
Russ Cox	202bf8d	2014-10-28 15:51:06 -0400	[diff] [blame]	256	DATA divtab<>+0x00(SB)/4, $0xf4f8fcff
				257	DATA divtab<>+0x04(SB)/4, $0xe6eaedf0
				258	...
				259	DATA divtab<>+0x3c(SB)/4, $0x81828384
				260	GLOBL divtab<>(SB), RODATA, $64
				261
				262	GLOBL runtime·tlsoffset(SB), NOPTR, $4
Rob Pike	2fbcb08	2013-11-12 20:04:22 -0800	[diff] [blame]	263	</pre>
				264
				265	<p>
Russ Cox	202bf8d	2014-10-28 15:51:06 -0400	[diff] [blame]	266	declares and initializes <code>divtab&lt;&gt;</code>, a read-only 64-byte table of 4-byte integer values,
				267	and declares <code>runtime·tlsoffset</code>, a 4-byte, implicitly zeroed variable that
				268	contains no pointers.
Rob Pike	2fbcb08	2013-11-12 20:04:22 -0800	[diff] [blame]	269	</p>
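
<p>
One way to picture what the directives produce is to build the symbol's initial bytes by hand.
The Go sketch below does that for the three <code>DATA</code> lines shown above;
the <code>dataDirective</code> type and <code>buildImage</code> function are hypothetical,
only 4-byte widths are handled, and little-endian byte order is assumed (as on x86).
</p>

<pre>
package main

import (
	"encoding/binary"
	"fmt"
)

// dataDirective is a hypothetical stand-in for one 4-byte DATA directive.
type dataDirective struct {
	offset int
	value  uint32
}

// buildImage returns the initial contents of a symbol of the given size.
func buildImage(size int, data []dataDirective) []byte {
	img := make([]byte, size) // memory not explicitly initialized stays zero
	for _, d := range data {
		// Assumption: little-endian byte order.
		binary.LittleEndian.PutUint32(img[d.offset:d.offset+4], d.value)
	}
	return img
}

func main() {
	img := buildImage(64, []dataDirective{
		{0x00, 0xf4f8fcff},
		{0x04, 0xe6eaedf0},
		{0x3c, 0x81828384},
	})
	fmt.Printf("% x\n", img[:12])
}
</pre>

<p>
The first eight bytes printed hold the two initialized words, and the next four are zero
because this sketch never writes them (the table above elides its middle entries).
</p>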
				270
				271	<p>
				272	There may be one or two arguments to the directives.
				273	If there are two, the first is a bit mask of flags,
				274	which can be written as numeric expressions, added or or-ed together,
				275	or can be set symbolically for easier absorption by a human.
Rob Pike	8bca148	2014-08-12 17:04:45 -0700	[diff] [blame]	276	Their values, defined in the standard <code>#include</code> file <code>textflag.h</code>, are:
Rob Pike	2fbcb08	2013-11-12 20:04:22 -0800	[diff] [blame]	277	</p>
				278
				279	<ul>
				280	<li>
				281	<code>NOPROF</code> = 1
				282	<br>
				283	(For <code>TEXT</code> items.)
				284	Don't profile the marked function. This flag is deprecated.
				285	</li>
				286	<li>
				287	<code>DUPOK</code> = 2
				288	<br>
				289	It is legal to have multiple instances of this symbol in a single binary.
				290	The linker will choose one of the duplicates to use.
				291	</li>
				292	<li>
				293	<code>NOSPLIT</code> = 4
				294	<br>
				295	(For <code>TEXT</code> items.)
				296	Don't insert the preamble to check if the stack must be split.
				297	The frame for the routine, plus anything it calls, must fit in the
				298	spare space at the top of the stack segment.
				299	Used to protect routines such as the stack splitting code itself.
				300	</li>
				301	<li>
				302	<code>RODATA</code> = 8
				303	<br>
				304	(For <code>DATA</code> and <code>GLOBL</code> items.)
				305	Put this data in a read-only section.
				306	</li>
				307	<li>
				308	<code>NOPTR</code> = 16
				309	<br>
				310	(For <code>DATA</code> and <code>GLOBL</code> items.)
				311	This data contains no pointers and therefore does not need to be
				312	scanned by the garbage collector.
				313	</li>
				314	<li>
				315	<code>WRAPPER</code> = 32
				316	<br>
				317	(For <code>TEXT</code> items.)
				318	This is a wrapper function and should not count as disabling <code>recover</code>.
				319	</li>
				320	</ul>
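
<p>
Since the flags form a bit mask, a numeric value and a symbolic expression can name the same set.
The following Go sketch decodes a mask back into names using the values listed above.
The <code>describe</code> function and the <code>flagNames</code> table are hypothetical helpers;
<code>textflag.h</code> remains the authoritative source for the values.
</p>

<pre>
package main

import (
	"fmt"
	"strings"
)

// Flag values as listed above.
const (
	NOPROF  = 1
	DUPOK   = 2
	NOSPLIT = 4
	RODATA  = 8
	NOPTR   = 16
	WRAPPER = 32
)

var flagNames = []struct {
	bit  int
	name string
}{
	{NOPROF, "NOPROF"},
	{DUPOK, "DUPOK"},
	{NOSPLIT, "NOSPLIT"},
	{RODATA, "RODATA"},
	{NOPTR, "NOPTR"},
	{WRAPPER, "WRAPPER"},
}

// describe lists the symbolic names of the flags set in mask.
func describe(mask int) string {
	var set []string
	for _, f := range flagNames {
		// OR-ing in a bit that is already set leaves the mask unchanged.
		if mask|f.bit == mask {
			set = append(set, f.name)
		}
	}
	return strings.Join(set, "|")
}

func main() {
	fmt.Println(describe(36))             // the numeric form of NOSPLIT|WRAPPER
	fmt.Println(describe(RODATA + NOPTR)) // adding distinct flags equals or-ing them
}
</pre>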
				321
Russ Cox	202bf8d	2014-10-28 15:51:06 -0400	[diff] [blame]	322	<h3 id="runtime">Runtime Coordination</h3>
				323
				324	<p>
				325	For garbage collection to run correctly, the runtime must know the
				326	location of pointers in all global data and in most stack frames.
				327	The Go compiler emits this information when compiling Go source files,
				328	but assembly programs must define it explicitly.
				329	</p>
				330
				331	<p>
				332	A data symbol marked with the <code>NOPTR</code> flag (see above)
				333	is treated as containing no pointers to runtime-allocated data.
				334	A data symbol with the <code>RODATA</code> flag
				335	is allocated in read-only memory and is therefore treated
				336	as implicitly marked <code>NOPTR</code>.
				337	A data symbol with a total size smaller than a pointer
				338	is also treated as implicitly marked <code>NOPTR</code>.
				339	It is not possible to define a symbol containing pointers in an assembly source file;
				340	such a symbol must be defined in a Go source file instead.
				341	Assembly source can still refer to the symbol by name
				342	even without <code>DATA</code> and <code>GLOBL</code> directives.
				343	A good general rule of thumb is to define all non-<code>RODATA</code>
				344	symbols in Go instead of in assembly.
				345	</p>
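
<p>
The three cases in which a data symbol is treated as pointer-free can be summarized as a predicate.
This Go sketch is only a restatement of the paragraph above;
<code>treatedAsNoPtr</code> is a hypothetical name and the flag values are copied from the list in the Directives section.
</p>

<pre>
package main

import "fmt"

// Flag values as listed in the Directives section.
const (
	RODATA = 8
	NOPTR  = 16
)

// treatedAsNoPtr reports whether a data symbol defined in assembly is
// treated as containing no pointers. ptrSize is the pointer size in bytes.
func treatedAsNoPtr(flags, size, ptrSize int) bool {
	if flags|NOPTR == flags {
		return true // marked explicitly
	}
	if flags|RODATA == flags {
		return true // read-only data is implicitly NOPTR
	}
	return ptrSize > size // too small to hold a pointer
}

func main() {
	fmt.Println(treatedAsNoPtr(RODATA, 64, 8)) // a read-only 64-byte table
	fmt.Println(treatedAsNoPtr(NOPTR, 4, 8))   // a 4-byte variable marked NOPTR
	fmt.Println(treatedAsNoPtr(0, 16, 8))      // false: define such a symbol in Go instead
}
</pre>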
				346
				347	<p>
				348	Each function also needs annotations giving the location of
				349	live pointers in its arguments, results, and local stack frame.
				350	For an assembly function with no pointer results and
				351	either no local stack frame or no function calls,
				352	the only requirement is to define a Go prototype for the function
				353	in a Go source file in the same package.
				354	For more complex situations, explicit annotation is needed.
				355	These annotations use pseudo-instructions defined in the standard
				356	<code>#include</code> file <code>funcdata.h</code>.
				357	</p>
				358
				359	<p>
				360	If a function has no arguments and no results,
				361	the pointer information can be omitted.
				362	This is indicated by an argument size annotation of <code>$<i>n</i>-0</code>
				363	on the <code>TEXT</code> instruction.
				364	Otherwise, pointer information must be provided by
				365	a Go prototype for the function in a Go source file,
				366	even for assembly functions not called directly from Go.
				367	(The prototype will also let <code>go</code> <code>vet</code> check the argument references.)
				368	At the start of the function, the arguments are assumed
				369	to be initialized but the results are assumed uninitialized.
				370	If the results will hold live pointers during a call instruction,
				371	the function should start by zeroing the results and then
				372	executing the pseudo-instruction <code>GO_RESULTS_INITIALIZED</code>.
				373	This instruction records that the results are now initialized
				374	and should be scanned during stack movement and garbage collection.
				375	It is typically easier to arrange that assembly functions do not
				376	return pointers or do not contain call instructions;
				377	no assembly functions in the standard library use
				378	<code>GO_RESULTS_INITIALIZED</code>.
				379	</p>
				380
				381	<p>
				382	If a function has no local stack frame,
				383	the pointer information can be omitted.
				384	This is indicated by a local frame size annotation of <code>$0-<i>n</i></code>
				385	on the <code>TEXT</code> instruction.
				386	The pointer information can also be omitted if the
				387	function contains no call instructions.
				388	Otherwise, the local stack frame must not contain pointers,
				389	and the assembly must confirm this fact by executing the
				390	pseudo-instruction <code>NO_LOCAL_POINTERS</code>.
				391	Because stack resizing is implemented by moving the stack,
				392	the stack pointer may change during any function call:
				393	even pointers to stack data must not be kept in local variables.
				394	</p>
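
<p>
Read as a decision rule, the paragraph above says that <code>NO_LOCAL_POINTERS</code> is needed
only when a function has both a local frame and at least one call instruction.
A minimal Go sketch of that rule follows; the function name is hypothetical.
</p>

<pre>
package main

import "fmt"

// needsNoLocalPointers reports whether an assembly function must execute
// NO_LOCAL_POINTERS, following the rules in the paragraph above.
func needsNoLocalPointers(frameSize int, hasCalls bool) bool {
	if frameSize == 0 {
		return false // no local stack frame
	}
	return hasCalls // a frame matters only across call instructions
}

func main() {
	fmt.Println(needsNoLocalPointers(0, true))   // false
	fmt.Println(needsNoLocalPointers(16, false)) // false
	fmt.Println(needsNoLocalPointers(16, true))  // true
}
</pre>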
				395
Rob Pike	2fbcb08	2013-11-12 20:04:22 -0800	[diff] [blame]	396	<h2 id="architectures">Architecture-specific details</h2>
				397
				398	<p>
				399	It is impractical to list all the instructions and other details for each machine.
				400	To see what instructions are defined for a given machine, say 32-bit Intel x86,
				401	look in the top-level header file for the corresponding linker, in this case <code>8l</code>.
				402	That is, the file <code>$GOROOT/src/cmd/8l/8.out.h</code> contains a C enumeration, called <code>as</code>,
				403	of the instructions and their spellings as known to the assembler and linker for that architecture.
				404	In that file you'll find a declaration that begins
				405	</p>
				406
				407	<pre>
				408	enum as
				409	{
				410		AXXX,
				411		AAAA,
				412		AAAD,
				413		AAAM,
				414		AAAS,
				415		AADCB,
				416	...
				417	</pre>
				418
				419	<p>
				420	Each instruction begins with a initial capital <code>A</code> in this list, so <code>AADCB</code>
				421	represents the <code>ADCB</code> (add carry byte) instruction.
				422	The enumeration is in alphabetical order, plus some late additions (<code>AXXX</code> occupies
				423	the zero slot as an invalid instruction).
				424	The sequence has nothing to do with the actual encoding of the machine instructions.
				425	Again, the linker takes care of that detail.
				426	</p>
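
<p>
The naming rule is mechanical enough to express in one line of Go.
The <code>mnemonic</code> helper below is hypothetical; it simply drops the leading <code>A</code>
from an enumeration name to recover the spelling used in assembly source.
</p>

<pre>
package main

import (
	"fmt"
	"strings"
)

// mnemonic converts an enumeration name such as "AADCB" to the
// instruction spelling used in assembly source by dropping one leading "A".
func mnemonic(enumName string) string {
	return strings.TrimPrefix(enumName, "A")
}

func main() {
	for _, name := range []string{"AAAA", "AAAD", "AADCB"} {
		fmt.Println(name, "is written", mnemonic(name))
	}
}
</pre>

<p>
Only the first <code>A</code> is removed, so <code>AAAA</code> corresponds to the <code>AAA</code> instruction.
</p>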
				427
				428	<p>
				429	One detail evident in the examples from the previous sections is that data in the instructions flows from left to right:
				430	<code>MOVQ</code> <code>$0,</code> <code>CX</code> clears <code>CX</code>.
				431	This convention applies even on architectures where the usual mode is the opposite direction.
				432	</p>
				433
				434	<p>
				435	Here follow some descriptions of key Go-specific details for the supported architectures.
				436	</p>
				437
				438	<h3 id="x86">32-bit Intel 386</h3>
				439
				440	<p>
Russ Cox	89f185f	2014-06-26 11:54:39 -0400	[diff] [blame]	441	The runtime pointer to the <code>g</code> structure is maintained
Rob Pike	2fbcb08	2013-11-12 20:04:22 -0800	[diff] [blame]	442	through the value of an otherwise unused (as far as Go is concerned) register in the MMU.
				443	An OS-dependent macro <code>get_tls</code> is defined for the assembler if the source includes
				444	an architecture-dependent header file, like this:
				445	</p>
				446
				447	<pre>
				448	#include "zasm_GOOS_GOARCH.h"
				449	</pre>
				450
				451	<p>
				452	Within the runtime, the <code>get_tls</code> macro loads its argument register
Russ Cox	89f185f	2014-06-26 11:54:39 -0400	[diff] [blame]	453	with a pointer to the <code>g</code> pointer, and the <code>g</code> struct
				454	contains the <code>m</code> pointer.
Rob Pike	2fbcb08	2013-11-12 20:04:22 -0800	[diff] [blame]	455	The sequence to load <code>g</code> and <code>m</code> using <code>CX</code> looks like this:
				456	</p>
				457
				458	<pre>
				459	get_tls(CX)
Russ Cox	89f185f	2014-06-26 11:54:39 -0400	[diff] [blame]	460		MOVL g(CX), AX // Move g into AX.
				461		MOVL g_m(AX), BX // Move g->m into BX.
Rob Pike	2fbcb08	2013-11-12 20:04:22 -0800	[diff] [blame]	462	</pre>
				463
				464	<h3 id="amd64">64-bit Intel 386 (a.k.a. amd64)</h3>
				465
				466	<p>
				467	The assembly code to access the <code>m</code> and <code>g</code>
				468	pointers is the same as on the 386, except it uses <code>MOVQ</code> rather than
				469	<code>MOVL</code>:
				470	</p>
				471
				472	<pre>
				473	get_tls(CX)
Russ Cox	89f185f	2014-06-26 11:54:39 -0400	[diff] [blame]	474		MOVQ g(CX), AX // Move g into AX.
				475		MOVQ g_m(AX), BX // Move g->m into BX.
Rob Pike	2fbcb08	2013-11-12 20:04:22 -0800	[diff] [blame]	476	</pre>
				477
				478	<h3 id="arm">ARM</h3>
				479
				480	<p>
Russ Cox	89f185f	2014-06-26 11:54:39 -0400	[diff] [blame]	481	The registers <code>R10</code> and <code>R11</code>
Russ Cox	a664b49	2013-11-13 21:29:34 -0500	[diff] [blame]	482	are reserved by the compiler and linker.
				483	</p>
				484
				485	<p>
Russ Cox	89f185f	2014-06-26 11:54:39 -0400	[diff] [blame]	486	<code>R10</code> points to the <code>g</code> (goroutine) structure.
				487	Within assembler source code, this pointer must be referred to as <code>g</code>;
				488	the name <code>R10</code> is not recognized.
Russ Cox	a664b49	2013-11-13 21:29:34 -0500	[diff] [blame]	489	</p>
				490
				491	<p>
				492	To make it easier for people and compilers to write assembly, the ARM linker
				493	allows general addressing forms and pseudo-operations like <code>DIV</code> or <code>MOD</code>
				494	that may not be expressible using a single hardware instruction.
				495	It implements these forms as multiple instructions, often using the <code>R11</code> register
				496	to hold temporary values.
				497	Hand-written assembly can use <code>R11</code>, but doing so requires
				498	being sure that the linker is not also using it to implement any of the other
				499	instructions in the function.
Rob Pike	2fbcb08	2013-11-12 20:04:22 -0800	[diff] [blame]	500	</p>
				501
				502	<p>
				503	When defining a <code>TEXT</code>, specifying frame size <code>$-4</code>
				504	tells the linker that this is a leaf function that does not need to save <code>LR</code> on entry.
				505	</p>
				506
Russ Cox	a664b49	2013-11-13 21:29:34 -0500	[diff] [blame]	507	<p>
				508	The name <code>SP</code> always refers to the virtual stack pointer described earlier.
				509	For the hardware register, use <code>R13</code>.
				510	</p>
Rob Pike	2fbcb08	2013-11-12 20:04:22 -0800	[diff] [blame]	511
				512	<h3 id="unsupported_opcodes">Unsupported opcodes</h3>
				513
				514	<p>
				515	The assemblers are designed to support the compiler, so not all hardware instructions
				516	are defined for all architectures: if the compiler doesn't generate it, it might not be there.
				517	If you need to use a missing instruction, there are two ways to proceed.
				518	One is to update the assembler to support that instruction, which is straightforward
				519	but only worthwhile if it's likely the instruction will be used again.
				520	Instead, for simple one-off cases, it's possible to use the <code>BYTE</code>
				521	and <code>WORD</code> directives
				522	to lay down explicit data into the instruction stream within a <code>TEXT</code>.
				523	Here's how the 386 runtime defines the 64-bit atomic load function.
				524	</p>
				525
				526	<pre>
				527	// uint64 atomicload64(uint64 volatile* addr);
				528	// so actually
				529	// void atomicload64(uint64 *res, uint64 volatile *addr);
				530	TEXT runtime·atomicload64(SB), NOSPLIT, $0-8
Russ Cox	202bf8d	2014-10-28 15:51:06 -0400	[diff] [blame]	531		MOVL ptr+0(FP), AX
				532		LEAL ret_lo+4(FP), BX
				533		BYTE $0x0f; BYTE $0x6f; BYTE $0x00 // MOVQ (%EAX), %MM0
				534		BYTE $0x0f; BYTE $0x7f; BYTE $0x03 // MOVQ %MM0, 0(%EBX)
				535		BYTE $0x0F; BYTE $0x77 // EMMS
Rob Pike	2fbcb08	2013-11-12 20:04:22 -0800	[diff] [blame]	536		RET
				537	</pre>