|  | // Copyright 2019 The Go Authors. All rights reserved. | 
|  | // Use of this source code is governed by a BSD-style | 
|  | // license that can be found in the LICENSE file. | 
|  |  | 
|  | /* | 
|  | Package ppc64 implements a PPC64 assembler that assembles Go asm into | 
|  | the corresponding PPC64 binary instructions as defined by the Power | 
|  | ISA. Since POWER8 is the minimum instruction set used by GOARCHes ppc64le | 
|  | and ppc64, refer to ISA 2.07B or later for details. | 
|  |  | 
|  | This document provides information on how to write code in Go assembler | 
|  | for PPC64, focusing on the differences between Go and PPC64 assembly language. | 
|  | It assumes some knowledge of PPC64 assembler. The original implementation of | 
|  | PPC64 in Go defined many opcodes that are different from PPC64 opcodes, but | 
|  | updates to the Go assembly language used mnemonics that are mostly similar if not | 
|  | identical to the PPC64 mneumonics, such as VMX and VSX instructions. Not all detail | 
|  | is included here; refer to the Power ISA document if interested in more detail. | 
|  |  | 
|  | In the examples below, the Go assembly is on the left, PPC64 assembly on the right. | 
|  |  | 
|  | 1. Operand ordering | 
|  |  | 
|  | In Go asm, the last operand (right) is the target operand, but with PPC64 asm, | 
|  | the first operand (left) is the target. In general, the remaining operands are | 
|  | in the same order except in a few special cases, especially those with 4 operands. | 
|  |  | 
|  | Example: | 
|  | ADD R3, R4, R5		<=>	add r5, r3, r4 | 
|  |  | 
|  | 2. Constant operands | 
|  |  | 
|  | In Go asm, an operand that starts with '$' indicates a constant value. If the | 
|  | instruction using the constant has an immediate version of the opcode, then an | 
|  | immediate value is used with the opcode if possible. | 
|  |  | 
|  | Example: | 
|  | ADD $1, R3, R4		<=> 	addi r4, r3, 1 | 
|  |  | 
|  | 3. Opcodes setting condition codes | 
|  |  | 
|  | In PPC64 asm, some instructions other than compares have variations that can set | 
|  | the condition code where meaningful. This is indicated by adding '.' to the end | 
|  | of the PPC64 instruction. In Go asm, these instructions have 'CC' at the end of | 
|  | the opcode. The possible settings of the condition code depend on the instruction. | 
|  | CR0 is the default for fixed-point instructions; CR1 for floating point; CR6 for | 
|  | vector instructions. | 
|  |  | 
|  | Example: | 
|  | ANDCC R3, R4, R5		<=>	and. r5, r3, r4 (set CR0) | 
|  |  | 
|  | 4. Loads and stores from memory | 
|  |  | 
|  | In Go asm, opcodes starting with 'MOV' indicate a load or store. When the target | 
|  | is a memory reference, then it is a store; when the target is a register and the | 
|  | source is a memory reference, then it is a load. | 
|  |  | 
|  | MOV{B,H,W,D} variations identify the size as byte, halfword, word, doubleword. | 
|  |  | 
|  | Adding 'Z' to the opcode for a load indicates zero extend; if omitted it is sign extend. | 
|  | Adding 'U' to a load or store indicates an update of the base register with the offset. | 
|  | Adding 'BR' to an opcode indicates byte-reversed load or store, or the order opposite | 
|  | of the expected endian order. If 'BR' is used then zero extend is assumed. | 
|  |  | 
|  | Memory references n(Ra) indicate the address in Ra + n. When used with an update form | 
|  | of an opcode, the value in Ra is incremented by n. | 
|  |  | 
|  | Memory references (Ra+Rb) or (Ra)(Rb) indicate the address Ra + Rb, used by indexed | 
|  | loads or stores. Both forms are accepted. When used with an update then the base register | 
|  | is updated by the value in the index register. | 
|  |  | 
|  | Examples: | 
|  | MOVD (R3), R4		<=>	ld r4,0(r3) | 
|  | MOVW (R3), R4		<=>	lwa r4,0(r3) | 
|  | MOVWZU 4(R3), R4		<=>	lwzu r4,4(r3) | 
|  | MOVWZ (R3+R5), R4		<=>	lwzx r4,r3,r5 | 
|  | MOVHZ  (R3), R4		<=>	lhz r4,0(r3) | 
|  | MOVHU 2(R3), R4		<=>	lhau r4,2(r3) | 
|  | MOVBZ (R3), R4		<=>	lbz r4,0(r3) | 
|  |  | 
|  | MOVD R4,(R3)		<=>	std r4,0(r3) | 
|  | MOVW R4,(R3)		<=>	stw r4,0(r3) | 
|  | MOVW R4,(R3+R5)		<=>	stwx r4,r3,r5 | 
|  | MOVWU R4,4(R3)		<=>	stwu r4,4(r3) | 
|  | MOVH R4,2(R3)		<=>	sth r4,2(r3) | 
|  | MOVBU R4,(R3)(R5)		<=>	stbux r4,r3,r5 | 
|  |  | 
|  | 4. Compares | 
|  |  | 
|  | When an instruction does a compare or other operation that might | 
|  | result in a condition code, then the resulting condition is set | 
|  | in a field of the condition register. The condition register consists | 
|  | of 8 4-bit fields named CR0 - CR7. When a compare instruction | 
|  | identifies a CR then the resulting condition is set in that field | 
|  | to be read by a later branch or isel instruction. Within these fields, | 
|  | bits are set to indicate less than, greater than, or equal conditions. | 
|  |  | 
|  | Once an instruction sets a condition, then a subsequent branch, isel or | 
|  | other instruction can read the condition field and operate based on the | 
|  | bit settings. | 
|  |  | 
|  | Examples: | 
|  | CMP R3, R4			<=>	cmp r3, r4	(CR0 assumed) | 
|  | CMP R3, R4, CR1		<=>	cmp cr1, r3, r4 | 
|  |  | 
|  | Note that the condition register is the target operand of compare opcodes, so | 
|  | the remaining operands are in the same order for Go asm and PPC64 asm. | 
|  | When CR0 is used then it is implicit and does not need to be specified. | 
|  |  | 
|  | 5. Branches | 
|  |  | 
|  | Many branches are represented as a form of the BC instruction. There are | 
|  | other extended opcodes to make it easier to see what type of branch is being | 
|  | used. | 
|  |  | 
|  | The following is a brief description of the BC instruction and its commonly | 
|  | used operands. | 
|  |  | 
|  | BC op1, op2, op3 | 
|  |  | 
|  | op1: type of branch | 
|  | 16 -> bctr (branch on ctr) | 
|  | 12 -> bcr  (branch if cr bit is set) | 
|  | 8  -> bcr+bctr (branch on ctr and cr values) | 
|  | 4  -> bcr != 0 (branch if specified cr bit is not set) | 
|  |  | 
|  | There are more combinations but these are the most common. | 
|  |  | 
|  | op2: condition register field and condition bit | 
|  |  | 
|  | This contains an immediate value indicating which condition field | 
|  | to read and what bits to test. Each field is 4 bits long with CR0 | 
|  | at bit 0, CR1 at bit 4, etc. The value is computed as 4*CR+condition | 
|  | with these condition values: | 
|  |  | 
|  | 0 -> LT | 
|  | 1 -> GT | 
|  | 2 -> EQ | 
|  | 3 -> OVG | 
|  |  | 
|  | Thus 0 means test CR0 for LT, 5 means CR1 for GT, 30 means CR7 for EQ. | 
|  |  | 
|  | op3: branch target | 
|  |  | 
|  | Examples: | 
|  |  | 
|  | BC 12, 0, target		<=>	blt cr0, target | 
|  | BC 12, 2, target		<=>	beq cr0, target | 
|  | BC 12, 5, target		<=>	bgt cr1, target | 
|  | BC 12, 30, target		<=>	beq cr7, target | 
|  | BC 4, 6, target		<=>	bne cr1, target | 
|  | BC 4, 1, target		<=>	ble cr1, target | 
|  |  | 
|  | The following extended opcodes are available for ease of use and readability: | 
|  |  | 
|  | BNE CR2, target		<=>	bne cr2, target | 
|  | BEQ CR4, target		<=>	beq cr4, target | 
|  | BLT target			<=>	blt target (cr0 default) | 
|  | BGE CR7, target		<=>	bge cr7, target | 
|  |  | 
|  | Refer to the ISA for more information on additional values for the BC instruction, | 
|  | how to handle OVG information, and much more. | 
|  |  | 
|  | 5. Align directive | 
|  |  | 
|  | Starting with Go 1.12, Go asm supports the PCALIGN directive, which indicates | 
|  | that the next instruction should be aligned to the specified value. Currently | 
|  | 8 and 16 are the only supported values, and a maximum of 2 NOPs will be added | 
|  | to align the code. That means in the case where the code is aligned to 4 but | 
|  | PCALIGN $16 is at that location, the code will only be aligned to 8 to avoid | 
|  | adding 3 NOPs. | 
|  |  | 
|  | The purpose of this directive is to improve performance for cases like loops | 
|  | where better alignment (8 or 16 instead of 4) might be helpful. This directive | 
|  | exists in PPC64 assembler and is frequently used by PPC64 assembler writers. | 
|  |  | 
|  | PCALIGN $16 | 
|  | PCALIGN $8 | 
|  |  | 
|  | Functions in Go are aligned to 16 bytes, as is the case in all other compilers | 
|  | for PPC64. | 
|  |  | 
|  | Register naming | 
|  |  | 
|  | 1. Special register usage in Go asm | 
|  |  | 
|  | The following registers should not be modified by user Go assembler code. | 
|  |  | 
|  | R0: Go code expects this register to contain the value 0. | 
|  | R1: Stack pointer | 
|  | R2: TOC pointer when compiled with -shared or -dynlink (a.k.a position independent code) | 
|  | R13: TLS pointer | 
|  | R30: g (goroutine) | 
|  |  | 
|  | Register names: | 
|  |  | 
|  | Rn is used for general purpose registers. (0-31) | 
|  | Fn is used for floating point registers. (0-31) | 
|  | Vn is used for vector registers. Slot 0 of Vn overlaps with Fn. (0-31) | 
|  | VSn is used for vector-scalar registers. V0-V31 overlap with VS32-VS63. (0-63) | 
|  | CTR represents the count register. | 
|  | LR represents the link register. | 
|  |  | 
|  | */ | 
|  | package ppc64 |