Author(s): Austin Clements
Last updated: 2019-01-14
Discussion at https://golang.org/issue/27539.
Go's current calling convention interferes with several significant optimizations, such as register passing (a potential 5% win). Despite the obvious appeal of these optimizations, we've encountered significant roadblocks to their implementation. While Go's calling convention isn't covered by the Go 1 compatibility promise, it's impossible to write Go assembly code without depending on it, and there are many important packages that use Go assembly. As a result, much of Go's calling convention is effectively public and must be maintained in a backwards-compatible way.
We propose a way forward based on having multiple calling conventions. We propose maintaining the existing calling convention and introducing a new, private calling convention that is explicitly not backwards-compatible and not accessible to assembly code, with a mechanism to keep different calling conventions transparently interoperable. This same mechanism can be used to introduce other public, stable calling conventions in the future, but the details of that are outside the scope of this proposal.
This proposal is not about any specific new calling convention. It's about enabling new calling conventions to work in the existing Go ecosystem. This is one step in a longer-term plan.
Language environments depend on application binary interfaces (ABIs) to define the machine-level conventions for operating within that environment. One key aspect of an ABI is the calling convention, which defines how function calls in the language operate at a machine-code level.
Go's calling convention specifies how functions pass argument values and results (on the stack), which registers have fixed functions (e.g., R10 on ARM is the “g” register) or may be clobbered by a call (all non-fixed function registers), and how to interact with stack growth, the scheduler, and the garbage collector.
Go's calling convention as of Go 1.11 is simple and nearly universal across platforms, but also inefficient and inflexible. It is rife with opportunities for improving performance. For example, experiments with passing arguments and results in registers suggest a 5% performance win. Propagating register clobbers up the call graph could avoid unnecessary stack spills. Keeping the stack bound in a fixed register could eliminate two dependent memory loads on every function entry on x86. Passing dynamic allocation scopes could reduce heap allocations.
And yet, even though the calling convention is invisible to Go programs, almost every substantive change we've attempted has been stymied because changes break existing Go assembly code. While there's relatively little Go assembly (roughly 170 kLOC in public GitHub repositories*), it tends to lie at the heart of important packages like crypto and numerical libraries.
This proposal operates within two key constraints:
We can't break existing assembly code, even though it isn't technically covered by Go 1 compatibility. There's too much of it and it's too important. Hence, we can't change the calling convention used by existing assembly code.
We can't depend on a transition period after which existing assembly would break. Too much code simply doesn't get updated, or if it does, it doesn't get re-vendored. Hence, it's not enough to give people a transition path to a new calling convention and some time. Existing code must continue to work.
This proposal resolves this tension by introducing multiple calling conventions. Initially, we propose two: one is stable, documented, and codifies the rules of the current calling convention; the other is unstable, internal, and may change from release to release.
* This counts non-comment, non-whitespace lines of code in unique files. It excludes vendored source and source with a “Go Authors” copyright notice.
We propose introducing a second calling convention.
ABI0 is the current calling convention, which passes arguments and results on the stack, clobbers all registers on calls, and has a few platform-dependent fixed registers.
ABIInternal is unstable and may change from release to release. Initially, it will be identical to ABI0, but ABIInternal opens the door for changes.
Once we're happy with ABIInternal, we may “snapshot” it as a new stable ABI1, allowing assembly code to be written against the presumably faster, new calling convention. This would not eliminate ABIInternal, as ABIInternal could later diverge from ABI1, though ABI1 and ABIInternal may be identical for some time.
A text symbol can provide different definitions for different ABIs. One of these will be the “native” implementation (ABIInternal for functions defined in Go and ABI0 for functions defined in assembly), while the others will be “ABI wrappers” that simply translate to the ABI of the native implementation and call it. In the linker, each symbol is already identified with a (name, version) pair. The implementation will simply map ABIs to linker symbol versions.
All functions defined in Go will be natively ABIInternal, and the Go compiler will assume all functions provide an ABIInternal implementation. Hence, all cross-package calls and all indirect calls (closure calls and interface method calls) will use ABIInternal. If the native implementation of the called function is ABI0, this will call a wrapper, which will call the ABI0 implementation. For direct calls, if the compiler knows the target is a native ABI0 function, it can optimize that call to use ABI0 directly, but this is strictly an optimization.
All functions defined in assembly will be natively ABI0, and all references to text symbols from assembly will use the ABI0 definition. To introduce another stable ABI in the future, we would extend the assembly symbol syntax with a way to specify the ABI, but ABI0 must be assumed for all unqualified symbols for backwards compatibility.
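The rules above can be sketched in a few lines of Go. This is only an illustrative model, not toolchain code: the ABI type, its numeric values, SymKey, and the helper functions are all hypothetical names chosen for this sketch.

```go
package main

import "fmt"

// ABI identifies a calling convention. The numeric values are hypothetical
// stand-ins for linker symbol versions.
type ABI int

const (
	ABI0 ABI = iota
	ABIInternal
)

func (a ABI) String() string {
	if a == ABI0 {
		return "ABI0"
	}
	return "ABIInternal"
}

// SymKey mirrors the linker's (name, version) pair, using the ABI as the version.
type SymKey struct {
	Name string
	ABI  ABI
}

// NativeABI returns the ABI of a function's native implementation:
// ABI0 for assembly definitions, ABIInternal for Go definitions.
func NativeABI(definedInAsm bool) ABI {
	if definedInAsm {
		return ABI0
	}
	return ABIInternal
}

// RefABI returns the ABI a reference uses. Assembly always references ABI0.
// Go references ABIInternal, except for the optional optimization where a
// direct call targets a function known to be natively ABI0.
func RefABI(fromAsm, directToKnownABI0 bool) ABI {
	if fromAsm || directToKnownABI0 {
		return ABI0
	}
	return ABIInternal
}

// NeedsWrapper reports whether a reference must go through an ABI wrapper.
func NeedsWrapper(ref, native ABI) bool {
	return ref != native
}

func main() {
	// An indirect call from Go to a function implemented in assembly.
	key := SymKey{Name: "pkg.asmFunc", ABI: RefABI(false, false)}
	fmt.Println(key.Name, key.ABI, NeedsWrapper(key.ABI, NativeABI(true)))
}
```

In this model a wrapper is needed exactly when the referencing ABI differs from the native ABI, which is why the direct-call case is purely an optimization.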
In order to transparently bridge the two (or more) ABIs, we will extend the assembler with a mode to scan for all text symbol definitions and references in assembly code, and report these to the compiler. When these symbols are referenced or defined, respectively, from Go code in the same package, the compiler will use the type information available in Go declarations and function stubs to produce the necessary ABI wrapper definitions.
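As a rough sketch of that decision, assuming the assembler's report is available as two sets of names, the compiler's choice of wrappers for one package could look like the following. SymABIs, Wrapper, and Wrappers are hypothetical names for illustration only.

```go
package main

import (
	"fmt"
	"sort"
)

// SymABIs is a hypothetical in-memory form of what the assembler reports:
// text symbols defined in assembly and text symbols referenced from assembly.
type SymABIs struct {
	Defs map[string]bool
	Refs map[string]bool
}

// Wrapper describes one ABI wrapper: the ABI it is called with (Entry)
// and the ABI of the native implementation it calls (Native).
type Wrapper struct {
	Name   string
	Entry  string
	Native string
}

// Wrappers returns the wrappers needed within a single package.
// goDefs lists functions with Go bodies; goRefs lists functions referenced
// from Go code in the same package.
func Wrappers(s SymABIs, goDefs, goRefs []string) []Wrapper {
	var out []Wrapper
	for _, name := range goRefs {
		if s.Defs[name] {
			// Defined in assembly (natively ABI0), referenced from Go.
			out = append(out, Wrapper{name, "ABIInternal", "ABI0"})
		}
	}
	for _, name := range goDefs {
		if s.Refs[name] {
			// Defined in Go (natively ABIInternal), referenced from assembly.
			out = append(out, Wrapper{name, "ABI0", "ABIInternal"})
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Name < out[j].Name })
	return out
}

func main() {
	s := SymABIs{
		Defs: map[string]bool{"addVecsAsm": true},
		Refs: map[string]bool{"goCallback": true},
	}
	for _, w := range Wrappers(s, []string{"goCallback", "helper"}, []string{"addVecsAsm", "helper"}) {
		fmt.Printf("%s: %s entry calling %s implementation\n", w.Name, w.Entry, w.Native)
	}
}
```

Functions that never cross the Go/assembly boundary, such as helper here, get no wrapper at all.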
The linker will check that all symbol references use the correct ABI and ultimately keep everything honest.
The above approach allows us to introduce an internal calling convention without any modifications to any safe Go code, or the vast majority of assembly-using packages. This is largely afforded by the extra build step that scans for assembly symbol definitions and references.
There are two major trade-off axes that lead to different designs.
Rather than implicitly scanning assembly code for symbol definitions and references, we could instead introduce pragma comments that users could use to explicitly inform the compiler of symbol ABIs. This would make these ABI boundaries evident in code, but would likely break many more existing packages.
In order to keep any assembly-using packages working as-is, this approach would need default rules. For example, body-less function stubs would likely need to default to ABI0. Any Go functions called from assembly would still need explicit annotations, though such calls are rare. This would cover most assembly-using packages, but function stubs are also used for Go symbols pushed across package boundaries using //go:linkname. For link-named symbols, a pragma would be necessary to undo the default ABI0 behavior, and would depend on how the target function was implemented.
Ultimately, there's no set of default rules that keeps all existing code working. Hence, this design proposes extracting symbols from assembly source to derive the correct ABIs in the vast majority of cases.
In this proposal, a single function can provide multiple entry-points for different calling conventions. One of these is the “native” implementation and the others are intended to translate the calling convention and then invoke the native implementation.
An alternative would be for each function to provide a single calling convention and require all calls to that function to follow that calling convention. Other languages use this approach, such as C (e.g., fastcall/stdcall/cdecl) and Rust (extern "C", etc). This works well for direct calls, but for direct calls it's also possible to compile away this proposal's ABI wrapper. However, it dramatically complicates indirect calls since it requires the calling convention to become part of the type. Hence, in Go, we would either have to extend the type system, or declare that only ABIInternal functions can be used in closures and interface satisfaction, both of which are less than ideal.
Using ABI wrappers has the added advantage that calls to a Go function from Go can use the fastest available ABI, while still allowing calls via the stable ABI from assembly.
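The idea of one native implementation plus a translating entry point can be illustrated by analogy in ordinary Go. This is only an analogy, assuming a slice stands in for a stack frame; sum and sumStack are hypothetical names, and real wrappers would be generated machine code rather than Go functions.

```go
package main

import "fmt"

// sum is the "native" implementation. It stands in for a function using a
// fast convention where arguments and the result are passed directly.
func sum(a, b int64) int64 {
	return a + b
}

// sumStack stands in for a wrapper entry point using a stack-based
// convention: the caller stores the arguments in frame[0] and frame[1],
// and the result is written to frame[2]. It only translates between the
// two conventions and then calls the native implementation.
func sumStack(frame []int64) {
	frame[2] = sum(frame[0], frame[1])
}

func main() {
	// A caller using the fast convention calls the native code directly.
	fmt.Println(sum(2, 3))

	// A caller using the stack-based convention goes through the wrapper.
	frame := []int64{2, 3, 0}
	sumStack(frame)
	fmt.Println(frame[2])
}
```

Both callers reach the same implementation, and neither needs to know which convention is native.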
Finally, there's flexibility in this design around when exactly to generate ABI wrappers. In the current proposal, ABI wrappers are always generated in the package where both the definition and the reference to a symbol appear. However, ABI wrappers can be generated anywhere Go type information is available.
For example, the compiler could generate an ABIInternal→ABI0 wrapper when an ABI0 function is stored in a closure or method table, regardless of which package that happens in. And the compiler could generate an ABI0→ABIInternal wrapper when it encounters an ABI0 reference from assembly by finding the function's type either in the current package or via export info from another package.
This proposed change does not affect the functioning of any safe Go code. It can affect code that goes outside the compatibility guidelines, but is designed to minimize this impact. Specifically:
Unsafe Go code can observe the calling convention, though doing so requires violating even the allowed uses of unsafe.Pointer. This does arise in the internal implementation of the runtime and in cgo, both of which will have to be adjusted when we actually change the calling convention.
Cross-package references where the definition and the reference are different ABIs may no longer link.
There are various ways to form cross-package references in Go, though all depend on //go:linkname (which is explicitly unsafe) or complicated assembly symbol naming. Specifically, the following types of cross-package references may no longer link:
In this table “push” refers to a symbol that is implemented in one package, but its symbol name places it in a different package. In Go this is accomplished with //go:linkname and in assembly this is accomplished by explicitly specifying the package in a symbol name. There are a total of two instances of “asm+push” on all of public GitHub, both of which are already broken under current rules.
“Go+pull” refers to when an unexported symbol defined in one package is referenced from another package via //go:linkname. “asm+xref” refers to any cross-package symbol reference from assembly. The vast majority of “asm+xref” references in public GitHub repositories are to a small set of runtime package functions like entersyscall, exitsyscall, and memmove. These are serious abstraction violations, but they're also easy to keep working.
There are two general groups of link failures in the above table, indicated by superscripts.
In group 1, the compiler will create an ABIInternal reference to a symbol that may only provide an ABI0 implementation. This can be worked around by ensuring there's a Go function stub for the symbol in the defining package. For “asm” definitions this is usually the case anyway, and “asm+push” definitions do not happen in practice outside the runtime. In all of these cases, type information is available at the reference site, so the compiler could record assembly ABI definitions in the export info and produce the stubs in the referencing package, assuming the defining package is imported.
In group 2, the assembler will create an ABI0 reference to a symbol that may only provide an ABIInternal implementation. In general, calls from assembly to Go are quite rare because they require either stack maps for the assembly code, or for the Go function and everything it calls recursively to be //go:nosplit (which is, in general, not possible to guarantee because of compiler-inserted calls). This can be worked around by creating a dummy reference from assembly in the defining package. For “asm+xref” references to exported symbols, it would be possible to address this transparently by using export info to construct the ABI wrapper when compiling the referring package, again assuming the defining package is imported.
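The two failure groups can be modeled as a simple lookup: a reference links only if a definition exists under the same (name, ABI) pair. The following is a minimal sketch of such a check, with hypothetical type, function, and symbol names; it is not the linker's actual logic.

```go
package main

import "fmt"

// Sym is a (name, ABI) pair, modeling a linker symbol and its version.
type Sym struct {
	Name string
	ABI  string
}

// Unresolved returns the references that have no definition under the
// same (name, ABI) pair. These model the link failures described above.
func Unresolved(defs map[Sym]bool, refs []Sym) []Sym {
	var missing []Sym
	for _, r := range refs {
		if !defs[r] {
			missing = append(missing, r)
		}
	}
	return missing
}

func main() {
	defs := map[Sym]bool{
		// Assembly function with no Go stub in its defining package.
		{"a.asmOnly", "ABI0"}: true,
		// Go function never referenced from its own package's assembly.
		{"b.goOnly", "ABIInternal"}: true,
	}
	refs := []Sym{
		{"a.asmOnly", "ABIInternal"}, // group 1: Go reference from another package
		{"b.goOnly", "ABI0"},         // group 2: assembly reference from another package
	}
	for _, r := range Unresolved(defs, refs) {
		fmt.Printf("undefined: %s (%s)\n", r.Name, r.ABI)
	}
}
```

Each workaround in the text amounts to adding the missing (name, ABI) entry to the definitions, by prompting the compiler to emit the corresponding wrapper in the defining package.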
The situations that cause these link failures are vanishingly rare in public code corpora (outside of the standard library itself), all depend on unsafe code, and all have reasonable workarounds. Hence, we conclude that the potential compatibility issues created by this proposal are worth the upsides.
One compatibility issue we found in public GitHub repositories was references from assembly to runtime.panic* functions. These calls to an unexported function are an obvious violation of modularity, but also a violation of the Go ABI because the callers invariably lack a stack map. If a stack growth or GC were to happen during this call, it would result in a fatal panic.
In these cases, we recommend wrapping the assembly function in a Go function that performs the necessary checks and then calls the assembly function. Typically, this Go function will be inlined into its caller, so this will not introduce additional call overhead.
For example, take a function that computes the pair-wise sums of two slices and requires its arguments to be the same length:
// func AddVecs(x, y []float64)
TEXT ·AddVecs(SB), NOSPLIT, $16
	// ... check lengths, put panic message on stack ...
	CALL runtime·panic(SB)
This should instead be written as a Go function that uses language facilities to panic, followed by a call to the assembly implementation that implements the operation:
func AddVecs(x, y []float64) {
	if len(x) != len(y) {
		panic("slices must be the same length")
	}
	addVecsAsm(x, y)
}
In this example, AddVecs is small enough that it will be inlined, so there's no additional overhead.
Austin Clements will implement this proposal for Go 1.12. This will allow the ABI split to soak for a release while the two calling conventions are in fact identical. Assuming that goes well, we can move on to changing the internal calling convention in Go 1.13.
Since both calling conventions will initially be identical, the implementation will initially use “ABI aliases” rather than full ABI wrappers. ABI aliases will be fully resolved by the Go linker, so in the final binary every symbol will still have one implementation and all calls (regardless of call ABI) will resolve to that implementation.
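Alias resolution can be pictured as a single map lookup performed at link time. The sketch below assumes a hypothetical alias table keyed by (name, ABI); the Sym type and Resolve function are illustrative names, not linker APIs.

```go
package main

import "fmt"

// Sym is a (name, ABI) pair, modeling a linker symbol and its version.
type Sym struct {
	Name string
	ABI  string
}

// Resolve returns the symbol that actually carries the implementation.
// If s is an ABI alias, the alias target is returned; otherwise s itself.
func Resolve(aliases map[Sym]Sym, s Sym) Sym {
	if target, ok := aliases[s]; ok {
		return target
	}
	return s
}

func main() {
	// A Go function referenced from assembly: the ABI0 name is an alias
	// for the single ABIInternal implementation.
	aliases := map[Sym]Sym{
		{"p.f", "ABI0"}: {"p.f", "ABIInternal"},
	}
	fmt.Println(Resolve(aliases, Sym{"p.f", "ABI0"}))
	fmt.Println(Resolve(aliases, Sym{"p.f", "ABIInternal"}))
}
```

Both lookups end at the same symbol, which matches the statement that every symbol still has one implementation in the final binary.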
The rough implementation steps are as follows:
Reserve space in the linker's symbol version numbering to represent symbol ABIs. Currently, all non-static symbols have version 0, so any linker code that depends on this will need to be updated.
Add a -gensymabis flag to cmd/asm that scans assembly sources for text symbol definitions and references and produces a “symbol ABIs” file rather than assembling the code.
Add a -symabis flag to cmd/compile that accepts this symbol ABIs file.
Update cmd/go, cmd/dist, and any other mini-build systems in the standard tree to invoke asm in -gensymabis mode and feed the result to compile.
Add support for recording symbol ABIs and ABI alias symbols to the object file format.
Modify cmd/link to resolve ABI aliases.
Modify cmd/compile to produce ABIInternal symbols for all Go functions, produce ABIInternal→ABI0 ABI aliases for Go functions referenced from assembly, and produce ABI0→ABIInternal ABI aliases for assembly functions referenced from Go.
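To make the hand-off between the assembler and the compiler in the steps above more concrete, here is a sketch of reading a symbol ABIs file. The proposal does not specify the file format, so the line-oriented "def NAME ABI" / "ref NAME ABI" layout used here is purely an assumption, as are the Entry type and ParseSymABIs function.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// Entry is one record of a hypothetical symbol ABIs file.
type Entry struct {
	Kind string // "def" for a definition, "ref" for a reference
	Name string // text symbol name
	ABI  string // ABI the assembly code defines or references
}

// ParseSymABIs parses the assumed format: one "def NAME ABI" or
// "ref NAME ABI" record per line, with blank lines ignored.
func ParseSymABIs(text string) ([]Entry, error) {
	var entries []Entry
	sc := bufio.NewScanner(strings.NewReader(text))
	for line := 1; sc.Scan(); line++ {
		fields := strings.Fields(sc.Text())
		if len(fields) == 0 {
			continue
		}
		if len(fields) != 3 || (fields[0] != "def" && fields[0] != "ref") {
			return nil, fmt.Errorf("line %d: malformed entry %q", line, sc.Text())
		}
		entries = append(entries, Entry{Kind: fields[0], Name: fields[1], ABI: fields[2]})
	}
	return entries, sc.Err()
}

func main() {
	entries, err := ParseSymABIs("def addVecsAsm ABI0\nref goCallback ABI0\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, e := range entries {
		fmt.Println(e.Kind, e.Name, e.ABI)
	}
}
```

Since assembly is always ABI0 under this proposal, the ABI column would carry real information only once another stable ABI becomes available to assembly.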
Once we're ready to modify the internal calling convention, the first step will be to produce actual ABI wrappers. We'll then likely want to start with a simple change, such as putting the stack bound in a fixed register.
There are a few open issues in this proposal.
How should tools that render symbols from object files (e.g., nm and objdump) display symbol ABIs? With ABI aliases, there's little need to show this (though it can affect how a symbol is resolved), but with full ABI wrappers it will become more pressing. Ideally this would be done in a way that doesn't significantly clutter the output.
How do we represent symbols with different ABI entry-points in platform object files, particularly in shared objects? In the initial implementation using ABI aliases, we can simply erase the ABI. It may be that we need to use minor name mangling to encode the symbol ABI in its name (though this does not have to affect the Go symbol name).
How should ABI wrappers and go:nosplit interact? In general, the wrapper needs to be go:nosplit if and only if the wrapped function is go:nosplit. However, for assembly functions, the wrapper is generated by the compiler and the compiler doesn't currently know whether the assembly function is go:nosplit. It could conservatively make wrappers for assembly functions go:nosplit, or the toolchain could include that information in the symabis file.
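The conservative option can be written as a small decision function. This is a sketch of the rule as stated above, not a decided design; WrapperNosplit and its parameters are hypothetical.

```go
package main

import "fmt"

// WrapperNosplit reports whether an ABI wrapper should be marked nosplit.
// known is false when the compiler cannot see whether the wrapped function
// is nosplit, as is currently the case for assembly functions.
func WrapperNosplit(nativeNosplit, known bool) bool {
	if !known {
		// Conservative choice: treat the wrapped function as nosplit.
		return true
	}
	// Otherwise the wrapper is nosplit if and only if the wrapped function is.
	return nativeNosplit
}

func main() {
	fmt.Println(WrapperNosplit(false, true))  // Go function, not nosplit
	fmt.Println(WrapperNosplit(true, true))   // Go function, nosplit
	fmt.Println(WrapperNosplit(false, false)) // assembly function, unknown
}
```

If the symabis file carried the nosplit information, the known parameter would always be true and the conservative branch would disappear.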