| <!--- | 
 | // Copyright 2018 The Go Authors. All rights reserved. | 
 | // Use of this source code is governed by a BSD-style | 
 | // license that can be found in the LICENSE file. | 
 | --> | 
 |  | 
 | ## Introduction to the Go compiler | 
 |  | 
 | `cmd/compile` contains the main packages that form the Go compiler. The compiler | 
 | may be logically split in four phases, which we will briefly describe alongside | 
 | the list of packages that contain their code. | 
 |  | 
 | You may sometimes hear the terms "front-end" and "back-end" when referring to | 
 | the compiler. Roughly speaking, these translate to the first two and last two | 
 | phases we are going to list here. A third term, "middle-end", often refers to | 
 | much of the work that happens in the second phase. | 
 |  | 
 | Note that the `go/*` family of packages, such as `go/parser` and | 
 | `go/types`, are mostly unused by the compiler. Since the compiler was | 
 | initially written in C, the `go/*` packages were developed to enable | 
 | writing tools working with Go code, such as `gofmt` and `vet`. | 
 | However, over time the compiler's internal APIs have slowly evolved to | 
 | be more familiar to users of the `go/*` packages. | 
 |  | 
 | It should be clarified that the name "gc" stands for "Go compiler", and has | 
 | little to do with uppercase "GC", which stands for garbage collection. | 
 |  | 
 | ### 1. Parsing | 
 |  | 
 | * `cmd/compile/internal/syntax` (lexer, parser, syntax tree) | 
 |  | 
 | In the first phase of compilation, source code is tokenized (lexical analysis), | 
 | parsed (syntax analysis), and a syntax tree is constructed for each source | 
 | file. | 
 |  | 
 | Each syntax tree is an exact representation of the respective source file, with | 
 | nodes corresponding to the various elements of the source such as expressions, | 
 | declarations, and statements. The syntax tree also includes position information | 
 | which is used for error reporting and the creation of debugging information. | 
 |  | 
 | ### 2. Type checking | 
 |  | 
 | * `cmd/compile/internal/types2` (type checking) | 
 |  | 
 | The types2 package is a port of `go/types` to use the syntax package's | 
 | AST instead of `go/ast`. | 
 |  | 
 | ### 3. IR construction ("noding") | 
 |  | 
 | * `cmd/compile/internal/types` (compiler types) | 
 | * `cmd/compile/internal/ir` (compiler AST) | 
 | * `cmd/compile/internal/typecheck` (AST transformations) | 
 | * `cmd/compile/internal/noder` (create compiler AST) | 
 |  | 
 | The compiler middle end uses its own AST definition and representation of Go | 
 | types carried over from when it was written in C. All of its code is written in | 
 | terms of these, so the next step after type checking is to convert the syntax | 
 | and types2 representations to ir and types. This process is referred to as | 
 | "noding." | 
 |  | 
 | There are currently two noding implementations: | 
 |  | 
 | 1. irgen (aka "-G=3" or sometimes "noder2") is the implementation used starting | 
 |    with Go 1.18, and | 
 |  | 
 | 2. Unified IR is another, in-development implementation (enabled with | 
 |    `GOEXPERIMENT=unified`), which also implements import/export and inlining. | 
 |  | 
 | Up through Go 1.18, there was a third noding implementation (just | 
 | "noder" or "-G=0"), which directly converted the pre-type-checked | 
 | syntax representation into IR and then invoked package typecheck's | 
 | type checker. This implementation was removed after Go 1.18, so now | 
 | package typecheck is only used for IR transformations. | 
 |  | 
 | ### 4. Middle end | 
 |  | 
 | * `cmd/compile/internal/deadcode` (dead code elimination) | 
 | * `cmd/compile/internal/inline` (function call inlining) | 
 | * `cmd/compile/internal/devirtualize` (devirtualization of known interface method calls) | 
 | * `cmd/compile/internal/escape` (escape analysis) | 
 |  | 
 | Several optimization passes are performed on the IR representation: | 
 | dead code elimination, (early) devirtualization, function call | 
 | inlining, and escape analysis. | 
 |  | 
 | ### 5. Walk | 
 |  | 
 | * `cmd/compile/internal/walk` (order of evaluation, desugaring) | 
 |  | 
 | The final pass over the IR representation is "walk," which serves two purposes: | 
 |  | 
 | 1. It decomposes complex statements into individual, simpler statements, | 
 |    introducing temporary variables and respecting order of evaluation. This step | 
 |    is also referred to as "order." | 
 |  | 
 | 2. It desugars higher-level Go constructs into more primitive ones. For example, | 
 |    `switch` statements are turned into binary search or jump tables, and | 
 |    operations on maps and channels are replaced with runtime calls. | 
 |  | 
 | ### 6. Generic SSA | 
 |  | 
 | * `cmd/compile/internal/ssa` (SSA passes and rules) | 
 | * `cmd/compile/internal/ssagen` (converting IR to SSA) | 
 |  | 
 | In this phase, IR is converted into Static Single Assignment (SSA) form, a | 
 | lower-level intermediate representation with specific properties that make it | 
 | easier to implement optimizations and to eventually generate machine code from | 
 | it. | 
 |  | 
 | During this conversion, function intrinsics are applied. These are special | 
 | functions that the compiler has been taught to replace with heavily optimized | 
 | code on a case-by-case basis. | 
 |  | 
 | Certain nodes are also lowered into simpler components during the AST to SSA | 
 | conversion, so that the rest of the compiler can work with them. For instance, | 
 | the copy builtin is replaced by memory moves, and range loops are rewritten into | 
 | for loops. Some of these currently happen before the conversion to SSA due to | 
 | historical reasons, but the long-term plan is to move all of them here. | 
 |  | 
 | Then, a series of machine-independent passes and rules are applied. These do not | 
 | concern any single computer architecture, and thus run on all `GOARCH` variants. | 
 | These passes include dead code elimination, removal of | 
 | unneeded nil checks, and removal of unused branches. The generic rewrite rules | 
 | mainly concern expressions, such as replacing some expressions with constant | 
 | values, and optimizing multiplications and float operations. | 
 |  | 
 | ### 7. Generating machine code | 
 |  | 
 | * `cmd/compile/internal/ssa` (SSA lowering and arch-specific passes) | 
 | * `cmd/internal/obj` (machine code generation) | 
 |  | 
 | The machine-dependent phase of the compiler begins with the "lower" pass, which | 
 | rewrites generic values into their machine-specific variants. For example, on | 
 | amd64 memory operands are possible, so many load-store operations may be combined. | 
 |  | 
 | Note that the lower pass runs all machine-specific rewrite rules, and thus it | 
 | currently applies lots of optimizations too. | 
 |  | 
 | Once the SSA has been "lowered" and is more specific to the target architecture, | 
 | the final code optimization passes are run. This includes yet another dead code | 
 | elimination pass, moving values closer to their uses, the removal of local | 
 | variables that are never read from, and register allocation. | 
 |  | 
 | Other important pieces of work done as part of this step include stack frame | 
 | layout, which assigns stack offsets to local variables, and pointer liveness | 
 | analysis, which computes which on-stack pointers are live at each GC safe point. | 
 |  | 
 | At the end of the SSA generation phase, Go functions have been transformed into | 
 | a series of obj.Prog instructions. These are passed to the assembler | 
 | (`cmd/internal/obj`), which turns them into machine code and writes out the | 
 | final object file. The object file will also contain reflect data, export data, | 
 | and debugging information. | 
 |  | 
 | ### Further reading | 
 |  | 
 | To dig deeper into how the SSA package works, including its passes and rules, | 
 | head to [cmd/compile/internal/ssa/README.md](internal/ssa/README.md). |