| Go in Go |
| Gopherfest |
| 26 May 2015 |
| |
| Rob Pike |
| Google |
| r@golang.org |
| http://golang.org/ |
| |
| * Go in Go |
| |
| As of the 1.5 release of Go, the entire system is now written in Go. |
| (And a little assembler.) |
| |
| C is gone. |
| |
| Side note: `gccgo` is still going strong. |
| This talk is about the original compiler, `gc`. |
| |
| * Why was it in C? |
| |
| Bootstrapping. |
| |
| (Also Go was not intended primarily as a compiler implementation language.) |
| |
| * Why move the compiler to Go? |
| |
| Not for validation; we have more pragmatic motives: |
| |
| - Go is easier to write (correctly) than C. |
| - Go is easier to debug than C (even absent a debugger). |
| - Go is the only language you'd need to know; encourages contributions. |
| - Go has better modularity, tooling, testing, profiling, ... |
| - Go makes parallel execution trivial. |
| |
| Already seeing benefits, and it's early yet. |
| |
| Design document: [[http://golang.org/s/go13compiler]] |
| |
| * Why move the runtime to Go? |
| |
| We had our own C compiler just to compile the runtime. |
| We needed a C compiler that shared Go's ABI, including details such as segmented stacks. |
| |
| Switching it to Go means we can get rid of the C compiler. |
| That's more important than converting the compiler to Go. |
| |
| (All the reasons for moving the compiler apply to the runtime as well.) |
| |
| Now only one language in the runtime; easier integration, stack management, etc. |
| |
| |
| As always, simplicity is the overriding consideration. |
| |
| * History |
| |
| Why do we have our own tool chain at all? |
| Our own ABI? |
| Our own file formats? |
| |
| History, familiarity, and ease of moving forward. And speed. |
| |
| Many of Go's big changes would be much harder with GCC or LLVM. |
| |
| .link https://news.ycombinator.com/item?id=8817990 |
| |
| * Big changes |
| |
| All made easier by owning the tools and/or moving to Go: |
| |
| - linker rearchitecture |
| - new garbage collector |
| - stack maps |
| - contiguous stacks |
| - write barriers |
| |
| The last three are all but impossible in C: |
| |
| - C is not type safe; don't always know what's a pointer |
| - aliasing of stack slots caused by optimization |
| |
| (`Gccgo` will have segmented stacks and imprecise (stack) collection for a while yet.) |
| |
| * Goroutine stacks |
| |
| - Through 1.2: Stacks were segmented. |
| - 1.3: Stacks were contiguous unless executing C code (runtime). |
| - 1.4: Stacks made contiguous by restricting C to system stack. |
| - 1.5: Stacks made contiguous by eliminating C. |
| |
| These were each huge steps, made quickly (led by `khr@`). |
| |
| * Converting the runtime |
| |
| Mostly done by hand with machine assistance. |
| |
| Challenge to implement the runtime in a safe language. |
| Some use of `unsafe` to deal with pointers as raw bits in the GC, for instance. |
| But less than you might think. |
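| |
| A minimal sketch of what "pointers as raw bits" means, not taken from the runtime: the helper names `addrBits` and `isAligned` are hypothetical. It views a pointer as an integer through `unsafe.Pointer` and `uintptr`. |

```go
package main

import (
	"fmt"
	"unsafe"
)

// addrBits views a pointer as a plain integer. The result is only a number:
// the garbage collector does not treat it as a reference to the object.
func addrBits(p *int64) uintptr {
	return uintptr(unsafe.Pointer(p))
}

// isAligned reports whether p's address is a multiple of the alignment
// that the compiler requires for an int64.
func isAligned(p *int64) bool {
	return addrBits(p)%unsafe.Alignof(*p) == 0
}

func main() {
	var x int64
	fmt.Printf("address bits: %#x\n", addrBits(&x))
	fmt.Println("aligned:", isAligned(&x))
}
```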
| |
| The translator (next sections) helped with some of the translation. |
| |
| * Converting the compiler |
| |
| Why translate it, not write it from scratch? Correctness, testing. |
| |
| Steps: |
| |
| - Write a custom translator from C to Go. |
| - Run the translator, iterate until success. |
| - Measure success by bit-identical output. |
| - Clean up the code by hand and by machine. |
| - Turn it from C-in-Go to idiomatic Go (still happening). |
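| |
| The "bit-identical output" check can be sketched as a byte comparison of what the old and new compilers produce. This is an illustration only; `firstDiff` is a hypothetical helper, not the script the Go team used. |

```go
package main

import (
	"bytes"
	"fmt"
)

// firstDiff returns the offset of the first byte at which a and b differ,
// or -1 if the two outputs are bit-identical.
func firstDiff(a, b []byte) int {
	if bytes.Equal(a, b) {
		return -1
	}
	n := len(a)
	if len(b) < n {
		n = len(b)
	}
	for i := 0; i < n; i++ {
		if a[i] != b[i] {
			return i
		}
	}
	return n // one output is a strict prefix of the other
}

func main() {
	// Stand-ins for object files written by the C and the Go compiler.
	fromC := []byte{0x7f, 'E', 'L', 'F', 1}
	fromGo := []byte{0x7f, 'E', 'L', 'F', 2}
	if off := firstDiff(fromC, fromGo); off >= 0 {
		fmt.Printf("outputs differ at byte %d\n", off)
	} else {
		fmt.Println("bit-identical")
	}
}
```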
| |
| * Translator |
| |
| First output was C line-by-line translated to (bad!) Go. |
| Tool to do this written by `rsc@` (talked about at GopherCon 2014). |
| Custom written for this job, not a general C-to-Go translator. |
| |
| Steps: |
| |
| - Parse C code using new simple C parser (`yacc`) |
| - Remove or rewrite C-isms such as `*p++` as an expression |
| - Walk the C parse tree, print the C code in Go syntax |
| - Compile the output |
| - Run, compare generated code |
| - Repeat |
| |
| The `yacc` grammar was translated by hand, with help from the `sam` editor. |
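| |
| To illustrate the `*p++` step: in Go, `p++` is a statement, not an expression, and there is no pointer arithmetic. The sketch below is a hypothetical example, not actual translator output. It shows a C loop, a line-by-line "C-in-Go" rendering, and the idiomatic form that later cleanup aims for. |

```go
package main

import "fmt"

// C original (hypothetical):
//
//	while (n-- > 0)
//		sum += *p++;

// sumTranslated mirrors the C loop: the read through the pointer and the
// increment become two separate statements over a slice index.
func sumTranslated(buf []int) int {
	sum := 0
	p := 0
	for n := len(buf); n > 0; n-- {
		sum += buf[p]
		p++
	}
	return sum
}

// sumIdiomatic is what the same loop looks like after cleanup.
func sumIdiomatic(buf []int) int {
	sum := 0
	for _, v := range buf {
		sum += v
	}
	return sum
}

func main() {
	buf := []int{1, 2, 3, 4}
	fmt.Println(sumTranslated(buf), sumIdiomatic(buf))
}
```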
| |
| * Translator configuration |
| |
| Aided by hand-written rewrite rules, such as: |
| |
| - this field is a bool |
| - this function returns a bool |
| |
| Also diff-like rewrites for things such as using the standard library: |
| |
|     diff { |
|     -   g.Rpo = obj.Calloc(g.Num*sizeof(g.Rpo[0]), 1).([]*Flow) |
|     -   idom = obj.Calloc(g.Num*sizeof(idom[0]), 1).([]int32) |
|     -   if g.Rpo == nil || idom == nil { |
|     -       Fatal("out of memory") |
|     -   } |
|     +   g.Rpo = make([]*Flow, g.Num) |
|     +   idom = make([]int32, g.Num) |
|     } |
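| |
| Why the rewrite is safe, as a small sketch: `make` returns a slice whose elements are already zeroed and it has no nil return to check, so both the `Calloc` call and the out-of-memory test can go. `Flow` and `newTables` here are stand-ins, not the compiler's definitions. |

```go
package main

import "fmt"

// Flow is a stand-in for the compiler's flow-graph node type.
type Flow struct{ ID int }

// newTables allocates the two tables from the rewrite rule above.
// make zero-initializes every element: nil pointers and zero int32s.
func newTables(num int) (rpo []*Flow, idom []int32) {
	rpo = make([]*Flow, num)
	idom = make([]int32, num)
	return rpo, idom
}

func main() {
	rpo, idom := newTables(3)
	fmt.Println(len(rpo), rpo[0] == nil, idom)
}
```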
| |
| * Another example |
| |
| This one due to semantic difference between the languages. |
| |
|     diff { |
|     -   if nreg == 64 { |
|     -       mask = ^0 // can't rely on C to shift by 64 |
|     -   } else { |
|     -       mask = (1 << uint(nreg)) - 1 |
|     -   } |
|     +   mask = (1 << uint(nreg)) - 1 |
|     } |
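| |
| The semantic difference, sketched: in C, shifting a 64-bit value by 64 is undefined, while Go defines a shift by the full width (or more) of an unsigned value to yield 0, so subtracting 1 wraps around to all ones. The example assumes `mask` is a `uint64`; `regMask` is a hypothetical name. |

```go
package main

import "fmt"

// regMask returns a mask with the low nreg bits set.
// For nreg == 64 the shift yields 0 and 0 - 1 wraps to all ones,
// so no special case is needed.
func regMask(nreg int) uint64 {
	return (uint64(1) << uint(nreg)) - 1
}

func main() {
	for _, n := range []int{0, 3, 64} {
		fmt.Printf("nreg=%d mask=%#x\n", n, regMask(n))
	}
}
```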
| |
| * Grind |
| |
| Once in Go, new tool `grind` deployed (by `rsc@`): |
| |
| - parses Go, type checks |
| - records a list of edits to perform: "insert this text at this position" |
| - at end, applies the edits to the source text (editing the AST directly is hard) |
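| |
| The record-then-apply approach can be sketched as a small edit buffer. This is not `grind`'s actual code: the `buffer` and `edit` types are hypothetical, and the sketch assumes edits do not overlap. Applying from the end of the file backward keeps the earlier byte offsets valid. |

```go
package main

import (
	"fmt"
	"sort"
)

// edit replaces src[start:end] with text; start == end is a pure insertion.
type edit struct {
	start, end int
	text       string
}

// buffer holds the original source and the edits recorded against it.
type buffer struct {
	src   string
	edits []edit
}

func (b *buffer) Insert(pos int, text string) {
	b.edits = append(b.edits, edit{pos, pos, text})
}

func (b *buffer) Delete(start, end int) {
	b.edits = append(b.edits, edit{start, end, ""})
}

// Apply performs all recorded edits, last position first, so that offsets
// recorded against the original text stay correct. Edits must not overlap.
func (b *buffer) Apply() string {
	sort.SliceStable(b.edits, func(i, j int) bool {
		return b.edits[i].start > b.edits[j].start
	})
	out := b.src
	for _, e := range b.edits {
		out = out[:e.start] + e.text + out[e.end:]
	}
	return out
}

func main() {
	b := &buffer{src: "x = 1\ny = 2\n"}
	b.Insert(0, "var x int\n") // add a declaration at the top
	b.Delete(6, 12)            // drop the dead assignment "y = 2\n"
	fmt.Print(b.Apply())
}
```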
| |
| Changes guided by profiling and other analysis: |
| |
| - removes dead code |
| - removes gotos |
| - removes unused labels, needless indirections, etc. |
| - moves `var` declarations nearer to first use |
| |
| .link http://rsc.io/grind |
| |
| * Performance problems |
| |
| Output from translator was poor Go, and ran about 10X slower. |
| Most of that slowdown has been recovered. |
| |
| Problems with C to Go: |
| |
| - C patterns can be poor Go, e.g. complex `for` loops |
| - C stack variables never escape; Go compiler isn't as sure |
| - interfaces such as `fmt.Stringer` vs. C's `varargs` |
| - no `unions` in Go, so use `structs` instead: bloat |
| - variable declarations in wrong place |
| |
| C compiler didn't free much memory, but Go has a GC. |
| Adds CPU and memory overhead. |
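| |
| The union-to-struct bloat can be made concrete. The type below is a hypothetical example, not one from the compiler: a C union of an `int64_t` and a `double` occupies 8 bytes, while the direct Go translation keeps both fields. |

```go
package main

import (
	"fmt"
	"unsafe"
)

// C original (hypothetical): the two members share the same 8 bytes.
//
//	union { int64_t i; double f; } val;

// val is the direct translation: Go has no unions, so both fields
// are stored side by side even though only one is in use at a time.
type val struct {
	i int64
	f float64
}

// valSize reports how many bytes one val occupies.
func valSize() uintptr {
	return unsafe.Sizeof(val{})
}

func main() {
	fmt.Println("bytes per value:", valSize())
}
```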
| |
| * Performance fixes |
| |
| Profile! (Never done before!) |
| |
| - move `vars` closer to first use |
| - split `vars` into multiple variables |
| - replace code in the compiler with code in the library: e.g. `math/big` |
| - use interface or other tricks to combine `struct` fields |
| - better escape analysis (`drchase@`) |
| - hand tuning code and data layout |
| |
| Use tools like `grind`, `gofmt` `-r` and `eg` for much of this. |
| |
| Removing an interface argument from a debugging print library gave a 15% overall speedup! |
| |
| More remains to be done. |
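| |
| Once the code is in Go, profiling needs only the standard library. A minimal sketch using `runtime/pprof`; the `work` and `profileWork` functions are placeholders, and the profile goes to an in-memory buffer just to keep the example self-contained. |

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/pprof"
)

// work is a placeholder for the code being measured.
func work() int {
	total := 0
	for i := 0; i < 50000000; i++ {
		total += i % 7
	}
	return total
}

// profileWork runs work under the CPU profiler and returns the raw profile.
func profileWork() ([]byte, int, error) {
	var buf bytes.Buffer
	if err := pprof.StartCPUProfile(&buf); err != nil {
		return nil, 0, err
	}
	result := work()
	pprof.StopCPUProfile() // flushes the profile into buf
	return buf.Bytes(), result, nil
}

func main() {
	data, result, err := profileWork()
	if err != nil {
		fmt.Println("profiling failed:", err)
		return
	}
	fmt.Printf("result %d, profile is %d bytes\n", result, len(data))
}
```

| In practice the profile would be written to a file and examined with `go` `tool` `pprof`. |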
| |
| * Technical benefits |
| |
| Other benefits of the conversion: |
| |
| Garbage collection means no more worry about introducing a dangling pointer. |
| |
| Chance to clean up the back ends. |
| |
| Unified `386` and `amd64` architectures throughout the tool chain. |
| |
| New architectures are easier to add. |
| |
| Unified the tools: now one compiler, one assembler, one linker. |
| |
| * Compiler |
| |
| `GOOS=YYY` `GOARCH=XXX` `go` `tool` `compile` |
| |
| One compiler; no more `6g`, `8g` etc. |
| |
| About 50K lines of portable code. |
| Even the registerizer is portable now; architectures well characterized. |
| Non-portable: Peepholing, details like registers bound to instructions. |
| The non-portable part is typically around 10% the size of the portable code. |
| |
| * Assembler |
| |
| `GOOS=YYY` `GOARCH=XXX` `go` `tool` `asm` |
| |
| New assembler, all in Go, written from scratch by `r@`. |
| Clean, idiomatic Go code. |
| |
| Less than 4000 lines, <10% machine-dependent. |
| |
| Almost completely compatible with previous `yacc` and C assemblers. |
| |
| How is this possible? |
| |
| - shared syntax originating in the Plan 9 assemblers |
| - unified back-end logic (old `liblink`, now `internal/obj`) |
| |
| * Linker |
| |
| `GOOS=YYY` `GOARCH=XXX` `go` `tool` `link` |
| |
| Mostly hand- and machine-translated from C code. |
| |
| New library, `internal/obj`, formerly part of the linker, captures details about machines and writes object files. |
| |
| 27000 lines summed across 4 architectures, mostly tables (plus some ugliness). |
| |
| - `arm`: 4000 |
| - `arm64`: 6000 |
| - `ppc64`: 5000 |
| - `x86`: 7500 (`386` and `amd64`) |
| |
| Example benefit: one print routine to print any instruction for any architecture. |
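| |
| How one print routine can serve every architecture, as a sketch: the machine-specific part shrinks to tables and the formatting logic is shared. The `arch` and `prog` types and the tiny tables are hypothetical, far simpler than what `internal/obj` really holds. |

```go
package main

import "fmt"

// arch holds the machine-specific tables; everything else is shared.
type arch struct {
	name     string
	opNames  map[int]string
	regNames map[int]string
}

// prog is a simplified two-operand instruction.
type prog struct {
	op       int
	from, to int
}

// format is the single print routine: it works for any architecture
// because all machine knowledge comes from the tables.
func (a *arch) format(p prog) string {
	return fmt.Sprintf("%s %s, %s", a.opNames[p.op], a.regNames[p.from], a.regNames[p.to])
}

var amd64 = &arch{
	name:     "amd64",
	opNames:  map[int]string{0: "MOVQ", 1: "ADDQ"},
	regNames: map[int]string{0: "AX", 1: "BX"},
}

var arm = &arch{
	name:     "arm",
	opNames:  map[int]string{0: "MOVW", 1: "ADD"},
	regNames: map[int]string{0: "R0", 1: "R1"},
}

func main() {
	p := prog{op: 0, from: 0, to: 1}
	for _, a := range []*arch{amd64, arm} {
		fmt.Printf("%-6s %s\n", a.name, a.format(p))
	}
}
```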
| |
| * Bootstrap |
| |
| With no C compiler, bootstrapping requires a Go compiler. |
| |
| Therefore you need to build or download a working Go installation to build 1.5 from source. |
| |
| We use Go 1.4+ as the base to build the 1.5+ tool chain. (Newer is OK too.) |
| |
| Details: [[http://golang.org/s/go15bootstrap]] |
| |
| * Future |
| |
| Much work still to do, but 1.5 is mostly set. |
| |
| Future work: |
| |
| Better escape analysis. |
| New compiler back end using SSA (much easier in Go than C). |
| Will allow much more optimization. |
| |
| Generate machine descriptions from PDFs (or maybe XML). |
| Will have a purely machine-generated instruction definition: |
| "Read in PDF, write out an assembler configuration". |
| Already deployed for the disassemblers. |
| |
| * Conclusions |
| |
| Getting rid of C was a huge advance for the project. |
| Code is cleaner, testable, profilable, easier to work on. |
| |
| New unified tool chain reduces code size, increases maintainability. |
| |
| Flexible tool chain, portability still paramount. |
| |
| |