| The Research Problems of Implementing Go |
| |
| Russ Cox |
| Google |
| |
| http://golang.org/ |
| |
| * About the Talk |
| |
| I gave this talk at Google's Cambridge, Massachusetts office at an event for area Ph.D. students. The purpose of the event and the talk was to give a sense of some of the research that goes on at Google. The talk presents some research questions motivated by Go. We have answered some well, but others remain open. |
| |
| * About Go |
| |
| Go is an open source programming language that makes it easy to build simple, reliable, and efficient software. |
| |
| Design began in late 2007. |
| |
| - Robert Griesemer, Rob Pike, Ken Thompson |
| - Russ Cox, Ian Lance Taylor |
| |
| Became open source in November 2009. |
| |
| Developed entirely in the open; very active community. |
| Language stable as of Go 1, early 2012. |
| Work continues. |
| |
| * Motivation for Go |
| |
| .image ../2012/splash/datacenter.jpg |
| |
| * Motivation for Go |
| |
| Started as an answer to software problems at Google: |
| |
| - multicore processors |
| - networked systems |
| - massive computation clusters |
| - scale: 10⁷ lines of code |
| - scale: 10³ programmers |
| - scale: 10⁶⁺ machines (design point) |
| |
| Deployed: parts of YouTube, dl.google.com, Blogger, Google Code, Google Fiber, ... |
| |
| * Go |
| |
| A simple but powerful and fun language. |
| |
| - start with C, remove complex parts |
| - add interfaces, concurrency |
| - also: garbage collection, closures, reflection, strings, ... |
| |
| For more background on design: |
| |
| - [[http://commandcenter.blogspot.com/2012/06/less-is-exponentially-more.html][Less is exponentially more]] |
| - [[http://talks.golang.org/2012/splash.article][Go at Google: Language Design in the Service of Software Engineering]] |
| |
| * Research and Go |
| |
| Go is designed for building production systems at Google. |
| |
| - Goal: make that job easier, faster, better. |
| - Non-goal: break new ground in programming language research |
| |
| Plenty of research questions about how to implement Go well. |
| |
| - Concurrency |
| - Polymorphism |
| - Garbage collection |
| - Program translation |
| |
| * Concurrency |
| |
| * Concurrency |
| |
| Go provides two important concepts: |
| |
| A goroutine is a thread of control within the program, with its own local variables and stack. Cheap, easy to create. |
| |
| A channel carries typed messages between goroutines. |
| |
| * Concurrency |
| |
| .play ../2013/distsys/hello.go |
| |
| * Concurrency: CSP |
| |
| Channels adopted from Hoare's Communicating Sequential Processes. |
| |
| - Orthogonal to rest of language |
| - Can keep familiar model for computation |
| - Focus on _composition_ of regular code |
| |
| Go _enables_ simple, safe concurrent programming. |
| It doesn't _forbid_ bad programming. |
| |
| Caveat: not purely memory safe; sharing is legal. |
| Passing a pointer over a channel is idiomatic. |
| |
| Experience shows this is practical. |
| |
| * Concurrency |
| |
| Sequential network address resolution, given a work list: |
| |
| .play ../2013/distsys/addr1.go /lookup/+1,/^}/-1 |
| |
| * Concurrency |
| |
| Parallel network address resolution, given a work list: |
| |
| .play ../2013/distsys/addr2.go /lookup/+1,/^}/-1 |
| |
| * Implementing Concurrency |
| |
| Challenge: Make channel communication scale |
| |
| - start with one global channel lock |
| - per-channel locks, locked in address order for multi-channel operations |
| |
| Research question: lock-free channels? |
| |
| * Polymorphism |
| |
| * Interfaces |
| |
| An interface defines a set of methods. |
| |
| package io |
| |
| type Writer interface { |
| Write(data []byte) (n int, err error) |
| } |
| |
| * Interfaces |
| |
| A type implements the interface by implementing the methods. |
| |
| package bytes |
| |
| type Buffer struct { |
| ... |
| } |
| |
| func (b *Buffer) Write(data []byte) (n int, err error) { |
| ... |
| } |
| |
| * Interfaces |
| |
| An implementation of an interface can be assigned to a variable of that interface type. |
| |
| package fmt |
| |
| func Fprintf(w io.Writer, format string, args ...interface{}) |
| |
| * Interfaces |
| |
| .play ../2013/distsys/writebuffer.go /^func.main/+1,/^}/-1 |
| |
| * Interface Advantages |
| |
| - no dependence between interface and implementation |
| - easy testing |
| - avoids overdesign, rigid hierarchy of inheritance-based OO |
| |
| The source of all generality in the Go language. |
| |
| * Implementing Interfaces |
| |
| How do you make method dispatch efficient? |
| |
| b := new(bytes.Buffer) |
| var w io.Writer |
| w = b |
| fmt.Fprintf(w, "hello, %s\n", "world") |
| ... w.Write(text) // what happens here? |
| |
| At w.Write call, how does the runtime find the method to call? |
| |
| * Implementing Interfaces |
| |
| How do you make method dispatch efficient? |
| |
| b := new(bytes.Buffer) |
| var w io.Writer |
| w = b // do the work here! |
| fmt.Fprintf(w, "hello, %s\n", "world") |
| ... w.Write(text) // plain indirect function call |
| |
| Interface holds two words: "itable" and actual value (or pointer to value). |
| |
| Itable contains type information plus list of function pointers for methods in interface. |
| |
| w.itable.fn[1](w.data, text) |
| |
| Conversion sites usually trivial to cache. |
| |
| * Interfaces for Algorithms |
| |
| package sort |
| |
| type Interface interface { |
| Len() int // return number of elements, len(x) |
| Less(i, j int) bool // report whether x[i] < x[j] |
| Swap(i, j int) // x[i], x[j] = x[j], x[i] |
| } |
| |
| func Sort(data Interface) |
| |
| Requires some boilerplate for each use: |
| |
| type bySubject []Thread |
| |
| func (x bySubject) Less(i, j int) bool { return x[i].Subject < x[j].Subject } |
| func (x bySubject) Swap(i, j int) { x[i], x[j] = x[j], x[i] } |
| func (x bySubject) Len() int { return len(x) } |
| |
| sort.Sort(bySubject(threads)) |
| |
| * Polymorphism: Can we do better? |
| |
| func Sort(data []T, less func(x, y *T) bool) |
| |
| sort.Sort(threads, func(x, y *Thread) bool { |
| return x.Subject < y.Subject |
| }) |
| |
| Research question: what's a reasonable semantics? |
| Research question: what's a reasonable implementation? |
| |
| - C says don't bother. |
| - C++ makes many copies of the same function. |
| - Java boxes everything implicitly: one function, but expensive data model. |
| - Java discards run-time type information. |
| |
| Do you want slow programmers, slow compilers and bloated binaries, or slow execution? |
| |
| * Garbage Collection |
| |
| * Garbage Collection |
| |
| Garbage collection simplifies APIs. |
| |
| - In C and C++, too much API design (and too much programming effort!) is about memory management. |
| |
| Fundamental to concurrency: too hard to track ownership otherwise. |
| |
| Fundamental to interfaces: memory management details do not bifurcate otherwise-similar APIs. |
| |
| Of course, adds cost, latency, complexity in run time system. |
| |
| * Avoiding Garbage Collection |
| |
| Observation: garbage collection is a service, and like any service it can be overloaded, oversubscribed. |
| |
| Go lets you limit allocation by controlling memory layout. |
| |
| type Point struct { |
| X, Y int |
| } |
| |
| type Rectangle struct { |
| Min, Max Point |
| } |
| |
| * Implementing Garbage Collection |
| |
| Language decision: interior pointers are allowed, as are foreign pointers |
| |
| - Cannot reuse Java GC algorithms directly. |
| - But gives _programmer_ more control over allocation. |
| |
| Allocator: objects are allocated in pages with other objects of the same size. |
| |
| Current GC: stop the world, parallel mark, start the world, background sweep. |
| |
| Research question: how to make collector lower latency, possibly incremental? |
| |
| * Program Translation |
| |
| * Program Translation |
| |
| Go programs can be parsed without context (unlike C and C++). |
| Go ships with a standard program formatter. |
| |
| Makes automated edits indistinguishable from manual edits. |
| |
| $ cat x.go |
| package main |
| |
| var b bytes.Buffer |
| |
| $ gofmt -r 'bytes.Buffer -> bytes.Writer' x.go |
| package main |
| |
| var b bytes.Writer |
| |
| $ |
| |
| More advanced rewrites: "go fix" for API adjustments. |
| |
| * Program Translation |
| |
| Research Question: What about translating other programs to Go? |
| |
| Exploring the conversion of C programs to Go today. |
| |
| - Decide return type (for example, int vs bool). |
| - Decide which variables are pointers vs arrays. |
| - Decide which functions are really methods. |
| - Decide natural package boundaries. |
| |
| What about other languages? |
| |
| * Research and Go |
| |
| Plenty of research questions about how to implement Go well. |
| |
| - Concurrency |
| - Polymorphism |
| - Garbage collection |
| - Program translation |
| |