| Go, for Distributed Systems |
| |
| Russ Cox |
| Google |
| |
| * About the Talk |
| |
| I gave variants of this talk three times in 2013, once at SOSP's Programming Languages and Operating Systems (PLOS) workshop, once at MIT Lincoln Lab's annual Software Engineering Symposium, and once at Twitter's Cambridge, Massachusetts office. |
| |
| The talk assumes an audience familiar with the basic problems of building distributed systems. It presents Go's approach to solving some of those problems. |
| |
| * Go |
| |
| Go is an open source programming language that makes it easy to build simple, reliable, and efficient software. |
| |
| * History |
| |
| Design began in late 2007. |
| |
| - Robert Griesemer, Rob Pike, Ken Thompson |
| - Ian Lance Taylor, Russ Cox |
| |
| Became open source in November 2009. |
| |
| Developed entirely in the open; very active community. |
| Language stable as of Go 1, early 2012. |
| |
| * Motivation |
| |
| Started as an answer to software problems at Google: |
| |
| - multicore processors |
| - networked systems |
| - massive computation clusters |
| - scale: 10⁷ lines of code |
| - scale: 10³ programmers |
| - scale: 10⁶⁺ machines (design point) |
| |
| * Go |
| |
| A simple but powerful and fun language. |
| |
| - start with C, remove complex parts |
| - add interfaces, concurrency |
| - also: garbage collection, closures, reflection, strings, ... |
| |
| For more background on design: |
| |
| - [[http://commandcenter.blogspot.com/2012/06/less-is-exponentially-more.html][Less is exponentially more]] |
| - [[http://go.dev/talks/2012/splash.article][Go at Google: Language Design in the Service of Software Engineering]] |
| |
| * This Talk |
| |
| Engineering |
| |
| Interfaces |
| |
| Concurrency |
| |
| * Engineering |
| |
| * Engineering: Imports |
| |
| .play distsys/hello0.go |
| |
| import "fmt" guaranteed to read exactly one file. |
| |
| * Engineering: Imports |
| |
| $ go get github.com/golang/glog |
| |
| .play distsys/hello1.go |
| |
| Still guaranteed to read exactly one file. |
| |
| Import name space is decentralized. |
| |
| * Engineering: Program Rewrites |
| |
| $ gofmt -r 'glog.Infof -> glog.Errorf' hello1.go |
| package main |
| |
| import ( |
| "flag" |
| "github.com/golang/glog" |
| ) |
| |
| func main() { |
| flag.Set("logtostderr", "true") |
| glog.Errorf("hello, world") |
| } |
| $ |
| |
| * Engineering: Garbage Collection |
| |
| In C and C++, too much programming _and_ API design is about memory management. |
| |
| Go has garbage collection, only. |
| |
| Fundamental for interfaces: memory management details do not bifurcate otherwise-similar APIs. |
| |
| Fundamental for concurrency: too hard to track ownership otherwise. |
| |
| Of course, adds cost, latency, complexity in run time system. |
| |
| * Engineering: Garbage Collection |
| |
| Experience with Java: Uncontrollable cost, too much tuning. |
| |
| Go lets you limit allocation by controlling memory layout. |
| |
| Examples: |
| |
| type Ring struct { |
| R, W int |
| Data [512]byte |
| } |
| |
| type Point struct { |
| X, Y int |
| } |
| |
| type Rectangle struct { |
| Min, Max Point |
| } |
| |
| * Engineering: Garbage Collection |
| |
| Garbage collector implementation remains an active area of work and research. |
| |
| Design decision: Interior pointers are allowed, as are foreign pointers. |
| |
| - Cannot reuse Java GC algorithms directly. |
| - But gives _programmer_ more control over allocation. |
| |
| Current design: parallel mark-and-sweep. |
| With care to use memory wisely, works well in production. |
| |
| * Interfaces |
| |
| * Interfaces |
| |
| An interface defines a set of methods. |
| |
| package io |
| |
| type Writer interface { |
| Write(data []byte) (n int, err error) |
| } |
| |
| * Interfaces |
| |
| A type implements the interface by implementing the methods. |
| |
| package bytes |
| |
| type Buffer struct { |
| ... |
| } |
| |
| func (b *Buffer) Write(data []byte) (n int, err error) { |
| ... |
| } |
| |
| * Interfaces |
| |
| An implementation of an interface can be assigned to a variable of that interface type. |
| |
| package fmt |
| |
| func Fprintf(w io.Writer, format string, args ...interface{}) |
| |
| * Interfaces |
| |
| .play distsys/writebuffer.go /^func.main/+1,/^}/-1 |
| |
| * Interfaces |
| |
| Reader is the obvious counterpart. |
| |
| package io |
| |
| type Reader interface { |
| Read(data []byte) (n int, err error) |
| } |
| |
| func Copy(dst Writer, src Reader) (n int64, err error) |
| |
| * Interfaces |
| |
| .play distsys/writebuffer2.go /^func.main/+1,/^}/-1 |
| |
| * Interfaces |
| |
| Reader and Writer turn out to be very useful. |
| |
| Adapters |
| |
| package io |
| |
| func MultiWriter(writers ...Writer) Writer |
| MultiWriter creates a writer that duplicates its writes to all the |
| provided writers, similar to the Unix tee(1) command. |
| |
| Chaining |
| |
| package gzip // compress/gzip |
| |
| func NewWriter(w io.Writer) *Writer |
| |
| Also: buffered writers, encrypted writers, limited writers, HTTP responses. |
| |
| * Interfaces |
| |
| Networking: |
| |
| package net |
| |
| type Conn interface { |
| Read(b []byte) (n int, err error) |
| Write(b []byte) (n int, err error) |
| Close() error |
| |
| LocalAddr() Addr |
| RemoteAddr() Addr |
| |
| SetDeadline(t time.Time) error |
| SetReadDeadline(t time.Time) error |
| SetWriteDeadline(t time.Time) error |
| } |
| |
| func Dial(network, address string) (Conn, error) |
| |
| * Interfaces |
| |
| Networking example: |
| |
| .play distsys/finger.go /^func.finger/+1,/^}/-1 |
| |
| * Interfaces |
| |
| Networking client as adapter function: |
| |
| package smtp |
| |
| func NewClient(conn net.Conn, host string) (*Client, error) |
| |
| Other implementations of net.Conn: testing, SSL, ... |
| |
| * Interface Lessons |
| |
| Key advantages: |
| |
| - no dependence between interface and implementation |
| - expressive composition |
| - easy testing |
| - avoids overdesign, rigid hierarchy of inheritance-based OO |
| |
| The source of all generality in the Go language. |
| |
| * Concurrency |
| |
| * Concurrency vs Parallelism |
| |
| Concurrency is about dealing with lots of things at once. |
| |
| Parallelism is about doing lots of things at once. |
| |
| Concurrency is about structure, parallelism is about execution. |
| |
| Concurrency provides a way to structure a solution to solve a problem that may be parallelizable (or not). |
| |
| * Concurrency vs Parallelism |
| |
| Concurrent: mouse, keyboard, display, and disk drivers in operating system. |
| |
| Parallel: vector dot product, matrix multiply. |
| |
| Concurrency can enable parallelism but is useful on its own: modern programs must deal with many things at once. |
| |
| * Concurrency |
| |
| Go provides two important concepts: |
| |
| A goroutine is a thread of control within the program, with its own local variables and stack. Cheap, easy to create. |
| |
| A channel carries typed messages between goroutines. |
| |
| * Concurrency |
| |
| .play distsys/hello.go |
| |
| * Concurrency: CSP |
| |
| Channels adopted from Hoare's Communicating Sequential Processes. |
| |
| - Orthogonal to rest of language |
| - Can keep familiar model for computation |
| - Focus on _composition_ of regular code |
| |
| Go _enables_ simple, safe concurrent programming. |
| It doesn't _forbid_ bad programming. |
| |
| Caveat: not purely memory safe; sharing is legal. |
| Passing a pointer over a channel is idiomatic. |
| |
| Experience shows this is practical. |
| |
| * Concurrency |
| |
| Sequential network address resolution, given a work list: |
| |
| .play distsys/addr1.go /lookup/+1,/^}/-1 |
| |
| * Concurrency |
| |
| Parallel network address resolution, given a work list: |
| |
| .play distsys/addr2.go /lookup/+1,/^}/-1 |
| |
| * Concurrency |
| |
| Aside: can abstract this pattern. |
| |
| .play distsys/addr3.go /lookup/+1,/^}/-1 |
| |
| * Concurrency |
| |
| Aside: can abstract this pattern further (hypothetical): |
| |
| var par ParallelDo |
| |
| for _, w := range worklist { |
| w := w // copy iteration variable |
| par.Do(func() { |
| w.addrs, w.err = net.LookupHost(w.host) |
| }) |
| ) |
| |
| par.Wait() |
| |
| But it's still useful to be able to construct alternate patterns. |
| |
| * Concurrency |
| |
| Bounded parallelism: |
| |
| .play distsys/addr4.go /lookup/+1,/^}/-1 |
| |
| * Concurrency |
| |
| Bounded parallelism, 2: |
| |
| .play distsys/addr5.go /lookup/+1,/^}/-1 |
| |
| * Concurrency: |
| |
| Aside: can abstract (still hypothetical): |
| |
| par.Limit(10) |
| |
| for _, w := range work { |
| w := w // copy iteration variable |
| par.Do(func() { |
| w.addrs, w.err = net.LookupHost(w.host) |
| }) |
| ) |
| |
| par.Wait() |
| |
| * Concurrency |
| |
| Example: replicated storage with read and write quorums. |
| |
| const ( |
| F = 2 |
| N = 5 // >= 2F + 1 |
| ReadQuorum = F + 1 |
| WriteQuorum = N - F |
| ) |
| |
| * Concurrency |
| |
| Replicated write, returning after enough writes have succeeded. |
| |
| .play distsys/replwrite.go /^func.Write/+2,/if.delay/-2/ |
| |
| * Concurrency |
| |
| Replicated read, returning after enough reads have been gathered. |
| |
| .play distsys/replread.go /^func.Read/+2,/if.delay/-2/ |
| |
| * Concurrency |
| |
| Select allows choosing between multiple channel operations. |
| |
| Example, chat program: |
| |
| for { |
| select { |
| case event := <-ui: |
| // process user interface event |
| case msg := <-server: |
| // process server message |
| case t := <-tick: |
| // time has elapsed |
| } |
| } |
| |
| * Concurrency Lessons |
| |
| - Key feature for building distributed systems. |
| - Supported by closures and garbage collection. |
| - Message passing inside program, also outside program. |
| |
| Most important: |
| |
| - Do not communicate by sharing memory. |
| - Instead, share memory by communicating. |
| |
| * Production Use |
| |
| vitess/vtocc, MySQL query balancer |
| |
| - serves all of YouTube's MySQL queries |
| - months of crash-free and leak-free operation |
| |
| groupcache |
| |
| - distributed in-memory immutable key-value cache |
| - used by (parts of) dl.google.com, Blogger, Google Code, Google Fiber, production monitoring systems |
| |
| * More information and related talks |
| |
| http://golang.org/ |
| |
| rsc@golang.org |
| |
| Videos: |
| |
| - Concurrency is not parallelism [[http://vimeo.com/49718712][video]] |
| - Go concurrency patterns [[http://www.youtube.com/watch?v=f6kdp27TYZs][video]] |
| - Advanced Go concurrency patterns [[http://www.youtube.com/watch?v=QDDwwePbDtw][video]] |
| |
| Questions? |