| Cancellation, Context, and Plumbing |
| GothamGo 2014 |
| |
| Sameer Ajmani |
| sameer@golang.org |
| |
| * Video |
| |
| This talk was presented at GothamGo in New York City, November 2014. |
| |
| .link http://vimeo.com/115309491 Watch the talk on Vimeo |
| |
| * Introduction |
| |
| In Go servers, each incoming request is handled in its own goroutine. |
| |
| Handler code needs access to request-specific values: |
| |
| - security credentials |
| - request deadline |
| - operation priority |
| |
| When the request completes or times out, its work should be canceled. |
| |
| * Cancellation |
| |
| Abandon work when the caller no longer needs the result. |
| |
| - user navigates away, closes connection |
| - operation exceeds its deadline |
| - when using hedged requests, cancel the laggards |
| |
| Efficiently canceling unneeded work saves resources. |
| |
| * Cancellation is advisory |
| |
| Cancellation does not stop execution or trigger panics. |
| |
| Cancellation informs code that its work is no longer needed. |
| |
| Code checks for cancellation and decides what to do: |
| shut down, clean up, return errors. |
| |
| * Cancellation is transitive |
| |
| .image gotham-context/transitive.svg |
| |
| * Cancellation affects all APIs on the request path |
| |
| Network protocols support cancellation. |
| |
| - HTTP: close the connection |
| - RPC: send a control message |
| |
| APIs above network need cancellation, too. |
| |
| - Database clients |
| - Network file system clients |
| - Cloud service clients |
| |
| And all the layers atop those, up to the UI. |
| |
| *Goal:* provide a uniform cancellation API that works across package boundaries. |
| |
| * Cancellation APIs |
| |
| Many Go APIs support cancellation and deadlines already. |
| |
| Go APIs are synchronous, so cancellation comes from another goroutine. |
| |
| Method on the connection or client object: |
| |
| // goroutine #1 |
| result, err := conn.Do(req) |
| |
| // goroutine #2 |
| conn.Cancel(req) |
| |
| Method on the request object: |
| |
| // goroutine #1 |
| result, err := conn.Do(req) |
| |
| // goroutine #2 |
| req.Cancel() |
| |
| * Cancellation APIs (continued) |
| |
| Method on the pending result object: |
| |
| // goroutine #1 |
| pending := conn.Start(req) |
| ... |
| result, err := pending.Result() |
| |
| // goroutine #2 |
| pending.Cancel() |
| |
| |
| Different cancellation APIs in each package are a headache. |
| |
| We need one that's independent of package or transport: |
| |
| // goroutine #1 |
| result, err := conn.Do(x, req) |
| |
| // goroutine #2 |
| x.Cancel() |
| |
| * Context |
| |
| A `Context` carries a cancellation signal and request-scoped values to all functions running on behalf of the same task. It's safe for concurrent access. |
| |
| .code gotham-context/interface.go /type Context/,/^}/ |
| |
| *Idiom:* pass `ctx` as the first argument to a function. |
| |
| import "golang.org/x/net/context" |
| |
| // ReadFile reads file name and returns its contents. |
| // If ctx.Done is closed, ReadFile returns ctx.Err immediately. |
| func ReadFile(ctx context.Context, name string) ([]byte, error) |
| |
| Examples and discussion in [[http://blog.golang.org/context][blog.golang.org/context]]. |
| |
| * Contexts are hierarchical |
| |
| `Context` has no `Cancel` method; obtain a cancelable `Context` using `WithCancel`: |
| |
| .code gotham-context/interface.go /WithCancel /,/func WithCancel/ |
| |
| Passing a `Context` to a function does not pass the ability to cancel that `Context`. |
| |
| // goroutine #1 |
| ctx, cancel := context.WithCancel(parent) |
| ... |
| data, err := ReadFile(ctx, name) |
| |
| // goroutine #2 |
| cancel() |
| |
| Contexts form a tree, any subtree of which can be canceled. |
| |
| * Why does Done return a channel? |
| |
| Closing a channel works well as a broadcast signal. |
| |
| _After_the_last_value_has_been_received_from_a_closed_channel_c,_any_receive_from_c_will_succeed_without_blocking,_returning_the_zero_value_for_the_channel_element._ |
| |
| Any number of goroutines can `select` on `<-ctx.Done()`. |
| |
| Examples and discussion in in [[http://blog.golang.org/pipelines][blog.golang.org/pipelines]]. |
| |
| Using `close` requires care. |
| |
| - closing a nil channel panics |
| - closing a closed channel panics |
| |
| `Done` returns a receive-only channel that can only be canceled using the `cancel` function returned by `WithCancel`. It ensures the channel is closed exactly once. |
| |
| * Context values |
| |
| Contexts carry request-scoped values across API boundaries. |
| |
| - deadline |
| - cancellation signal |
| - security credentials |
| - distributed trace IDs |
| - operation priority |
| - network QoS label |
| |
| RPC clients encode `Context` values onto the wire. |
| |
| RPC servers decode them into a new `Context` for the handler function. |
| |
| * Replicated Search |
| |
| Example from [[https://go.dev/talks/2012/concurrency.slide][Go Concurrency Patterns]]. |
| |
| .code gotham-context/first.go /START1/,/STOP1/ |
| |
| Remaining searches may continue running after First returns. |
| |
| * Cancelable Search |
| |
| .code gotham-context/first-context.go /START1/,/STOP1/ |
| |
| * Context plumbing |
| |
| *Goal:* pass a `Context` parameter from each inbound RPC at a server through the call stack to each outgoing RPC. |
| |
| .code gotham-context/before.go /START/,/END/ |
| |
| * Context plumbing (after) |
| |
| .code gotham-context/after.go /START/,/END/ |
| |
| * Problem: Existing and future code |
| |
| Google has millions of lines of Go code. |
| |
| We've retrofitted the internal RPC and distributed file system APIs to take a Context. |
| |
| Lots more to do, growing every day. |
| |
| * Why not use (something like) thread local storage? |
| |
| C++ and Java pass request state in thread-local storage. |
| |
| Requires no API changes, but ... |
| requires custom thread and callback libraries. |
| |
| Mostly works, except when it doesn't. Failures are hard to debug. |
| |
| Serious consequences if credential-passing bugs affect user privacy. |
| |
| "Goroutine-local storage" doesn't exist, and even if it did, |
| request processing may flow between goroutines via channels. |
| |
| We won't sacrifice clarity for convenience. |
| |
| * In Go, pass Context explicitly |
| |
| Easy to tell when a Context passes between functions, goroutines, and processes. |
| |
| Invest up front to make the system easier to maintain: |
| |
| - update relevant functions to accept a `Context` |
| - update function calls to provide a `Context` |
| - update interface methods and implementations |
| |
| Go's awesome tools can help. |
| |
| * Automated refactoring |
| |
| *Initial*State:* |
| |
| Pass `context.TODO()` to outbound RPCs. |
| |
| `context.TODO()` is a sentinel for static analysis tools. Use it wherever a `Context` is needed but there isn't one available. |
| |
| *Iteration:* |
| |
| For each function `F(x)` whose body contains `context.TODO()`, |
| |
| - add a `Context` parameter to `F` |
| - update callers to use `F(context.TODO(),`x)` |
| - if the caller has a `Context` available, pass it to `F` instead |
| |
| Repeat until `context.TODO()` is gone. |
| |
| * Finding relevant functions |
| |
| The [[http://godoc.org/golang.org/x/tools/cmd/callgraph][golang.org/x/tools/cmd/callgraph]] tool constructs the call graph of a Go program. |
| |
| It uses whole-program pointer analysis to find dynamic calls (via interfaces or function values). |
| |
| *For*context*plumbing:* |
| |
| Find all functions on call paths from `Context` _suppliers_ (inbound RPCs) to `Context` _consumers_ (`context.TODO`). |
| |
| * Updating function calls |
| |
| To change add all `F(x)` to `F(context.TODO(),`x)`: |
| |
| - define `FContext(ctx,`x)` |
| - `F(x)` → `FContext(context.TODO(),`x)` |
| - change `F(x)` to `F(ctx,`x)` |
| - `FContext(context.TODO(),`x)` → `F(context.TODO(),`x)` |
| - remove `FContext(ctx,`x)` |
| |
| * gofmt -r |
| |
| Works well for simple replacements: |
| |
| gofmt -r 'pkg.F(a) -> pkg.FContext(context.TODO(), a)' |
| |
| But this is too imprecise for methods. There may be many methods named M: |
| |
| gofmt -r 'x.M(y) -> x.MContext(context.TODO(), y)' |
| |
| We want to restrict the transformation to specific method signatures. |
| |
| * The eg tool |
| |
| The [[http://godoc.org/golang.org/x/tools/cmd/eg][golang.org/x/tools/cmd/eg]] tool performs precise example-based refactoring. |
| |
| The `before` expression specifies a pattern and the `after` expression its replacement. |
| |
| To replace `x.M(y)` with `x.MContext(context.TODO(),`y)`: |
| |
| .code gotham-context/eg.go |
| |
| * Dealing with interfaces |
| |
| We need to update dynamic calls to `x.M(y)`. |
| |
| If `M` called via interface `I`, then `I.M` also needs to change. The eg tool can update call sites with receiver type `I`. |
| |
| When we change `I`, we need to update all of its implementations. |
| |
| Find types assignable to `I` using [[http://godoc.org/golang.org/x/tools/go/types][golang.org/x/tools/go/types]]. |
| |
| More to do here. |
| |
| * What about the standard library? |
| |
| The Go 1.0 compatibility guarantee means we will not break existing code. |
| |
| Interfaces like `io.Reader` and `io.Writer` are widely used. |
| |
| For Google files, used a currying approach: |
| |
| f, err := file.Open(ctx, "/gfs/cell/path") |
| ... |
| fio := f.IO(ctx) // returns an io.ReadWriteCloser that passes ctx |
| data, err := ioutil.ReadAll(fio) |
| |
| For versioned public packages, add `Context` parameters in a new API version and provide `eg` templates to insert `context.TODO()`. |
| |
| More to do here. |
| |
| * Conclusion |
| |
| Cancellation needs a uniform API across package boundaries. |
| |
| Retrofitting code is hard, but Go is tool-friendly. |
| |
| New code should use `Context`. |
| |
| Links: |
| |
| - [[http://golang.org/x/net/context][golang.org/x/net/context]] - package |
| - [[http://blog.golang.org/context][blog.golang.org/context]] - blog post |
| - [[http://golang.org/x/tools/cmd/eg][golang.org/x/tools/cmd/eg]] - eg tool |