Cancelation, Context, and Plumbing
GothamGo 2014
Sameer Ajmani
* Video
This talk was presented at GothamGo in New York City, November 2014.
.link Watch the talk on Vimeo
* Introduction
In Go servers, each incoming request is handled in its own goroutine.
Handler code needs access to request-specific values:
- security credentials
- request deadline
- operation priority
When the request completes or times out, its work should be canceled.
* Cancelation
Abandon work when the caller no longer needs the result.
- user navigates away, closes connection
- operation exceeds its deadline
- when using hedged requests, cancel the laggards
Efficiently canceling unneeded work saves resources.
* Cancelation is advisory
Cancelation does not stop execution or trigger panics.
Cancelation informs code that its work is no longer needed.
Code checks for cancelation and decides what to do:
shut down, clean up, return errors.
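A minimal sketch of advisory cancelation, using a plain `done` channel rather than any particular library. The names `process` and `demo` are illustrative assumptions, not part of the talk. The worker is never interrupted; it checks the signal between units of work and decides to return early.

```go
package main

import "fmt"

// process adds up items, checking done before each unit of work.
// It returns the (possibly partial) total and whether it ran to completion.
// Nothing forces it to stop: it chooses to return when it sees the signal.
func process(done <-chan struct{}, items []int) (int, bool) {
	total := 0
	for _, it := range items {
		select {
		case <-done:
			return total, false // canceled: abandon the remaining work
		default:
		}
		total += it
	}
	return total, true
}

// demo runs process once with no cancelation and once with the signal
// already delivered.
func demo() (int, bool, int, bool) {
	open := make(chan struct{})
	closed := make(chan struct{})
	close(closed)
	t1, ok1 := process(open, []int{1, 2, 3})
	t2, ok2 := process(closed, []int{1, 2, 3})
	return t1, ok1, t2, ok2
}

func main() {
	fmt.Println(demo())
}
```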
* Cancelation is transitive
.image gotham-context/transitive.svg
* Cancelation affects all APIs on the request path
Network protocols support cancelation.
- HTTP: close the connection
- RPC: send a control message
APIs above network need cancelation, too.
- Database clients
- Network file system clients
- Cloud service clients
And all the layers atop those, up to the UI.
*Goal:* provide a uniform cancelation API that works across package boundaries.
* Cancelation APIs
Many Go APIs support cancelation and deadlines already.
Go APIs are synchronous, so cancelation comes from another goroutine.
Method on the connection or client object:
	// goroutine #1
	result, err := conn.Do(req)
	// goroutine #2
	conn.Cancel(req)
Method on the request object:
	// goroutine #1
	result, err := conn.Do(req)
	// goroutine #2
	req.Cancel()
* Cancelation APIs (continued)
Method on the pending result object:
	// goroutine #1
	pending := conn.Start(req)
	result, err := pending.Result()
	// goroutine #2
	pending.Cancel()
Different cancelation APIs in each package are a headache.
We need one that's independent of package or transport:
	// goroutine #1
	result, err := conn.Do(x, req)
	// goroutine #2
	x.Cancel()
* Context
A `Context` carries a cancelation signal and request-scoped values to all functions running on behalf of the same task. It's safe for concurrent access.
.code gotham-context/interface.go /type Context/,/^}/
*Idiom:* pass `ctx` as the first argument to a function.
	import ""
	// ReadFile reads file name and returns its contents.
	// If ctx.Done is closed, ReadFile returns ctx.Err immediately.
	func ReadFile(ctx context.Context, name string) ([]byte, error)
Examples and discussion in [[][]].
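One possible way to honor the `ReadFile` contract above, as a sketch only. It assumes the standard library `context` package and `os.ReadFile`; the helper `readCanceled` and the file name are made up for illustration. The underlying read is not interrupted, only abandoned, so a real implementation would want a cancelable I/O layer underneath.

```go
package main

import (
	"context"
	"fmt"
	"os"
)

// ReadFile reads the named file and returns its contents.
// If ctx is done before the read finishes, it returns ctx.Err().
// Sketch: the background read keeps running; its result is discarded.
func ReadFile(ctx context.Context, name string) ([]byte, error) {
	if err := ctx.Err(); err != nil {
		return nil, err // already canceled: do not start any work
	}
	type result struct {
		data []byte
		err  error
	}
	ch := make(chan result, 1) // buffered so the reader goroutine never blocks
	go func() {
		data, err := os.ReadFile(name)
		ch <- result{data, err}
	}()
	select {
	case <-ctx.Done():
		return nil, ctx.Err()
	case r := <-ch:
		return r.data, r.err
	}
}

// readCanceled calls ReadFile with an already-canceled Context and
// reports whether the error is context.Canceled.
func readCanceled() bool {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	_, err := ReadFile(ctx, "hypothetical.txt")
	return err == context.Canceled
}

func main() {
	fmt.Println("canceled read returned context.Canceled:", readCanceled())
}
```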
* Contexts are hierarchical
`Context` has no `Cancel` method; obtain a cancelable `Context` using `WithCancel`:
.code gotham-context/interface.go /WithCancel /,/func WithCancel/
Passing a `Context` to a function does not pass the ability to cancel that `Context`.
	// goroutine #1
	ctx, cancel := context.WithCancel(parent)
	data, err := ReadFile(ctx, name)
	// goroutine #2
	cancel()
Contexts form a tree, any subtree of which can be canceled.
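A small sketch of subtree cancelation, assuming the standard library `context` package. The tree shape and the helpers `isDone` and `subtreeDemo` are illustrative. Canceling `a` also cancels its child `a1`, while the root and the sibling `b` are unaffected.

```go
package main

import (
	"context"
	"fmt"
)

// isDone reports whether ctx has been canceled, without blocking.
func isDone(ctx context.Context) bool {
	select {
	case <-ctx.Done():
		return true
	default:
		return false
	}
}

// subtreeDemo builds the tree root -> {a -> a1, b}, cancels a,
// and reports which of the four contexts are done.
func subtreeDemo() (rootDone, aDone, a1Done, bDone bool) {
	root, cancelRoot := context.WithCancel(context.Background())
	defer cancelRoot()
	a, cancelA := context.WithCancel(root)
	a1, cancelA1 := context.WithCancel(a)
	defer cancelA1()
	b, cancelB := context.WithCancel(root)
	defer cancelB()

	cancelA() // cancels a and its descendant a1 only
	return isDone(root), isDone(a), isDone(a1), isDone(b)
}

func main() {
	fmt.Println(subtreeDemo())
}
```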
* Why does Done return a channel?
Closing a channel works well as a broadcast signal.
Any number of goroutines can `select` on `<-ctx.Done()`.
Examples and discussion in [[][]].
Using `close` requires care.
- closing a nil channel panics
- closing a closed channel panics
`Done` returns a receive-only channel, so callers cannot close it themselves. It is closed only by the `cancel` function returned by `WithCancel`, which ensures the channel is closed exactly once.
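A sketch of the broadcast behavior, assuming the standard library `context` package. The function `broadcast` is illustrative. Every waiting goroutine wakes on a single `cancel` call, and calling `cancel` a second time is harmless.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// broadcast starts n goroutines that block on ctx.Done, cancels the
// Context, and returns how many goroutines observed the signal.
func broadcast(n int) int {
	ctx, cancel := context.WithCancel(context.Background())
	var (
		wg    sync.WaitGroup
		mu    sync.Mutex
		woken int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			<-ctx.Done() // every receiver is released by the same close
			mu.Lock()
			woken++
			mu.Unlock()
		}()
	}
	cancel()
	cancel() // a repeated cancel is a no-op, not a panic
	wg.Wait()
	return woken
}

func main() {
	fmt.Println("goroutines woken:", broadcast(5))
}
```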
* Context values
Contexts carry request-scoped values across API boundaries.
- deadline
- cancelation signal
- security credentials
- distributed trace IDs
- operation priority
- network QoS label
RPC clients encode `Context` values onto the wire.
RPC servers decode them into a new `Context` for the handler function.
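A sketch of carrying one request-scoped value, assuming the standard library `context` package. The key type `traceKey`, the helpers `WithTraceID` and `TraceID`, and the ID `abc123` are made up for illustration. Using an unexported key type keeps other packages from colliding with the key.

```go
package main

import (
	"context"
	"fmt"
)

// traceKey is an unexported key type, so no other package can
// accidentally read or overwrite this value.
type traceKey struct{}

// WithTraceID returns a child Context carrying the given trace ID.
func WithTraceID(ctx context.Context, id string) context.Context {
	return context.WithValue(ctx, traceKey{}, id)
}

// TraceID extracts the trace ID from ctx, if one is present.
func TraceID(ctx context.Context) (string, bool) {
	id, ok := ctx.Value(traceKey{}).(string)
	return id, ok
}

// handle stands in for code deep in the call stack.
func handle(ctx context.Context) string {
	id, ok := TraceID(ctx)
	if !ok {
		return "no trace"
	}
	return "trace=" + id
}

// demo calls handle without and with a trace ID attached.
func demo() (string, string) {
	base := context.Background()
	return handle(base), handle(WithTraceID(base, "abc123"))
}

func main() {
	fmt.Println(demo())
}
```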
* Replicated Search
Example from [[][Go Concurrency Patterns]].
.code gotham-context/first.go /START1/,/STOP1/
Remaining searches may continue running after First returns.
* Cancelable Search
.code gotham-context/first-context.go /START1/,/STOP1/
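For readers without the included file, here is a rough sketch of what a cancelable `First` might look like. It is not the code from the slide. It assumes the standard library `context` package; the `Search` type, `fakeReplica`, and the delays are invented for illustration. The deferred `cancel` tells the laggards to stop once the first result arrives.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Search is a hypothetical replica query that honors ctx.
type Search func(ctx context.Context, query string) (string, error)

// fakeReplica returns a Search that answers after delay unless it is
// canceled first.
func fakeReplica(name string, delay time.Duration) Search {
	return func(ctx context.Context, query string) (string, error) {
		select {
		case <-time.After(delay):
			return name + ": " + query, nil
		case <-ctx.Done():
			return "", ctx.Err()
		}
	}
}

// First queries all replicas concurrently and returns the first result
// to arrive. Returning runs the deferred cancel, which signals the
// remaining searches to stop.
func First(ctx context.Context, query string, replicas ...Search) (string, error) {
	ctx, cancel := context.WithCancel(ctx)
	defer cancel()
	type result struct {
		s   string
		err error
	}
	c := make(chan result, len(replicas)) // buffered: laggards never block
	for _, r := range replicas {
		go func(r Search) {
			s, err := r(ctx, query)
			c <- result{s, err}
		}(r)
	}
	select {
	case res := <-c:
		return res.s, res.err
	case <-ctx.Done():
		return "", ctx.Err()
	}
}

// demo races a fast replica against a slow one.
func demo() string {
	s, _ := First(context.Background(), "golang",
		fakeReplica("fast", 10*time.Millisecond),
		fakeReplica("slow", 2*time.Second))
	return s
}

func main() {
	fmt.Println(demo())
}
```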
* Context plumbing
*Goal:* pass a `Context` parameter from each inbound RPC at a server through the call stack to each outgoing RPC.
.code gotham-context/before.go /START/,/END/
* Context plumbing (after)
.code gotham-context/after.go /START/,/END/
* Problem: Existing and future code
Google has millions of lines of Go code.
We've retrofitted the internal RPC and distributed file system APIs to take a Context.
Lots more to do, growing every day.
* Why not use (something like) thread local storage?
C++ and Java pass request state in thread-local storage.
Requires no API changes, but ...
requires custom thread and callback libraries.
Mostly works, except when it doesn't. Failures are hard to debug.
Serious consequences if credential-passing bugs affect user privacy.
"Goroutine-local storage" doesn't exist, and even if it did,
request processing may flow between goroutines via channels.
We won't sacrifice clarity for convenience.
* In Go, pass Context explicitly
Easy to tell when a Context passes between functions, goroutines, and processes.
Invest up front to make the system easier to maintain:
- update relevant functions to accept a `Context`
- update function calls to provide a `Context`
- update interface methods and implementations
Go's awesome tools can help.
* Automated refactoring
Pass `context.TODO()` to outbound RPCs.
`context.TODO()` is a sentinel for static analysis tools. Use it wherever a `Context` is needed but there isn't one available.
For each function `F(x)` whose body contains `context.TODO()`,
- add a `Context` parameter to `F`
- update callers to use `F(context.TODO(),`x)`
- if the caller has a `Context` available, pass it to `F` instead
Repeat until `context.TODO()` is gone.
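One round of the loop above might look like this sketch. It assumes the standard library `context` package; `callRPC`, `lookup`, `handler`, and `legacy` are hypothetical names. `lookup` used to call the RPC with `context.TODO()`, so it gained a `ctx` parameter; a caller that has a `Context` passes it, and a caller that does not becomes the next function to update.

```go
package main

import (
	"context"
	"fmt"
)

// callRPC stands in for an outbound RPC that already takes a Context.
func callRPC(ctx context.Context, arg string) string {
	if ctx.Err() != nil {
		return "canceled"
	}
	return "ok:" + arg
}

// Before this round, lookup looked like:
//
//	func lookup(x string) string { return callRPC(context.TODO(), x) }
//
// Its body contained context.TODO(), so it now accepts a Context.
func lookup(ctx context.Context, x string) string {
	return callRPC(ctx, x)
}

// handler has a Context available (say, from an inbound RPC), so it
// passes that Context down.
func handler(ctx context.Context, x string) string {
	return lookup(ctx, x)
}

// legacy has no Context yet. It passes context.TODO(), which marks it
// as the next function to receive a Context parameter.
func legacy(x string) string {
	return lookup(context.TODO(), x)
}

// demo shows that legacy still works and that handler now propagates
// cancelation all the way to the RPC.
func demo() (string, string) {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	return legacy("a"), handler(ctx, "b")
}

func main() {
	fmt.Println(demo())
}
```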
* Finding relevant functions
The [[][]] tool constructs the call graph of a Go program.
It uses whole-program pointer analysis to find dynamic calls (via interfaces or function values).
Find all functions on call paths from `Context` _suppliers_ (inbound RPCs) to `Context` _consumers_ (`context.TODO`).
* Updating function calls
To change all `F(x)` to `F(context.TODO(),`x)`:
- define `FContext(ctx,`x)`
- `F(x)` → `FContext(context.TODO(),`x)`
- change `F(x)` to `F(ctx,`x)`
- `FContext(context.TODO(),`x)` → `F(context.TODO(),`x)`
- remove `FContext(ctx,`x)`
* gofmt -r
Works well for simple replacements:
	gofmt -r 'pkg.F(a) -> pkg.FContext(context.TODO(), a)'
But this is too imprecise for methods. There may be many methods named M:
	gofmt -r 'x.M(y) -> x.MContext(context.TODO(), y)'
We want to restrict the transformation to specific method signatures.
* The eg tool
The [[][]] tool performs precise example-based refactoring.
The `before` expression specifies a pattern and the `after` expression its replacement.
To replace `x.M(y)` with `x.MContext(context.TODO(),`y)`:
.code gotham-context/eg.go
* Dealing with interfaces
We need to update dynamic calls to `x.M(y)`.
If `M` is called via interface `I`, then `I.M` also needs to change. The eg tool can update call sites with receiver type `I`.
When we change `I`, we need to update all of its implementations.
Find types assignable to `I` using [[][]].
More to do here.
* What about the standard library?
The Go 1.0 compatibility guarantee means we will not break existing code.
Interfaces like `io.Reader` and `io.Writer` are widely used.
For Google files, we used a currying approach:
	f, err := file.Open(ctx, "/gfs/cell/path")
	fio := f.IO(ctx) // returns an io.ReadWriteCloser that passes ctx
	data, err := ioutil.ReadAll(fio)
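The currying idea can be sketched as follows. This is not Google's file package: `File`, `boundFile`, and the in-memory reader are stand-ins invented for illustration, and the sketch covers only `io.Reader` rather than a full `io.ReadWriteCloser`. The `IO` method binds a `Context` once and returns a value that satisfies the unchanged standard interface.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"strings"
)

// File is a hypothetical file type whose Read method takes a Context.
type File struct{ r io.Reader }

// Read reads from the file unless ctx is already done.
func (f *File) Read(ctx context.Context, p []byte) (int, error) {
	if err := ctx.Err(); err != nil {
		return 0, err
	}
	return f.r.Read(p)
}

// IO binds ctx to f and returns a plain io.Reader that passes ctx on
// every Read call.
func (f *File) IO(ctx context.Context) io.Reader {
	return &boundFile{ctx: ctx, f: f}
}

// boundFile adapts File to io.Reader by remembering the Context.
type boundFile struct {
	ctx context.Context
	f   *File
}

func (b *boundFile) Read(p []byte) (int, error) { return b.f.Read(b.ctx, p) }

// demo reads once with a live Context and once with a canceled one.
// It returns the data from the first read and whether the second read
// failed with context.Canceled.
func demo() (string, bool) {
	f := &File{r: strings.NewReader("hello")}
	data, err := io.ReadAll(f.IO(context.Background()))
	if err != nil {
		return "", false
	}
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	g := &File{r: strings.NewReader("unread")}
	_, cerr := io.ReadAll(g.IO(ctx))
	return string(data), cerr == context.Canceled
}

func main() {
	fmt.Println(demo())
}
```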
For versioned public packages, add `Context` parameters in a new API version and provide `eg` templates to insert `context.TODO()`.
More to do here.
* Conclusion
Cancelation needs a uniform API across package boundaries.
Retrofitting code is hard, but Go is tool-friendly.
New code should use `Context`.
- [[][]] - package
- [[][]] - blog post
- [[][]] - eg tool