Cancellation, Context, and Plumbing
GothamGo 2014
Sameer Ajmani
sameer@golang.org
* Video
This talk was presented at GothamGo in New York City, November 2014.
.link http://vimeo.com/115309491 Watch the talk on Vimeo
* Introduction
In Go servers, each incoming request is handled in its own goroutine.
Handler code needs access to request-specific values:
- security credentials
- request deadline
- operation priority
When the request completes or times out, its work should be canceled.
* Cancellation
Abandon work when the caller no longer needs the result.
- user navigates away, closes connection
- operation exceeds its deadline
- when using hedged requests, cancel the laggards
Efficiently canceling unneeded work saves resources.
* Cancellation is advisory
Cancellation does not stop execution or trigger panics.
Cancellation informs code that its work is no longer needed.
Code checks for cancellation and decides what to do:
shut down, clean up, return errors.
* Cancellation is transitive
.image gotham-context/transitive.svg
* Cancellation affects all APIs on the request path
Network protocols support cancellation.
- HTTP: close the connection
- RPC: send a control message
APIs above the network need cancellation, too.
- Database clients
- Network file system clients
- Cloud service clients
And all the layers atop those, up to the UI.
*Goal:* provide a uniform cancellation API that works across package boundaries.
* Cancellation APIs
Many Go APIs support cancellation and deadlines already.
Go APIs are synchronous, so cancellation comes from another goroutine.
Method on the connection or client object:
	// goroutine #1
	result, err := conn.Do(req)
	// goroutine #2
	conn.Cancel(req)
Method on the request object:
	// goroutine #1
	result, err := conn.Do(req)
	// goroutine #2
	req.Cancel()
* Cancellation APIs (continued)
Method on the pending result object:
	// goroutine #1
	pending := conn.Start(req)
	...
	result, err := pending.Result()
	// goroutine #2
	pending.Cancel()
Different cancellation APIs in each package are a headache.
We need one that's independent of package or transport:
	// goroutine #1
	result, err := conn.Do(x, req)
	// goroutine #2
	x.Cancel()
* Context
A `Context` carries a cancellation signal and request-scoped values to all functions running on behalf of the same task. It's safe for concurrent access.
.code gotham-context/interface.go /type Context/,/^}/
*Idiom:* pass `ctx` as the first argument to a function.
	import "golang.org/x/net/context"
	// ReadFile reads file name and returns its contents.
	// If ctx.Done is closed, ReadFile returns ctx.Err immediately.
	func ReadFile(ctx context.Context, name string) ([]byte, error)
Examples and discussion in [[http://blog.golang.org/context][blog.golang.org/context]].
* Contexts are hierarchical
`Context` has no `Cancel` method; obtain a cancelable `Context` using `WithCancel`:
.code gotham-context/interface.go /WithCancel /,/func WithCancel/
Passing a `Context` to a function does not pass the ability to cancel that `Context`.
	// goroutine #1
	ctx, cancel := context.WithCancel(parent)
	...
	data, err := ReadFile(ctx, name)
	// goroutine #2
	cancel()
Contexts form a tree, any subtree of which can be canceled.
* Why does Done return a channel?
Closing a channel works well as a broadcast signal.
_After_the_last_value_has_been_received_from_a_closed_channel_c,_any_receive_from_c_will_succeed_without_blocking,_returning_the_zero_value_for_the_channel_element._
Any number of goroutines can `select` on `<-ctx.Done()`.
Examples and discussion in [[http://blog.golang.org/pipelines][blog.golang.org/pipelines]].
Using `close` requires care.
- closing a nil channel panics
- closing a closed channel panics
`Done` returns a receive-only channel, so callers cannot close it themselves. It is closed only through the `cancel` function returned by `WithCancel`, which ensures the channel is closed exactly once.
* Context values
Contexts carry request-scoped values across API boundaries.
- deadline
- cancellation signal
- security credentials
- distributed trace IDs
- operation priority
- network QoS label
RPC clients encode `Context` values onto the wire.
RPC servers decode them into a new `Context` for the handler function.
* Replicated Search
Example from [[https://go.dev/talks/2012/concurrency.slide][Go Concurrency Patterns]].
.code gotham-context/first.go /START1/,/STOP1/
Remaining searches may continue running after First returns.
* Cancelable Search
.code gotham-context/first-context.go /START1/,/STOP1/
* Context plumbing
*Goal:* pass a `Context` parameter from each inbound RPC at a server through the call stack to each outgoing RPC.
.code gotham-context/before.go /START/,/END/
* Context plumbing (after)
.code gotham-context/after.go /START/,/END/
* Problem: Existing and future code
Google has millions of lines of Go code.
We've retrofitted the internal RPC and distributed file system APIs to take a Context.
Lots more to do, growing every day.
* Why not use (something like) thread local storage?
C++ and Java pass request state in thread-local storage.
Requires no API changes, but ...
requires custom thread and callback libraries.
Mostly works, except when it doesn't. Failures are hard to debug.
Serious consequences if credential-passing bugs affect user privacy.
"Goroutine-local storage" doesn't exist, and even if it did,
request processing may flow between goroutines via channels.
We won't sacrifice clarity for convenience.
* In Go, pass Context explicitly
Easy to tell when a Context passes between functions, goroutines, and processes.
Invest up front to make the system easier to maintain:
- update relevant functions to accept a `Context`
- update function calls to provide a `Context`
- update interface methods and implementations
Go's awesome tools can help.
* Automated refactoring
*Initial*State:*
Pass `context.TODO()` to outbound RPCs.
`context.TODO()` is a sentinel for static analysis tools. Use it wherever a `Context` is needed but there isn't one available.
*Iteration:*
For each function `F(x)` whose body contains `context.TODO()`,
- add a `Context` parameter to `F`
- update callers to use `F(context.TODO(),`x)`
- if the caller has a `Context` available, pass it to `F` instead
Repeat until `context.TODO()` is gone.
* Finding relevant functions
The [[http://godoc.org/golang.org/x/tools/cmd/callgraph][golang.org/x/tools/cmd/callgraph]] tool constructs the call graph of a Go program.
It uses whole-program pointer analysis to find dynamic calls (via interfaces or function values).
*For*context*plumbing:*
Find all functions on call paths from `Context` _suppliers_ (inbound RPCs) to `Context` _consumers_ (`context.TODO`).
* Updating function calls
To change all `F(x)` to `F(context.TODO(),`x)`:
- define `FContext(ctx,`x)`
- `F(x)` → `FContext(context.TODO(),`x)`
- change `F(x)` to `F(ctx,`x)`
- `FContext(context.TODO(),`x)` → `F(context.TODO(),`x)`
- remove `FContext(ctx,`x)`
* gofmt -r
Works well for simple replacements:
	gofmt -r 'pkg.F(a) -> pkg.FContext(context.TODO(), a)'
But this is too imprecise for methods. There may be many methods named M:
	gofmt -r 'x.M(y) -> x.MContext(context.TODO(), y)'
We want to restrict the transformation to specific method signatures.
* The eg tool
The [[http://godoc.org/golang.org/x/tools/cmd/eg][golang.org/x/tools/cmd/eg]] tool performs precise example-based refactoring.
The `before` expression specifies a pattern and the `after` expression its replacement.
To replace `x.M(y)` with `x.MContext(context.TODO(),`y)`:
.code gotham-context/eg.go
* Dealing with interfaces
We need to update dynamic calls to `x.M(y)`.
If `M` is called via interface `I`, then `I.M` also needs to change. The eg tool can update call sites with receiver type `I`.
When we change `I`, we need to update all of its implementations.
Find types assignable to `I` using [[http://godoc.org/golang.org/x/tools/go/types][golang.org/x/tools/go/types]].
More to do here.
* What about the standard library?
The Go 1.0 compatibility guarantee means we will not break existing code.
Interfaces like `io.Reader` and `io.Writer` are widely used.
For Google files, we used a currying approach:
	f, err := file.Open(ctx, "/gfs/cell/path")
	...
	fio := f.IO(ctx) // returns an io.ReadWriteCloser that passes ctx
	data, err := ioutil.ReadAll(fio)
For versioned public packages, add `Context` parameters in a new API version and provide `eg` templates to insert `context.TODO()`.
More to do here.
* Conclusion
Cancellation needs a uniform API across package boundaries.
Retrofitting code is hard, but Go is tool-friendly.
New code should use `Context`.
Links:
- [[http://golang.org/x/net/context][golang.org/x/net/context]] - package
- [[http://blog.golang.org/context][blog.golang.org/context]] - blog post
- [[http://golang.org/x/tools/cmd/eg][golang.org/x/tools/cmd/eg]] - eg tool