# Proposal: Secure the Public Go Module Ecosystem with the Go Notary
Russ Cox\
Filippo Valsorda
Last updated: February 26, 2019.
[golang.org/design/25530-notary](https://golang.org/design/25530-notary)
Discussion at [golang.org/issue/25530](https://golang.org/issue/25530).
## Abstract
We propose to secure the public Go module ecosystem
by introducing a new server, the Go notary,
which serves what is in effect a `go.sum` file
listing all publicly-available Go modules.
The `go` command will use this service to fill in gaps
in its own local `go.sum` files,
such as during `go get -u`.
This ensures that unexpected code changes cannot
be introduced when first adding a dependency to a module
or when upgrading a dependency.
## Background
When you run `go` `get` `rsc.io/quote@v1.5.2`, `go` `get` first fetches
`https://rsc.io/quote?go-get=1` and looks for `<meta>` tags. It finds

    <meta name="go-import"
        content="rsc.io/quote git https://github.com/rsc/quote">

which tells it that the code is in a Git repository on `github.com`.
Next it runs `git clone https://github.com/rsc/quote` to fetch
the Git repository and then extracts the file tree from the `v1.5.2` tag,
producing the actual module archive.
Historically, `go` `get` has always simply assumed that it was downloading
the right code.
An attacker able to intercept the connection to `rsc.io` or `github.com`
(or an attacker able to break into one of those systems, or a malicious module author)
would be able to cause `go` `get` to download different code tomorrow,
and `go` `get` would not notice.
There are
[many challenges in using software dependencies safely](https://research.swtch.com/deps),
and much more vetting should typically be done before taking on a
new dependency, but no amount of vetting is worth anything
if the code you download and vet today
differs from the code you or a collaborator downloads
tomorrow for the same module version.
We must be able to authenticate whether a particular
download is correct.
For our purposes, correct for a particular module version download
is defined as the same code everyone else downloads.
This definition ensures reproducibility of builds
and makes vetting of specific module versions meaningful,
without needing to attribute specific archives to
specific authors,
and without introducing new potential points of compromise
like per-author keys.
(Also, even the author of a module should not be able to change
the bits associated with a specific version from one day to the next.)
Being able to authenticate a particular module version download
effectively moves code hosting servers like `rsc.io` and `github.com`
out of the trusted computing base of the Go module ecosystem.
With module authentication, those servers could cause availability problems
by not serving a module version anymore,
but they cannot substitute different code.
The introduction of Go module proxies (see `go help goproxy`)
introduces yet another way for an attacker to intercept module downloads;
module authentication eliminates the need to trust those proxies as well,
moving them outside the
[trusted computing base](https://www.microsoft.com/en-us/research/publication/authentication-in-distributed-systems-theory-and-practice/).
See the Go blog post “[Go Modules in 2019](https://blog.golang.org/modules2019)”
for additional background.
### Module Authentication with `go.sum`
Go 1.11's preview of Go modules introduced the `go.sum` file,
which is maintained automatically by the `go` command
in the root of a module tree
and contains cryptographic checksums for the content of each
dependency of that module.
If a module's source file tree is obtained unmodified,
then the `go.sum` file allows authenticating all dependencies
needed for a build of that module.
It ensures that tomorrow's builds will use the same exact
code for dependencies that today's builds did.
Tomorrow's downloads are authenticated by `go.sum`.
On the other hand, today's downloads (the ones that add or update
dependencies in the first place) are not authenticated.
When a dependency is first added to a module,
or when a dependency is upgraded to a newer version,
there is no entry for it in `go.sum`,
and the `go` command today blindly trusts that it
downloads the correct code.
Then it records the hash of that code into `go.sum`
to ensure that code doesn't change tomorrow.
But that doesn't help the initial download.
The model is similar to SSH's
“[trust on first use](https://en.wikipedia.org/wiki/Trust_on_first_use),”
and while that approach is an improvement over “trust every time,”
it's still not ideal,
especially since developers typically download new module versions
far more often than they connect to new, unknown SSH servers.
We are concerned primarily with authenticating downloads
of publicly-available module versions.
We assume that the private servers hosting
private module source code are already within the
trusted computing base of the developers using that code.
In contrast, a developer who wants to use `rsc.io/quote`
should not be required to trust that `rsc.io` is properly secured.
This trust becomes particularly problematic when summed
over all dependencies.
What we need is an easily-accessed `go.sum` file listing every
publicly-available module version.
But we don't want to blindly trust a downloaded `go.sum` file,
since that would become the next attractive target for an attacker.
### Transparent Logs
The [Certificate Transparency](https://www.certificate-transparency.org/) project
is based on a data structure called a _transparent log_.
The transparent log is hosted on a server and made accessible to clients for random access,
but clients are still able to verify that a particular log record really is in the log
and also that the server never removes any log record from the log.
Separately, third-party auditors can iterate over the log
checking that the entries themselves are accurate.
These two properties combined mean that
a client can use records from the log,
confident that those records will remain available in the log
for auditors to double-check and report invalid or suspicious entries.
Clients and auditors can also compare observations to ensure
that the server is showing the same data to everyone involved.
That is, the log server is not trusted to store the log properly,
nor is it trusted to put the right records into the log.
Instead, clients and auditors interact skeptically with the server,
able to verify for themselves in each interaction
that the server really is behaving correctly.
For details about the data structure, see Russ Cox's blog post,
“[Transparent Logs for Skeptical Clients](https://research.swtch.com/tlog).”
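
To make "verify that a particular log record really is in the log" concrete, here is a minimal sketch of checking a Merkle inclusion proof with RFC 6962 style hashing (SHA-256, a `0x00` prefix for records and `0x01` for interior nodes). The function names are hypothetical, and the notary client may organize this work differently (for example, around tiles); the sketch only shows that a client holding a trusted tree hash can check a record using a handful of hashes.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

type hash [sha256.Size]byte

// leafHash hashes a log record: SHA-256(0x00 || data).
func leafHash(data []byte) hash {
	return sha256.Sum256(append([]byte{0x00}, data...))
}

// nodeHash hashes an interior node: SHA-256(0x01 || left || right).
func nodeHash(l, r hash) hash {
	buf := make([]byte, 0, 1+2*sha256.Size)
	buf = append(buf, 0x01)
	buf = append(buf, l[:]...)
	buf = append(buf, r[:]...)
	return sha256.Sum256(buf)
}

// verifyInclusion reports whether proof shows that the record with hash
// leaf sits at position index in a tree of the given size and root hash.
func verifyInclusion(index, size uint64, leaf hash, proof []hash, root hash) bool {
	if index >= size {
		return false
	}
	fn, sn := index, size-1
	r := leaf
	for _, p := range proof {
		if sn == 0 {
			return false // proof is longer than the path to the root
		}
		if fn&1 == 1 || fn == sn {
			r = nodeHash(p, r) // sibling is on the left
			for fn&1 == 0 && fn != 0 {
				fn >>= 1 // skip levels where this node has no sibling
				sn >>= 1
			}
		} else {
			r = nodeHash(r, p) // sibling is on the right
		}
		fn >>= 1
		sn >>= 1
	}
	return sn == 0 && r == root
}

func main() {
	// A three-record log built by hand: root = H(H(a, b), c).
	a, b, c := leafHash([]byte("a")), leafHash([]byte("b")), leafHash([]byte("c"))
	root := nodeHash(nodeHash(a, b), c)
	fmt.Println(verifyInclusion(1, 3, b, []hash{a, c}, root))
	fmt.Println(verifyInclusion(1, 3, c, []hash{a, c}, root))
}
```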
The use of a transparent log for module hashes aligns with
a broader trend of using transparent logs to enable detection
of misbehavior by partially trusted systems,
what the Trillian team calls
“[General Transparency](https://github.com/google/trillian/#trillian-general-transparency).”
## Proposal
We propose to publish the `go.sum` lines for all publicly-available Go modules
in a transparent log,
served by a new server called the Go notary.
When a publicly-available module is not yet listed in
the main module's `go.sum` file,
the `go` command will fetch the relevant `go.sum` lines
from the notary instead of trusting the initial download
to be correct.
### Notary Server
The Go notary will run at `https://notary.golang.org/` and serve the following endpoints:
- `/latest` will serve a signed tree size and hash for the latest log.
- `/lookup/M@V` will serve the log record number for the entry about module M version V,
followed by the data for the record.
If the module version is not yet recorded in the log, the notary will try to fetch it before replying.
Note that the data should never be used without first
authenticating it against a signed tree hash.
- `/record/R` will serve the data for record number R.
- `/tile/H/L/K[.p/W]` will serve a [log tile](https://research.swtch.com/tlog#serving_tiles).
The optional `.p/W` suffix indicates a partial log tile with only `W` hashes.
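
As an illustration of the endpoint layout listed above, the following sketch builds request URLs for each endpoint. The helper names are hypothetical, the parameters simply mirror the `M`, `V`, `R`, `H`, `L`, `K`, and `W` placeholders, and treating `w == 0` as "full tile" is a convention of this sketch. It performs no path escaping, which a real client would likely need.

```go
package main

import "fmt"

// notaryBase is the server address given in the proposal.
const notaryBase = "https://notary.golang.org"

// latestURL returns the URL of the signed tree size and hash.
func latestURL(base string) string {
	return base + "/latest"
}

// lookupURL returns the URL for the record about module m at version v.
func lookupURL(base, m, v string) string {
	return fmt.Sprintf("%s/lookup/%s@%s", base, m, v)
}

// recordURL returns the URL for the data of record number r.
func recordURL(base string, r int64) string {
	return fmt.Sprintf("%s/record/%d", base, r)
}

// tileURL returns the URL for a log tile. In this sketch, w > 0 requests
// a partial tile with only w hashes and w == 0 requests a full tile.
func tileURL(base string, h, l int, k int64, w int) string {
	u := fmt.Sprintf("%s/tile/%d/%d/%d", base, h, l, k)
	if w > 0 {
		u += fmt.Sprintf(".p/%d", w)
	}
	return u
}

func main() {
	fmt.Println(latestURL(notaryBase))
	fmt.Println(lookupURL(notaryBase, "rsc.io/quote", "v1.5.2"))
	fmt.Println(recordURL(notaryBase, 42))
	fmt.Println(tileURL(notaryBase, 8, 0, 3, 0))
	fmt.Println(tileURL(notaryBase, 8, 0, 3, 5))
}
```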
### Proxying a Notary
A module proxy can also proxy requests to the notary.
The general proxy URL form is `<proxyURL>/notary/<notaryURL>`.
If `GOPROXY=https://proxy.site` then the latest signed tree would be fetched using
`https://proxy.site/notary/notary.golang.org/latest`.
Including the full notary URL allows a transition to a new notary log,
such as `notary.golang.org/v2`.
Before accessing any notary URL using a proxy,
the proxy client should first fetch `<proxyURL>/notary/supported`.
If that request returns a successful (HTTP 200) response,
then the proxy supports proxying notary requests.
In that case, the client should use the proxied notary only,
never falling back to a direct connection to the notary.
If the `/notary/supported` check fails with a not found (HTTP 404) response,
the proxy is unwilling to proxy the notary,
and the client should connect directly to the notary.
Any other response is treated as the notary being unavailable.
A corporate proxy may want to ensure that clients
never make any direct notary connections
(for example, for privacy; see the Rationale section below).
The optional `/notary/supported` endpoint, along with
proxying actual notary requests, lets such a proxy
ensure that a `go` command using the proxy
never makes a direct connection to notary.golang.org.
But simpler proxies may wish to focus on serving
only modules and not notary data; in particular,
module-only proxies can be served from entirely static file systems,
with no special infrastructure at all.
Such proxies can respond with an HTTP 404 to
the `/notary/supported` endpoint, so that clients
will connect to the notary directly.
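
The `/notary/supported` rules in this section reduce to a small decision on the HTTP status code. The sketch below shows that decision together with the `<proxyURL>/notary/<notaryURL>` URL form; the type and function names are hypothetical and the actual HTTP request is left out.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// notaryRoute says how a client should reach the notary (hypothetical type).
type notaryRoute int

const (
	viaProxy    notaryRoute = iota // use the proxied notary only, no direct fallback
	direct                         // connect directly to the notary
	unavailable                    // treat the notary as unavailable
)

// routeFor maps the status of the <proxyURL>/notary/supported response
// to a route, following the three cases described in the text.
func routeFor(status int) notaryRoute {
	switch status {
	case http.StatusOK:
		return viaProxy
	case http.StatusNotFound:
		return direct
	default:
		return unavailable
	}
}

// proxiedURL builds <proxyURL>/notary/<notaryURL><endpoint>.
func proxiedURL(proxyURL, notaryURL, endpoint string) string {
	return strings.TrimSuffix(proxyURL, "/") + "/notary/" + notaryURL + endpoint
}

func main() {
	fmt.Println(routeFor(http.StatusOK) == viaProxy)
	fmt.Println(routeFor(http.StatusNotFound) == direct)
	fmt.Println(proxiedURL("https://proxy.site", "notary.golang.org", "/latest"))
}
```

Treating every status other than 200 and 404 as "unavailable" keeps a flaky or blocked proxy from silently downgrading a client to direct connections.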
### `go` command client
The `go` command is the primary consumer of the notary's published log.
The `go` command will [verify the log](https://research.swtch.com/tlog#verifying_a_log)
as it uses it,
ensuring that every record it reads is actually in the log
and that no observed log ever drops a record from an earlier observed log.
The `go` command will store the notary's public key in
`$GOROOT/lib/notary/notary.cfg`.
That file will also contain the default starting signed tree size and tree hash,
updated with each major release.
The `go` command will then cache the latest signed tree size and tree hash
in `$GOPATH/pkg/notary/notary.golang.org/latest`.
It will cache tiles in `$GOPATH/pkg/mod/download/cache/notary/notary.golang.org/tile/H/L/K[.W]`.
These two different locations let `go clean -modcache` delete any cached tiles as well,
but no `go` command (only a manual `rm -rf $GOPATH/pkg`)
will wipe out the memory of the latest observed tree size and hash.
If the `go` command ever does observe a pair of inconsistent signed tree sizes and hashes,
it will complain loudly on standard error and fail the build.
The `go` command must be configured to know which modules are
publicly available and therefore can be verified by the notary,
versus those that are closed source and must not be verified,
especially since that would transmit potentially private import paths
over the network to the notary `/lookup` endpoint.
A few new environment variables control this configuration.
(See the [`go env -w` proposal](https://golang.org/design/30411-env)
for a way to manage these variables more easily.)
- `GOPROXY=https://proxy.site/path` sets the Go module proxy to use, as before.
- `GONOPROXY=prefix1,prefix2,prefix3` sets a list of module path prefixes,
possibly containing globs, that should not be proxied.
For example, `GONOPROXY=*.corp.google.com,rsc.io/private`
will bypass the proxy for the modules `foo.corp.google.com`, `foo.corp.google.com/bar`, `rsc.io/private`, and `rsc.io/private/bar`,
though not `rsc.io/privateer` (the patterns are path prefixes, not string prefixes).
- `GONOVERIFY=prefix1,prefix2,prefix3` sets a list of module path prefixes,
again possibly containing globs, that should not be verified using the notary.
We expect that corporate environments may fetch all modules, public and private,
through an internal proxy;
`GONOVERIFY` allows them to disable notary-based verification of
internal modules while still verifying public modules.
Therefore, `GONOVERIFY` must not imply `GONOPROXY`.
We also expect that other users may prefer to connect directly to source origins
but still want verification of open source modules or proxying of the notary itself;
`GONOPROXY` allows them to arrange that and therefore must not imply `GONOVERIFY`.
The notary not being able to report `go.sum` lines for a module version
is a hard failure:
any private modules must be explicitly listed in `$GONOVERIFY`.
(Otherwise an attacker could block traffic to the notary
and make all module versions appear to verify.)
The notary can be disabled entirely with `GONOVERIFY=*`.
The command `go get -insecure` will report but not stop after notary failures.
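
The path-prefix matching used by `GONOPROXY` and `GONOVERIFY` can be sketched as follows. This is one possible reading of "path prefixes, possibly containing globs", not necessarily how the `go` command implements it: the function name is hypothetical, and the use of `path.Match` glob syntax is an assumption.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// matchPrefixPatterns reports whether any pattern in the comma-separated
// list globs matches a leading path prefix of target. A pattern with n
// path elements is compared against the first n elements of target, so
// "rsc.io/private" matches "rsc.io/private/bar" but not "rsc.io/privateer".
func matchPrefixPatterns(globs, target string) bool {
	for _, glob := range strings.Split(globs, ",") {
		if glob == "" {
			continue
		}
		n := strings.Count(glob, "/") + 1
		elems := strings.SplitN(target, "/", n+1)
		if len(elems) < n {
			continue // target is shorter than the pattern
		}
		prefix := strings.Join(elems[:n], "/")
		if ok, _ := path.Match(glob, prefix); ok {
			return true
		}
	}
	return false
}

func main() {
	const noProxy = "*.corp.google.com,rsc.io/private"
	for _, m := range []string{
		"foo.corp.google.com",
		"foo.corp.google.com/bar",
		"rsc.io/private",
		"rsc.io/private/bar",
		"rsc.io/privateer",
	} {
		fmt.Println(m, matchPrefixPatterns(noProxy, m))
	}
}
```

Under this reading, the pattern `*` has one path element and matches the first element of every module path, which is consistent with `GONOVERIFY=*` disabling the notary entirely.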
## Rationale
The motivation for authenticating module downloads is
covered in the background section above.
Note that we want to authenticate modules
obtained both from direct connections to code-hosting servers
and from module proxies.
Two topics are worth further discussion:
first, having a single notary service for the entire Go ecosystem,
and second, the privacy implications of a notary.
### One Notary
The Go team at Google will run the Go notary as a service to the Go ecosystem,
similar to running `godoc.org` and `golang.org`.
There is no plan to allow use of alternate notaries,
which would add complexity and potentially reduce the overall
security of the system,
allowing different users to be attacked by compromising different notaries.
We originally considered having multiple notaries
signing individual `go.sum` entries and
requiring the `go` command to collect signatures
from a quorum of notaries before accepting an entry.
That design depended on the uptime of multiple services
and could still be compromised undetectably by
compromising enough notaries.
That is, that design would blindly trust a quorum of notaries.
The design presented here uses the transparent log
to eliminate blind trust in a quorum of notaries
and instead uses a “trust but verify” model with
a single notary.
In this design, the notary's published `go.sum` lines
are accepted by the `go` command client,
but the published lines are also verifiably preserved
for auditing by any interested third party.
In fact, we hope that proxies run by various
organizations in the Go community will serve as auditors
and double-check Go notary log entries
as part of their ordinary operation.
Another useful
service that could be enabled by
the notary is a notification service to alert
authors about new versions of their own modules.
### Privacy
Contacting the Go notary to authenticate a new dependency
requires sending the module path and version to the notary.
There are two potential privacy concerns.
First, a misconfigured `go` command might send
the names of private module paths
(for example, `rsc.io/private/secret-plan`)
to the notary.
The notary would try to fetch the module and fail,
but the path would have been exposed in the network traffic.
Second, even using only public modules,
there might be a concern that contacting the notary
at all would expose information about how popular particular modules
are in a particular organization (or at least in a particular client IP block).
The design addresses these two privacy concerns
with a lightweight, partial solution for each
and a heavier, complete solution that covers both.
The lightweight, partial solution for a misconfigured `go` command
that asks the notary about a non-public module
is to make it fail as loudly as possible.
If the `go` command is configured to ask the notary
about a particular module, and the notary cannot return
information about that module, the download fails
and the `go` command stops.
This ensures both that all public modules are in fact
authenticated and also that any misconfiguration
must be corrected (by setting `$GONOVERIFY` to avoid
the notary for those private modules)
in order to achieve a successful build.
This way, the frequency of misconfiguration should be minimized.
The lightweight, partial solution for exposing information about
module usage is to only contact the notary when there is not
already an entry in `go.sum`. If a module version is already listed
in `go.sum`, it is assumed to be correct, with no notary interaction.
This allows authentication of previously-downloaded private
modules and also ensures that only the first use of a new module
version is exposed to the notary.
These lightweight solutions are meant to make the notary
usable out of the box for most Go developers.
If there are additional lightweight solutions that can be adopted
to further reduce privacy concerns,
we would be happy to consider them.
The heavier, complete solution for notary privacy concerns
is for developers to put their usage behind a proxy,
such as a local Athens instance or JFrog's GoCenter,
assuming those proxies add support for proxying and
caching the Go notary service endpoints.
(Those endpoints are designed to be highly cacheable
for exactly this reason, and a proxy with a full copy
of the notary log doesn't have to leak any information
about what modules are in use, at the cost of maintaining
its own index to answer lookup requests.)
We anticipate that there will be many proxies available
for use in the Go ecosystem.
Part of the motivation for the Go notary is to allow
the use of any available proxy to download modules,
without any reduction in security.
Developers can then use any proxy they are comfortable using,
or run their own.
## Compatibility
The introduction of the notary does not have any compatibility
concerns at the command or language level.
However, proxies that serve modified copies of public modules
will be incompatible with the notary and stop being usable.
This is by design: such proxies are indistinguishable from man-in-the-middle attacks.
## Implementation
The Go team at Google is working on a production implementation
of both a Go module proxy and the Go notary,
as we described in the blog post “[Go Modules in 2019](https://blog.golang.org/modules2019).”
We will publish a notary client as part of the `go` command,
as well as an example notary implementation.
We intend to ship support for the notary, enabled by default, in Go 1.13.
Russ Cox will lead the `go` command integration
and has posted a [stack of changes in golang.org/x/exp/notary](https://go-review.googlesource.com/q/f:notary).