Proposal: go install should install executables in module mode outside a module

Authors: Jay Conrod, Daniel Martí

Last Updated: 2020-09-29

Discussion at https://golang.org/issue/40276.

Abstract

Authors of executables need a simple, reliable, consistent way for users to build and install executables in module mode without updating module requirements in the current module's go.mod file.

Background

go get is used to download and install executables, but it's also responsible for managing dependencies in go.mod files. This causes confusion and unintended side effects: for example, the command go get golang.org/x/tools/gopls builds and installs gopls. If there's a go.mod file in the current directory or any parent, this command also adds a requirement on the module golang.org/x/tools/gopls, which is usually not intended. When GO111MODULE is not set, go get will also run in GOPATH mode when invoked outside a module.

These problems lead authors to write complex installation commands such as:

(cd $(mktemp -d); GO111MODULE=on go get golang.org/x/tools/gopls)

Proposal

We propose augmenting the go install command to build and install packages at specific versions, regardless of the current module context.

go install golang.org/x/tools/gopls@v0.4.4

To eliminate redundancy and confusion, we also propose deprecating and removing go get functionality for building and installing packages.

Details

The new go install behavior will be enabled when an argument has a version suffix like @latest or @v1.5.2. Currently, go install does not allow version suffixes. When a version suffix is used:

  • go install runs in module mode, regardless of whether a go.mod file is present. If GO111MODULE=off, go install reports an error, similar to what go mod download and other module commands do.
  • go install acts as if no go.mod file is present in the current directory or parent directory.
  • No module will be considered the “main” module.
  • Errors are reported in some cases to ensure that consistent versions of dependencies are used by users and module authors. See Rationale below.
    • Command line arguments must not be meta-patterns (all, std, cmd) or local directories (./foo, /tmp/bar).
    • Command line arguments must refer to main packages (executables). If an argument has a wildcard (...), it will only match main packages.
    • Command line arguments must refer to packages in one module at a specific version. All version suffixes must be identical. The versions of the installed packages' dependencies are determined by that module's go.mod file (if it has one).
    • If that module has a go.mod file, it must not contain directives that would cause it to be interpreted differently if the module were the main module. In particular, it must not contain replace or exclude directives.
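The purely syntactic restrictions above (version suffix present, identical suffixes, no meta-patterns, no local directories) can be sketched in Go. This is a minimal illustration, not the go command's implementation: validateInstallArgs is a hypothetical name, and the checks that need package loading (main packages only, a single providing module, no replace or exclude directives) are left out.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// validateInstallArgs is a hypothetical helper, not part of the go command.
// It checks arguments of the form path@version against the syntactic rules
// listed above and returns the version suffix that all arguments share.
func validateInstallArgs(args []string) (string, error) {
	if len(args) == 0 {
		return "", errors.New("no packages named on the command line")
	}
	version := ""
	for _, arg := range args {
		i := strings.LastIndex(arg, "@")
		if i <= 0 || i == len(arg)-1 {
			return "", fmt.Errorf("%s: expected path@version", arg)
		}
		path, v := arg[:i], arg[i+1:]
		switch {
		case path == "all" || path == "std" || path == "cmd":
			return "", fmt.Errorf("%s: meta-patterns are not allowed", arg)
		case path == "." || path == ".." ||
			strings.HasPrefix(path, "./") ||
			strings.HasPrefix(path, "../") ||
			strings.HasPrefix(path, "/"):
			return "", fmt.Errorf("%s: local directories are not allowed", arg)
		}
		// All version suffixes must be identical.
		if version != "" && v != version {
			return "", fmt.Errorf("%s: version suffix differs from @%s", arg, version)
		}
		version = v
	}
	return version, nil
}

func main() {
	for _, args := range [][]string{
		{"example.com/cmd/tool@v1.4.2"},
		{"example.com/cmd/...@latest"},
		{"std@latest"},
		{"./foo@latest"},
		{"example.com/cmd/a@v1.0.0", "example.com/cmd/b@v1.1.0"},
	} {
		version, err := validateInstallArgs(args)
		fmt.Println(args, "->", version, err)
	}
}
```

The sketch only recognizes Unix-style local paths; that is an assumption made to keep the example short.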

If go install has arguments without version suffixes, its behavior will not change. It will operate in the context of the main module. If run in module mode outside of a module, go install will report an error.

With these restrictions, users can install executables using consistent commands. Authors can provide simple installation instructions without worrying about the user's working directory.

With this change, go install would overlap with go get even more, so we also propose deprecating and removing the ability for go get to install packages.

  • In Go 1.16, when go get is invoked outside a module or when go get is invoked without the -d flag with arguments matching one or more main packages, go get would print a deprecation warning recommending an equivalent go install command.
  • In a later release (likely Go 1.17), go get would no longer build or install packages. The -d flag would be enabled by default. Setting -d=false would be an error. If go get is invoked outside a module, it would print an error recommending an equivalent go install command.

Examples

# Install a single executable at the latest version
$ go install example.com/cmd/tool@latest

# Install multiple executables at the latest version
$ go install example.com/cmd/...@latest

# Install at a specific version
$ go install example.com/cmd/tool@v1.4.2

Current go install and go get functionality

go install is used for building and installing packages within the context of the main module. go install reports an error when invoked outside of a module or when given arguments with version queries like @latest.

go get is used both for updating module dependencies in go.mod and for building and installing executables. go get also works differently depending on whether it's invoked inside or outside of a module.

These overlapping responsibilities lead to confusion. Ideally, we would have one command (go install) for installing executables and one command (go get) for changing dependencies.

Currently, when go get is invoked outside a module in module mode (with GO111MODULE=on), its primary purpose is to build and install executables. In this configuration, there is no main module, even if only one module provides packages named on the command line. The build list (the set of module versions used in the build) is calculated from requirements in go.mod files of modules providing packages named on the command line. replace or exclude directives from all modules are ignored. Vendor directories are also ignored.

When go get is invoked inside a module, its primary purpose is to update requirements in go.mod. The -d flag is often used, which instructs go get not to build or install packages. Explicit go build or go install commands are often better for installing tools when dependency versions are specified in go.mod and no update is desired. Like other build commands, go get loads the build list from the main module's go.mod file, applying any replace or exclude directives it finds there. replace and exclude directives in other modules' go.mod files are never applied. Vendor directories in the main module and in other modules are ignored; the -mod=vendor flag is not allowed.

The motivation for the current go get behavior was to make usage in module mode similar to usage in GOPATH mode. In GOPATH mode, go get would download repositories for any missing packages into $GOPATH/src, then build and install those packages into $GOPATH/bin or $GOPATH/pkg. go get -u would update repositories to their latest versions. go get -d would download repositories without building packages. In module mode, go get works with requirements in go.mod instead of repositories in $GOPATH/src.

Rationale

Why can't go get clone a git repository and build from there?

In module mode, the go command typically fetches dependencies from a proxy. Modules are distributed as zip files that contain sources for specific module versions. Even when go connects directly to a repository instead of a proxy, it still generates zip files so that builds work consistently no matter how modules are fetched. Those zip files don't contain nested modules or vendor directories.

If go get cloned repositories, it would work very differently from other build commands. That causes several problems:

  • It adds complication (and bugs!) to the go command to support a new build mode.
  • It creates work for authors, who would need to ensure their programs can be built with both go get and go install.
  • It reduces speed and reliability for users. Modules may be available on a proxy when the original repository is unavailable. Fetching modules from a proxy is roughly 5-7x faster than cloning git repositories.

Why can't vendor directories be used?

Vendor directories are not included in module zip files. Since they're not present when a module is downloaded, there's no way to build with them.

We don't plan to include vendor directories in zip files in the future either. Changing the set of files included in module zip files would break go.sum hashes.

Why can't directory replace directives be used?

For example:

replace example.com/sibling => ../sibling

replace directives with a directory path on the right side can't be used because the directory must be outside the module. These directories can't be present when the module is downloaded, so there's no way to build with them.

Why can't module replace directives be used?

For example:

replace example.com/mod v1.0.0 => example.com/fork v1.0.1-bugfix

It is technically possible to apply these directives. If we did this, we would still want some restrictions. First, an error would be reported if more than one module provided packages named on the command line: we must be able to identify a main module. Second, an error would be reported if any directory replace directives were present: we don't want to introduce a new configuration where some replace directives are applied but others are silently ignored.

However, there are two reasons to avoid applying replace directives at all.

First, applying replace directives would create inconsistency for users inside and outside a module. When a package is built within a module with go build or go install, only replace directives from the main module are applied, not the module providing the package. When a package is built outside a module with go get, no replace directives are applied. If go install applied replace directives from the module providing the package, it would not be consistent with the current behavior of any other build command. To eliminate confusion about whether replace directives are applied, we propose that go install reports errors when encountering them.

Second, if go install applied replace directives, it would take power away from developers that depend on modules that provide tools. For example, suppose the author of a popular code generation tool gogen forks a dependency genutil to add a feature. They add a replace directive pointing to their fork of genutil while waiting for a PR to merge. A user of gogen wants to track the version they use in their go.mod file to ensure everyone on their team uses a consistent version. Unfortunately, they can no longer build gogen from within their own module, because gogen's replace directive is ignored there. The author of gogen might instruct their users to build with go install and a version suffix, but then users can't track the dependency in their go.mod file, and they can't apply their own require and replace directives to upgrade or fix other transitive dependencies. The author of gogen could also instruct their users to copy the replace directive, but this may conflict with other require and replace directives, and it may cause similar problems for users further downstream.

Why report errors instead of ignoring replace?

If go install ignored replace directives, it would be consistent with the current behavior of go get when invoked outside a module. However, in #30515 and related discussions, we found that many developers are surprised by that behavior.

It seems better to be explicit that replace directives are only applied locally within a module during development and not when users build packages from outside the module. We'd like to encourage module authors to release versions of their modules that don't rely on replace directives so that users in other modules may depend on them easily.

If this behavior turns out not to be suitable (for example, authors prefer to keep replace directives in go.mod at release versions and understand that they won't affect users), then we could start ignoring replace directives in the future, matching current go get behavior.
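To make the distinction between directory and module replace directives concrete, here is a minimal Go sketch that scans go.mod text and classifies each replace directive by its right-hand side. It is an illustration under stated assumptions: parseReplaces and checkNoReplace are hypothetical names, the line-based parser handles only the common single-line and replace ( ... ) block forms, and a real tool would more likely use a proper go.mod parser. Exclude directives, which the proposal also rejects, are not handled here.

```go
package main

import (
	"fmt"
	"strings"
)

// replacement describes one replace directive found in a go.mod file.
type replacement struct {
	Old       string // module path being replaced
	New       string // replacement module path or directory
	Directory bool   // true if New is a local file path
}

// parseReplaces is a hypothetical, simplified scanner for replace directives.
func parseReplaces(gomod string) []replacement {
	var out []replacement
	inBlock := false
	for _, line := range strings.Split(gomod, "\n") {
		if i := strings.Index(line, "//"); i >= 0 {
			line = line[:i] // drop trailing comments
		}
		fields := strings.Fields(line)
		if len(fields) == 0 {
			continue
		}
		switch {
		case inBlock && fields[0] == ")":
			inBlock = false
			continue
		case !inBlock && len(fields) == 2 && fields[0] == "replace" && fields[1] == "(":
			inBlock = true
			continue
		case !inBlock && fields[0] == "replace":
			fields = fields[1:]
		case !inBlock:
			continue // some other directive
		}
		// fields now holds: old [version] => new [version]
		arrow := -1
		for i, f := range fields {
			if f == "=>" {
				arrow = i
				break
			}
		}
		if arrow < 1 || arrow == len(fields)-1 {
			continue // malformed; ignored in this sketch
		}
		target := fields[arrow+1]
		isDir := strings.HasPrefix(target, "./") ||
			strings.HasPrefix(target, "../") ||
			strings.HasPrefix(target, "/")
		out = append(out, replacement{Old: fields[0], New: target, Directory: isDir})
	}
	return out
}

// checkNoReplace mirrors the proposed rule: any replace directive is an error.
func checkNoReplace(gomod string) error {
	if reps := parseReplaces(gomod); len(reps) > 0 {
		return fmt.Errorf("go.mod contains %d replace directive(s), first for %s", len(reps), reps[0].Old)
	}
	return nil
}

// sampleGoMod is made-up input combining the two examples above.
const sampleGoMod = `module example.com/gogen

go 1.15

require example.com/genutil v1.0.0

replace example.com/sibling => ../sibling

replace (
	example.com/mod v1.0.0 => example.com/fork v1.0.1-bugfix // soft fork
)
`

func main() {
	for _, r := range parseReplaces(sampleGoMod) {
		fmt.Printf("%s => %s (directory: %v)\n", r.Old, r.New, r.Directory)
	}
	fmt.Println(checkNoReplace(sampleGoMod))
}
```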

Should go.sum files be checked?

Because there is no main module, go install will not use a go.sum file to authenticate any downloaded module or go.mod file. The go command will still use the checksum database (sum.golang.org) to authenticate downloads, subject to privacy settings. This is consistent with the current behavior of go get: when invoked outside a module, no go.sum file is used.

The new go install command requires that only one module may provide packages named on the command line, so it may be logical to use that module's go.sum file to verify downloads. This avoids a problem in #28802, a related proposal to verify downloads against all go.sum files in dependencies: the build can't be broken by one bad go.sum file in a dependency.

However, using the go.sum from the module named on the command line only provides a marginal security benefit: it lets us authenticate private module dependencies (those not available to the checksum database) when the module on the command line is public. If the module named on the command line is private or if the checksum database isn't used, then we can't authenticate the download of its content (including the go.sum file), and we must trust the proxy. If all dependencies are public, we can authenticate all downloads without go.sum.

Why require a version suffix when outside a module?

If no version suffix were required when go install is invoked outside a module, then the meaning of the command would depend on whether the user's working directory is inside a module. For example:

go install golang.org/x/tools/gopls

When invoked outside of a module, this command would run in GOPATH mode, unless GO111MODULE=on is set. In module mode, it would install the latest version of the executable.

When invoked inside a module, this command would use the main module's go.mod file to determine the versions of the modules needed to build the package.

We currently have a similar problem with go get. Requiring the version suffix makes the meaning of a go install command unambiguous.
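The way the meaning of a go install command depends on context can be summarized as a small decision function. The sketch below encodes the behavior described in this proposal, assuming the GO111MODULE defaults current at the time of writing (unset behaves like auto); resolveInstallMode and the mode names are hypothetical, chosen only for illustration.

```go
package main

import "fmt"

// installMode names how a go install invocation would be interpreted.
type installMode string

const (
	modeGOPATH       installMode = "GOPATH mode"
	modeMainModule   installMode = "module mode, versions from the main module's go.mod"
	modeNoMainModule installMode = "module mode, no main module"
	modeError        installMode = "error"
)

// resolveInstallMode is a hypothetical summary of the rules in this proposal.
// go111module is the value of GO111MODULE ("" when unset), insideModule
// reports whether a go.mod file exists in the working directory or a parent,
// and hasVersionSuffix reports whether the arguments look like path@version.
func resolveInstallMode(go111module string, insideModule, hasVersionSuffix bool) installMode {
	if hasVersionSuffix {
		// Proposed behavior: always module mode, ignoring any go.mod file.
		if go111module == "off" {
			return modeError
		}
		return modeNoMainModule
	}
	switch go111module {
	case "off":
		return modeGOPATH
	case "on":
		if insideModule {
			return modeMainModule
		}
		return modeError // module mode outside a module
	default: // unset or "auto"
		if insideModule {
			return modeMainModule
		}
		return modeGOPATH
	}
}

func main() {
	fmt.Println(resolveInstallMode("", false, false))  // no suffix, outside a module
	fmt.Println(resolveInstallMode("", true, false))   // no suffix, inside a module
	fmt.Println(resolveInstallMode("on", false, true)) // with suffix, anywhere
}
```

With a version suffix, the result no longer depends on insideModule, which is the property this section argues for.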

Why not a -g flag instead of @latest?

To install the latest version of an executable, the two commands below would be equivalent:

go install -g golang.org/x/tools/gopls
go install golang.org/x/tools/gopls@latest

The -g flag has the advantage of being shorter for a common use case. However, it would only be useful when installing the latest version of a package, since -g would be implied by any version suffix.

The @latest suffix is clearer, and it implies that the command is time-dependent and not reproducible. We prefer it for those reasons.

Compatibility

The go install part of this proposal only applies to commands with version suffixes on each argument. go install currently reports an error for these, and this proposal does not recommend changing other functionality of go install, so that part of the proposal is backward compatible.

The go get part of this proposal recommends deprecating and removing functionality, so it's certainly not backward compatible. go get -d commands will continue to work without modification though, and eventually, the -d flag can be dropped.

Parts of this proposal are more strict than is technically necessary (for example, requiring one module, forbidding replace directives). We could relax these restrictions without breaking compatibility in the future if it seems expedient. It would be much harder to add restrictions later.

Implementation

An initial implementation of this feature was merged in CL 254365. Please try it out!

Future directions

The behavior with respect to replace directives was discussed extensively before this proposal was written. There are three potential behaviors:

  1. Ignore replace directives in all modules. This would be consistent with other module-aware commands, which only apply replace directives from the main module (defined in the current directory or a parent directory). go install pkg@version ignores the current directory and any go.mod file that might be present, so there is no main module.
  2. Ensure only one module provides packages named on the command line, and treat that module as the main module, applying its module replace directives. Report errors for directory replace directives. This is feasible, but it may have wider ecosystem effects; see “Why can't module replace directives be used?” above.
  3. Ensure only one module provides packages named on the command line, and report errors for any replace directives it contains. This is the behavior currently proposed.

Most people involved in this discussion have advocated for either (1) or (2). The behavior in (3) is a compromise. If we find that the behavior in (1) is strictly better than (2) or vice versa, we can switch to that behavior from (3) without an incompatible change. Additionally, (3) eliminates ambiguity about whether replace directives are applied for users and module authors.

Note that applying directory replace directives is not considered here for the reasons in “Why can't directory replace directives be used?”.

Appendix: FAQ

Why not apply replace directives from all modules?

In short, replace directives from different modules would conflict, and that would make dependency management harder for most users.

For example, consider a case where two dependencies replace the same module with different forks.

// in example.com/mod/a
replace example.com/mod/c => example.com/fork-a/c v1.0.0

// in example.com/mod/b
replace example.com/mod/c => example.com/fork-b/c v1.0.0

Another conflict would occur where two dependencies pin different versions of the same module.

// in example.com/mod/a
replace example.com/mod/c => example.com/mod/c v1.1.0

// in example.com/mod/b
replace example.com/mod/c => example.com/mod/c v1.2.0

To avoid the possibility of conflict, the go command ignores replace directives in modules other than the main module.
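The conflicts above can be detected mechanically. The following Go sketch is a hypothetical illustration, not something the go command does: given the replace directives declared by each module, conflictingReplacements reports every replaced module path that would end up with more than one distinct target if directives from all modules were applied.

```go
package main

import (
	"fmt"
	"sort"
)

// replacement is one module replace directive: Old => New Version.
type replacement struct {
	Old     string // module path being replaced
	New     string // replacement module path
	Version string // replacement version
}

// conflictingReplacements takes replace directives keyed by the module that
// declares them and returns, in sorted order, the replaced paths that map to
// more than one distinct target.
func conflictingReplacements(byModule map[string][]replacement) []string {
	targets := make(map[string]map[string]bool)
	for _, reps := range byModule {
		for _, r := range reps {
			if targets[r.Old] == nil {
				targets[r.Old] = make(map[string]bool)
			}
			targets[r.Old][r.New+"@"+r.Version] = true
		}
	}
	var conflicts []string
	for old, set := range targets {
		if len(set) > 1 {
			conflicts = append(conflicts, old)
		}
	}
	sort.Strings(conflicts)
	return conflicts
}

func main() {
	// Two dependencies replace the same module with different forks.
	forks := map[string][]replacement{
		"example.com/mod/a": {{"example.com/mod/c", "example.com/fork-a/c", "v1.0.0"}},
		"example.com/mod/b": {{"example.com/mod/c", "example.com/fork-b/c", "v1.0.0"}},
	}
	// Two dependencies pin different versions of the same module.
	pins := map[string][]replacement{
		"example.com/mod/a": {{"example.com/mod/c", "example.com/mod/c", "v1.1.0"}},
		"example.com/mod/b": {{"example.com/mod/c", "example.com/mod/c", "v1.2.0"}},
	}
	fmt.Println(conflictingReplacements(forks))
	fmt.Println(conflictingReplacements(pins))
}
```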

Modules are intended to scale to a large ecosystem, and in order for upgrades to be safe, fast, and predictable, some rules must be followed, like semantic versioning and import compatibility. Not relying on replace is one of these rules.

How can module authors avoid replace?

replace is useful in several situations for local or short-term development, for example:

  • Changing multiple modules concurrently.
  • Using a short-term fork of a dependency until a change is merged upstream.
  • Using an old version of a dependency because a new version is broken.
  • Working around migration problems, like golang.org/x/lint imported as github.com/golang/lint. Many of these problems should be fixed by lazy module loading (#36460).

replace is safe to use in a module that is not depended on by other modules. It's also safe to use in revisions that aren't depended on by other modules.

  • If a replace directive is just meant for temporary local development by one person, avoid checking it in. The -modfile flag may be used to build with an alternative go.mod file. See also #26640, a feature request for a go.mod.local file containing replacements and other local modifications.
  • If a replace directive must be checked in to fix a short-term problem, ensure at least one release or pre-release version is tagged before checking it in. Don't tag a new release version with replace checked in (pre-release versions may be okay, depending on how they're used). When the go command looks for a new version of a module (for example, when running go get with no version specified), it will prefer release versions. Tagging versions lets you continue development on the main branch without worrying about users fetching arbitrary commits.
  • If a replace directive must be checked in to solve a long-term problem, consider solutions that won't cause issues for dependent modules. If possible, tag versions on a release branch with replace directives removed.

When would go install be reproducible?

The new go install command will build an executable with the same set of module versions on every invocation if both the following conditions are true:

  • A specific version is requested in the command line argument, for example, go install example.com/cmd/foo@v1.0.0.
  • Every package needed to build the executable is provided by a module required directly or indirectly by the go.mod file of the module providing the executable. If the executable only imports standard library packages or packages from its own module, no go.mod file is necessary.

An executable may not be bit-for-bit reproducible for other reasons. Debugging information will include system paths (unless -trimpath is used). A package may import different packages on different platforms (or may not build at all). The installed Go version and the C toolchain may also affect binary reproducibility.

What happens if a module depends on a newer version of itself?

go install will report an error, as go get already does.

This sometimes happens when two modules depend on each other, and releases are not tagged on the main branch. A command like go get example.com/m@master will resolve @master to a pseudo-version lower than any release version. The go.mod file at that pseudo-version may transitively depend on a newer release version.

go get reports an error in this situation. In general, go get reports an error when command line arguments refer to different versions of the same module, directly or indirectly. go install doesn't support this yet, but this should be one of the conditions checked when running with version suffix arguments.

Appendix: usage of replace directives

In this proposal, go install would report errors for replace directives in the module providing packages named on the command line. go get ignores these, but the behavior may still surprise module authors and users. I've tried to estimate the impact on the existing set of open source modules.

  • I started with a list of 359,040 main packages that Russ Cox built during an earlier study.
  • I excluded packages with paths that indicate they were homework, examples, tests, or experiments. 187,805 packages remained.
  • Of these, I took a random sample of 19,000 packages (about 10%).
  • These belonged to 13,874 modules. For each module, I downloaded the “latest” version go get would fetch.
  • I discarded repositories that were forks or couldn't be retrieved. 10,618 modules were left.
  • I discarded modules that didn't have a go.mod file. 4,519 were left.
  • Of these:
    • 3982 (88%) don't use replace at all.
    • 71 (2%) use directory replace only.
    • 439 (9%) use module replace only.
    • 27 (1%) use both.
    • In the set of 439 go.mod files using module replace only, I tried to classify why replace was used. A module may have multiple replace directives and multiple classifications, so the percentages below don't add to 100%.
      • 165 used replace as a soft fork, for example, to point to a bug fix PR instead of the original module.
      • 242 used replace to pin a specific version of a dependency (the module path is the same on both sides).
      • 77 used replace to rename a dependency that was imported with another name, for example, replacing github.com/golang/lint with the correct path, golang.org/x/lint.
      • 30 used replace to rename golang.org/x repos with their github.com/golang mirrors.
      • 11 used replace to bypass semantic import versioning.
      • 167 used replace with k8s.io modules. Kubernetes has used replace to bypass MVS, and dependent modules have been forced to do the same.
      • 111 modules contained replace directives I couldn't automatically classify. The ones I looked at seemed to mostly be forks or pins.

The modules I'm most concerned about are those that use replace as a soft fork while submitting a bug fix to an upstream module; other problems have other solutions that I don't think we need to design for here. Modules using soft fork replacements are about 4% of the modules with go.mod files I sampled (165 / 4519). This is a small enough set that I think we should move forward with the proposal above.
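The shares quoted in this appendix can be recomputed from the raw counts. The short Go program below is only a convenience for checking the arithmetic; the counts are copied from the list above, the names share and replaceUsage are made up for this sketch, and it prints one decimal place where the list reports whole-number percentages.

```go
package main

import "fmt"

// category pairs a label with the number of sampled modules in it.
type category struct {
	label string
	count int
}

// replaceUsage holds the counts reported above for modules with go.mod files.
var replaceUsage = []category{
	{"no replace", 3982},
	{"directory replace only", 71},
	{"module replace only", 439},
	{"both kinds", 27},
}

// share returns count as a percentage of total.
func share(count, total int) float64 {
	return 100 * float64(count) / float64(total)
}

// totalCount sums the counts of all categories.
func totalCount(cats []category) int {
	sum := 0
	for _, c := range cats {
		sum += c.count
	}
	return sum
}

func main() {
	total := totalCount(replaceUsage)
	for _, c := range replaceUsage {
		fmt.Printf("%-24s %5d  %5.1f%%\n", c.label, c.count, share(c.count, total))
	}
	fmt.Printf("%-24s %5d\n", "modules with go.mod", total)
	// Soft forks: 165 of the sampled modules with go.mod files.
	fmt.Printf("%-24s %5d  %5.1f%%\n", "soft fork replace", 165, share(165, total))
}
```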