# Go generate: A Proposal
Author: Rob Pike
Accepted in the Go 1.4 release.
## Introduction
The go build command automates the construction of Go programs but
sometimes preliminary processing is required, processing that go build
does not support.
Motivating examples include:
- yacc: generating .go files from yacc grammar (.y) files
- protobufs: generating .pb.go files from protocol buffer definition (.proto) files
- Unicode: generating tables from UnicodeData.txt
- HTML: embedding .html files into Go source code
- bindata: translating binary files such as JPEGs into byte arrays in Go source
There are other processing steps one can imagine:
- string methods: generating String() string methods for types used as enumerated constants
- macros: generating customized implementations given generalized packages, such as sort.Ints from ints
This proposal offers a design for smooth automation of such processing.
## Non-goal
It is not a goal of this proposal to build a generalized build system
like the Unix make(1) utility.
We deliberately avoid doing any dependency analysis.
The tool does what is asked of it, nothing more.
It is hoped, however, that it may replace many existing uses of
make(1) in the Go repo at least.
## Design
There are two basic elements, a new subcommand for the go command,
called go generate, and directives inside Go source files that control
generation.
When go generate runs, it scans Go source files looking for those
directives, and for each one executes a generator that typically
creates a new Go source file.
The go generate tool also sets the build tag "generate" so that files
may be examined by go generate but ignored during build.
The usage is:
```
go generate [-run regexp] [file.go...|packagePath...]
```
(Plus the usual `-x`, `-n`, `-v` and `-tags` options.)
If packages are named, each Go source file in each package is scanned
for generator directives, and for each directive, the specified
generator is run; if files are named, they must be Go source files and
generation happens only for directives in those files.
Given no arguments, generator processing is applied to the Go source
files in the current directory.
The `-run` flag takes a regular expression, analogous to that of the
go test subcommand, that restricts generation to those directives
whose command (see below) matches the regular expression.
Generator directives may appear anywhere in the Go source file and are
processed sequentially (no parallelism) in source order as presented
to the tool.
Each directive is a // comment beginning a line, with syntax
```
//go:generate command arg...
```
where command is the generator (such as `yacc`) to be run,
corresponding to an executable file that can be run locally; it must
either be in the shell path (`gofmt`) or fully qualified
(`/usr/you/bin/mytool`) and is run in the package directory.
The arguments are space-separated tokens (or double-quoted strings)
passed to the generator as individual arguments when it is run.
Shell-like variable expansion is available for any environment
variables such as `$HOME`.
Also, the special variable `$GOFILE` refers to the name of the file
containing the directive.
(We may need other special variables such as `$GOPACKAGE`.
When the generator is run, these are also provided in the shell
environment.)
No other special processing, such as globbing, is provided.
No further generators are run if any generator returns an error exit
status.
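The directive rules above can be sketched in a few lines of Go.
This is only an illustration, not the tool's implementation: the helper name `splitDirective` is hypothetical, quoting is handled in the simplest possible way (no escapes inside quotes), and variable expansion is delegated to the standard `os.Expand`.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// prefix marks a generator directive; it must begin the line.
const prefix = "//go:generate "

// splitDirective (a hypothetical helper) returns the words of a directive
// line, treating a double-quoted string as a single word, or nil if the
// line is not a directive. goFile is the name substituted for $GOFILE.
func splitDirective(line, goFile string) []string {
	if !strings.HasPrefix(line, prefix) {
		return nil
	}
	var words []string
	var cur strings.Builder
	inQuote, started := false, false
	for _, r := range line[len(prefix):] {
		switch {
		case r == '"':
			inQuote = !inQuote
			started = true // so that "" still yields an (empty) word
		case (r == ' ' || r == '\t') && !inQuote:
			if started {
				words = append(words, cur.String())
				cur.Reset()
				started = false
			}
		default:
			cur.WriteRune(r)
			started = true
		}
	}
	if started { // an unterminated quote is simply closed at end of line
		words = append(words, cur.String())
	}
	// Expand $GOFILE specially; everything else comes from the environment.
	for i, w := range words {
		words[i] = os.Expand(w, func(name string) string {
			if name == "GOFILE" {
				return goFile
			}
			return os.Getenv(name)
		})
	}
	return words
}

func main() {
	line := `//go:generate strmeth Day -o "day string.go" $GOFILE`
	fmt.Printf("%q\n", splitDirective(line, "main.go"))
}
```

Note that the sketch does no globbing, in keeping with the rule above: an argument such as `*.y` would be passed to the generator unchanged.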
As an example, say we have a package `my/own/gopher` that includes a
yacc grammar in file `gopher.y`.
Inside `main.go` (not `gopher.y`) we place the directive
```
//go:generate yacc -o gopher.go gopher.y
```
(More about what `yacc` means in the next section.)
Whenever we need to update the generated file, we give the shell
command,
```
% go generate my/own/gopher
```
or, if we are already in the source directory,
```
% go generate
```
If we want to make sure that only the yacc generator is run, we
execute
```
% go generate -run yacc
```
If we have fixed a bug in yacc and want to update all yacc-generated
files in our tree, we can run
```
% go generate -run yacc all
```
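To make the effect of `-run` concrete, here is a minimal sketch of the selection step, assuming the directives have already been parsed.
The `directive` type and the `selectDirectives` function are hypothetical names introduced for illustration; the sketch matches the regular expression against the command word only, as the text above describes.

```go
package main

import (
	"fmt"
	"regexp"
)

// directive is a hypothetical, already-parsed //go:generate line.
type directive struct {
	command string
	args    []string
}

// selectDirectives returns the directives whose command matches pattern,
// preserving source order. An empty pattern selects every directive.
func selectDirectives(ds []directive, pattern string) ([]directive, error) {
	if pattern == "" {
		return ds, nil
	}
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, err
	}
	var out []directive
	for _, d := range ds {
		if re.MatchString(d.command) {
			out = append(out, d)
		}
	}
	return out, nil
}

func main() {
	ds := []directive{
		{"yacc", []string{"-o", "gopher.go", "gopher.y"}},
		{"protoc", []string{"-go_out=.", "file.proto"}},
	}
	picked, err := selectDirectives(ds, "yacc")
	if err != nil {
		panic(err)
	}
	for _, d := range picked {
		fmt.Println(d.command, d.args)
	}
}
```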
The typical cycle for a package author developing software that uses
`go generate` is
```
% edit …
% go generate
% go test
```
and once things are settled, the author commits the generated files to
the source repository, so that they are available to clients that use
go get:
```
% git add *.go
% git commit
```
## Commands
The yacc program is of course not the standard version, but is
accessed from the command line by
```
go tool yacc args...
```
To make it easy to use tools like yacc that are not installed in
$PATH, have complex access methods, or benefit from extra flags or
other wrapping, there is a special directive that defines a shorthand
for a command.
It is a `go:generate` directive followed by the keyword/flag
`-command` and the name of the generator it defines; the rest of the line is
substituted for the command name when the generator is run.
Thus, to define `yacc` as shorthand for the generator we normally run as
`go tool yacc`, we first write the directive
```
//go:generate -command yacc go tool yacc
```
and then all other generator directives using `yacc` that follow in
that file (only) can be written as above:
```
//go:generate yacc -o gopher.go gopher.y
```
which will be translated to
```
go tool yacc -o gopher.go gopher.y
```
when run.
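The substitution can be sketched as follows.
This is an illustration under simplifying assumptions, not the tool's code: the function name `expand` is hypothetical, each input string is the text that follows `//go:generate ` in a single file, and words are split on white space only (no quoting).

```go
package main

import (
	"fmt"
	"strings"
)

// expand processes the directives of one file in source order. A
// "-command name words..." line records a shorthand; any later directive
// whose command is that name has the recorded words substituted for it.
func expand(lines []string) [][]string {
	shorthand := map[string][]string{}
	var cmds [][]string
	for _, line := range lines {
		words := strings.Fields(line)
		if len(words) == 0 {
			continue
		}
		if words[0] == "-command" {
			if len(words) >= 2 {
				shorthand[words[1]] = words[2:]
			}
			continue // a definition runs nothing by itself
		}
		if repl, ok := shorthand[words[0]]; ok {
			words = append(append([]string{}, repl...), words[1:]...)
		}
		cmds = append(cmds, words)
	}
	return cmds
}

func main() {
	cmds := expand([]string{
		"-command yacc go tool yacc",
		"yacc -o gopher.go gopher.y",
	})
	for _, c := range cmds {
		fmt.Println(strings.Join(c, " "))
	}
}
```

Because the map is created afresh on each call, a shorthand defined in one file is not visible in any other, matching the "that file (only)" rule above.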
## Discussion
This design is unusual but is driven by several motivating principles.
First, `go generate` is intended[^1] to be run by the author of a
package, not the client of it.
The author of the package generates the required Go files and includes
them in the package; the client does a regular `go get` or `go
build`.
Generation through `go generate` is not part of the build, just a tool
for package authors.
This avoids complicating the dependency analysis done by `go build`.
[^1]: One can imagine scenarios where the author wishes the client to
run the generator, but in such cases the author must guarantee that
the client has the generator available.
Regardless, `go get` will not automate the running of the processor,
so further installation instructions will need to be provided by the
author.
Second, `go build` should never cause generation to happen
automatically by the client of the package. Generators should run only
when explicitly requested.
Third, the author of the package should have great freedom in what
generator to use (that is a key goal of the proposal), but the client
might not have that processor available.
As a simple example, if it is a shell script, it will not run on
Windows.
It is important that automated generation not break clients but be
invisible to them, which is another reason it should be run only by
the author of the package.
Finally, it must fit well with the existing go command, which means it
applies only to Go source files and packages.
This is why the directives are in Go files but not, for example, in
the .y file holding a yacc grammar.
## Examples
Here are some hypothetical worked examples.
There are countless more possibilities.
### String methods
We wish to generate a String method for a named constant type.
We write a tool, say `strmeth`, that reads a definition for a single
constant type and values and prints a complete Go source file
containing a method definition for that type.
In our Go source file, `main.go`, we decorate each constant
declaration like this (with some blank lines interposed so the
generator directive does not appear in the doc comment):
```Go
//go:generate strmeth Day -o day_string.go $GOFILE

// Day represents the day of the week
type Day int

const (
	Sunday Day = iota
	Monday
	...
)
```
The `strmeth` generator parses the Go source to find the definition of
the `Day` type and its constants, and writes out a `String() string`
method for that type.
For the user, generation of the string method is trivial: just run `go
generate`.
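To give a feel for the result, here is a sketch of what a tool like `strmeth` might write to `day_string.go`.
The layout (a `dayNames` table plus a range check) is an assumption, since `strmeth` is hypothetical; the type and constants are repeated so the example compiles by itself, and only the two constants shown above are included.

```go
package main

import "fmt"

// Day and its constants stand in for the declarations in main.go.
type Day int

const (
	Sunday Day = iota
	Monday
)

// Everything below is what the generator might emit.

// dayNames holds the constant names, indexed by value.
var dayNames = [...]string{"Sunday", "Monday"}

// String returns the name of the constant, or a numeric form such as
// "Day(9)" for values outside the known range.
func (d Day) String() string {
	if d < 0 || int(d) >= len(dayNames) {
		// Convert to int first so that Sprintf does not call String again.
		return fmt.Sprintf("Day(%d)", int(d))
	}
	return dayNames[d]
}

func main() {
	fmt.Println(Sunday, Monday, Day(9))
}
```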
### Yacc
As outlined above, we define a custom command
```
//go:generate -command yacc go tool yacc
```
and then anywhere in main.go (say) we write
```
//go:generate yacc -o foo.go foo.y
```
### Protocol buffers
The process is the same as with yacc.
Inside `main.go`, we write, for each protocol buffer file we have, a
line like
```
//go:generate protoc -go_out=. file.proto
```
Because of the way protoc works, we could generate multiple proto
definitions into a single `.pb.go` file like this:
```
//go:generate protoc -go_out=. file1.proto file2.proto
```
Since no globbing is provided, one cannot say `*.proto`, but this is
intentional, for simplicity and clarity of dependency.
Caveat: The protoc program must be run at the root of the source tree;
we would need to provide a `-cd` option to it or wrap it somehow.
### Binary data
A tool that converts binary files into byte arrays that can be
compiled into Go binaries would work similarly.
Again, in the Go source we write something like
```
//go:generate bindata -o jpegs.go pic1.jpg pic2.jpg pic3.jpg
```
This also demonstrates another reason the annotations are in Go
source: there is no easy way to inject them into binary files.
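The core of such a tool is small.
The sketch below is an assumption about how a `bindata`-like generator could work, not a description of any existing program: the function name `embed` is hypothetical, the data is emitted as a byte slice literal, and the standard `go/format` package tidies the result.

```go
package main

import (
	"bytes"
	"fmt"
	"go/format"
)

// embed returns formatted Go source for package pkg that declares a byte
// slice named varName holding data. Both names are chosen by the caller.
func embed(pkg, varName string, data []byte) ([]byte, error) {
	var buf bytes.Buffer
	fmt.Fprintf(&buf, "package %s\n\nvar %s = []byte{", pkg, varName)
	for i, b := range data {
		if i%12 == 0 { // start a new line every 12 bytes
			buf.WriteString("\n")
		}
		fmt.Fprintf(&buf, "0x%02x, ", b)
	}
	buf.WriteString("\n}\n")
	// format.Source also reports an error if the text is not valid Go.
	return format.Source(buf.Bytes())
}

func main() {
	// A real tool would read pic1.jpg and friends from disk and write
	// jpegs.go; three literal bytes keep the sketch self-contained.
	src, err := embed("main", "pic1", []byte{0xff, 0xd8, 0xff})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s", src)
}
```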
### Sort
One could imagine a variant sort implementation that allows one to
specify concrete types that have custom sorters, just by automatic
rewriting of a macro-like sort definition.
To do this, we write a `sort.go` file that contains a complete
implementation of sort on an explicit but undefined type spelled, say,
`TYPE`.
In that file we provide a build tag so it is never compiled (`TYPE` is
not defined, so it won't compile) but is processed by `go generate`:
```
// +build generate
```
Then we write a generator directive for each type for which we want a
custom sort:
```
//go:generate rename TYPE=int
//go:generate rename TYPE=strings
```
or perhaps
```
//go:generate rename TYPE=int TYPE=strings
```
The rename processor would be a simple wrapping of `gofmt -r`, perhaps
written as a shell script.
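As a rough sketch of that wrapping, written in Go rather than as a shell script, the function below turns an argument such as `TYPE=int` into the arguments for `gofmt -r`.
The name `gofmtArgs` is hypothetical, and the sketch only prints the command it would run; where the rewritten source should be written is left open, as it is in the text above.

```go
package main

import (
	"fmt"
	"strings"
)

// gofmtArgs converts an assignment such as "TYPE=int" into the argument
// list for gofmt, using its -r flag with the rule "TYPE -> int".
func gofmtArgs(assign, file string) ([]string, error) {
	name, repl, ok := strings.Cut(assign, "=")
	if !ok || name == "" || repl == "" {
		return nil, fmt.Errorf("rename: bad argument %q, want NAME=replacement", assign)
	}
	return []string{"-r", name + " -> " + repl, file}, nil
}

func main() {
	// A real wrapper would pass these arguments to gofmt via os/exec.
	for _, assign := range []string{"TYPE=int", "TYPE=strings"} {
		args, err := gofmtArgs(assign, "sort.go")
		if err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Printf("gofmt %q\n", args)
	}
}
```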
There are many more possibilities, and it is a goal of this proposal
to encourage experimentation with pre-build-time code generation.