# Go generate: A Proposal

Author: Rob Pike

Accepted in the Go 1.4 release.

## Introduction

The go build command automates the construction of Go programs but
sometimes preliminary processing is required, processing that go build
does not support.
Motivating examples include:

- yacc: generating .go files from yacc grammar (.y) files
- protobufs: generating .pb.go files from protocol buffer definition (.proto) files
- Unicode: generating tables from UnicodeData.txt
- HTML: embedding .html files into Go source code
- bindata: translating binary files such as JPEGs into byte arrays in Go source

There are other processing steps one can imagine:

- string methods: generating String() string methods for types used as enumerated constants
- macros: generating customized implementations given generalized packages, such as sort.Ints from ints

This proposal offers a design for smooth automation of such processing.

## Non-goal

It is not a goal of this proposal to build a generalized build system
like the Unix make(1) utility.
We deliberately avoid doing any dependency analysis.
The tool does what is asked of it, nothing more.

It is hoped, however, that it may replace many existing uses of
make(1) in the Go repo at least.

## Design

There are two basic elements, a new subcommand for the go command,
called go generate, and directives inside Go source files that control
generation.

When go generate runs, it scans Go source files looking for those
directives, and for each one executes a generator that typically
creates a new Go source file.
The go generate tool also sets the build tag "generate" so that files
may be examined by go generate but ignored during build.

The usage is:

```
go generate [-run regexp] [file.go...|packagePath...]
```

(Plus the usual `-x`, `-n`, `-v` and `-tags` options.)
If packages are named, each Go source file in each package is scanned
for generator directives, and for each directive, the specified
generator is run; if files are named, they must be Go source files and
generation happens only for directives in those files.
Given no arguments, generator processing is applied to the Go source
files in the current directory.

The `-run` flag takes a regular expression, analogous to that of the
go test subcommand, that restricts generation to those directives
whose command (see below) matches the regular expression.

Generator directives may appear anywhere in the Go source file and are
processed sequentially (no parallelism) in source order as presented
to the tool.
Each directive is a // comment beginning a line, with syntax

```
//go:generate command arg...
```

where command is the generator (such as `yacc`) to be run,
corresponding to an executable file that can be run locally; it must
either be in the shell path (`gofmt`) or fully qualified
(`/usr/you/bin/mytool`) and is run in the package directory.

The arguments are space-separated tokens (or double-quoted strings)
passed to the generator as individual arguments when it is run.
Shell-like variable expansion is available for any environment
variables such as `$HOME`.
Also, the special variable `$GOFILE` refers to the name of the file
containing the directive.
(We may need other special variables such as `$GOPACKAGE`.
When the generator is run, these are also provided in the shell
environment.)
No other special processing, such as globbing, is provided.
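As a rough illustration of the argument rules above, the following Go sketch splits the text of a directive into tokens, keeping a double-quoted string together as one argument, and then expands variables with `os.Expand`. The helper names `splitArgs` and `expand` are hypothetical, and expanding after splitting (rather than before) is an assumption; the proposal does not fix the order. Escape sequences inside quotes are not handled.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// splitArgs splits s on spaces and tabs, treating a double-quoted run as a
// single token. The quotes themselves are dropped. (Hypothetical helper.)
func splitArgs(s string) []string {
	var args []string
	var cur strings.Builder
	inQuote, have := false, false
	for _, r := range s {
		switch {
		case r == '"':
			inQuote = !inQuote
			have = true // "" still produces an (empty) argument
		case (r == ' ' || r == '\t') && !inQuote:
			if have {
				args = append(args, cur.String())
				cur.Reset()
				have = false
			}
		default:
			cur.WriteRune(r)
			have = true
		}
	}
	if have {
		args = append(args, cur.String())
	}
	return args
}

// expand replaces $NAME and ${NAME} in arg. GOFILE is the special variable
// from the proposal; every other name is looked up in the environment.
func expand(arg, gofile string) string {
	return os.Expand(arg, func(name string) string {
		if name == "GOFILE" {
			return gofile
		}
		return os.Getenv(name)
	})
}

func main() {
	line := `strmeth Day -o day_string.go $GOFILE`
	for _, a := range splitArgs(line) {
		fmt.Printf("%q\n", expand(a, "main.go"))
	}
}
```

Because expansion happens per token in this sketch, a variable whose value contains spaces still reaches the generator as a single argument.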

No further generators are run if any generator returns an error exit
status.

As an example, say we have a package `my/own/gopher` that includes a
yacc grammar in file `gopher.y`.
Inside `main.go` (not `gopher.y`) we place the directive

```
//go:generate yacc -o gopher.go gopher.y
```

(More about what `yacc` means in the next section.)
Whenever we need to update the generated file, we give the shell
command,

```
% go generate my/own/gopher
```

or, if we are already in the source directory,

```
% go generate
```

If we want to make sure that only the yacc generator is run, we
execute

```
% go generate -run yacc
```

If we have fixed a bug in yacc and want to update all yacc-generated
files in our tree, we can run

```
% go generate -run yacc all
```
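The selection performed by `-run` in the last two commands might look like the following sketch. It takes the proposal's wording literally and matches the regular expression against the command word only; whether a real implementation matches the command or the whole directive text is an assumption here, and `selectDirectives` is a hypothetical name. As with go test, the match is unanchored.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

const prefix = "//go:generate "

// selectDirectives returns the directive lines whose command word matches
// the run pattern. An empty pattern selects every directive.
func selectDirectives(lines []string, run string) ([]string, error) {
	re, err := regexp.Compile(run)
	if err != nil {
		return nil, err
	}
	var out []string
	for _, line := range lines {
		if !strings.HasPrefix(line, prefix) {
			continue // not a generator directive
		}
		fields := strings.Fields(strings.TrimPrefix(line, prefix))
		if len(fields) == 0 {
			continue // directive with no command
		}
		if re.MatchString(fields[0]) {
			out = append(out, line)
		}
	}
	return out, nil
}

func main() {
	src := []string{
		"//go:generate yacc -o gopher.go gopher.y",
		"//go:generate strmeth Day -o day_string.go $GOFILE",
		"// an ordinary comment",
	}
	picked, err := selectDirectives(src, "yacc")
	if err != nil {
		panic(err)
	}
	for _, d := range picked {
		fmt.Println(d)
	}
}
```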

The typical cycle for a package author developing software that uses
`go generate` is

```
% edit …
% go generate
% go test
```

and once things are settled, the author commits the generated files to
the source repository, so that they are available to clients that use
go get:

```
% git add *.go
% git commit
```

## Commands

The yacc program is of course not the standard version, but is
accessed from the command line by

```
go tool yacc args...
```

To make it easy to use tools like yacc that are not installed in
$PATH, have complex access methods, or benefit from extra flags or
other wrapping, there is a special directive that defines a shorthand
for a command.
It is a `go:generate` directive followed by the flag `-command` and
the name of the generator it defines; the rest of the line is
substituted for the command name when the generator is run.
Thus, to define `yacc` as a generator command that we normally access
by running `go tool yacc`, we first write the directive

```
//go:generate -command yacc go tool yacc
```

and then all other generator directives using `yacc` that follow in
that file (only) can be written as above:

```
//go:generate yacc -o gopher.go gopher.y
```

which will be translated to

```
go tool yacc -o gopher.go gopher.y
```

when run.

## Discussion

This design is unusual but is driven by several motivating principles.

First, `go generate` is intended[^1] to be run by the author of a
package, not the client of it.
The author of the package generates the required Go files and includes
them in the package; the client does a regular `go get` or
`go build`.
Generation through `go generate` is not part of the build, just a tool
for package authors.
This avoids complicating the dependency analysis done by `go build`.

[^1]: One can imagine scenarios where the author wishes the client to
run the generator, but in such cases the author must guarantee that
the client has the generator available.
Regardless, `go get` will not automate the running of the processor,
so further installation instructions will need to be provided by the
author.

Second, `go build` should never cause generation to happen
automatically by the client of the package. Generators should run only
when explicitly requested.

Third, the author of the package should have great freedom in what
generator to use (that is a key goal of the proposal), but the client
might not have that processor available.
As a simple example, if it is a shell script, it will not run on
Windows.
It is important that automated generation not break clients but be
invisible to them, which is another reason it should be run only by
the author of the package.

Finally, it must fit well with the existing go command, which means it
applies only to Go source files and packages.
This is why the directives are in Go files but not, for example, in
the .y file holding a yacc grammar.

## Examples

Here are some hypothetical worked examples.
There are countless more possibilities.

### String methods

We wish to generate a String method for a named constant type.
We write a tool, say `strmeth`, that reads a definition for a single
constant type and values and prints a complete Go source file
containing a method definition for that type.

In our Go source file, `main.go`, we decorate each constant
declaration like this (with some blank lines interposed so the
generator directive does not appear in the doc comment):

```Go
//go:generate strmeth Day -o day_string.go $GOFILE

// Day represents the day of the week
type Day int
const (
	Sunday Day = iota
	Monday
	...
)
```

The `strmeth` generator parses the Go source to find the definition of
the `Day` type and its constants, and writes out a `String() string`
method for that type.
For the user, generation of the string method is trivial: just run
`go generate`.
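To make the result concrete, here is a sketch of the kind of code such a generator might write to `day_string.go`. The proposal does not specify the output of `strmeth`, so the table name `dayNames` and the numeric fallback for out-of-range values are assumptions. Only the two constants spelled out above are included, and the type declaration is repeated so that the sketch compiles on its own.

```go
package main

import "fmt"

// Day and its constants mirror the declaration in main.go; in a real
// package they would not be repeated in the generated file.
type Day int

const (
	Sunday Day = iota
	Monday
)

// Everything below is what a generator might emit (assumed shape).

// dayNames holds the constant names in declaration order.
var dayNames = [...]string{"Sunday", "Monday"}

// String returns the name of the constant, or a numeric form such as
// Day(9) for a value that has no name.
func (d Day) String() string {
	if d < 0 || int(d) >= len(dayNames) {
		return fmt.Sprintf("Day(%d)", int(d))
	}
	return dayNames[d]
}

func main() {
	fmt.Println(Sunday, Monday, Day(9)) // Sunday Monday Day(9)
}
```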

### Yacc

As outlined above, we define a custom command

```
//go:generate -command yacc go tool yacc
```

and then anywhere in main.go (say) we write

```
//go:generate yacc -o foo.go foo.y
```

### Protocol buffers

The process is the same as with yacc.
Inside `main.go`, we write, for each protocol buffer file we have, a
line like

```
//go:generate protoc -go_out=. file.proto
```

Because of the way protoc works, we could generate multiple proto
definitions into a single `.pb.go` file like this:

```
//go:generate protoc -go_out=. file1.proto file2.proto
```

Since no globbing is provided, one cannot say `*.proto`, but this is
intentional, for simplicity and clarity of dependency.

Caveat: The protoc program must be run at the root of the source tree;
we would need to provide a `-cd` option to it or wrap it somehow.

### Binary data

A tool that converts binary files into byte arrays that can be
compiled into Go binaries would work similarly.
Again, in the Go source we write something like

```
//go:generate bindata -o jpegs.go pic1.jpg pic2.jpg pic3.jpg
```

This also demonstrates another reason the annotations are in Go
source: there is no easy way to inject them into binary files.

### Sort

One could imagine a variant sort implementation that allows one to
specify concrete types that have custom sorters, just by automatic
rewriting of a macro-like sort definition.
To do this, we write a `sort.go` file that contains a complete
implementation of sort on an explicit but undefined type spelled, say,
`TYPE`.
In that file we provide a build tag so it is never compiled (`TYPE` is
not defined, so it won't compile) but is processed by `go generate`:

```
// +build generate
```

Then we write a generator directive for each type for which we want a
custom sort:

```
//go:generate rename TYPE=int
//go:generate rename TYPE=strings
```

or perhaps

```
//go:generate rename TYPE=int TYPE=strings
```

The rename processor would be a simple wrapping of `gofmt -r`, perhaps
written as a shell script.

There are many more possibilities, and it is a goal of this proposal
to encourage experimentation with pre-build-time code generation.