Generating code
22 Dec 2014
Tags: programming, technical

Rob Pike

* Generating code

A property of universal computation (Turing completeness) is that a computer program can write a computer program.
This is a powerful idea that is not appreciated as often as it might be, even though it happens frequently.
It's a big part of the definition of a compiler, for instance.
It's also how the `go` `test` command works: it scans the packages to be tested,
writes out a Go program containing a test harness customized for the package,
and then compiles and runs it.
Modern computers are so fast this expensive-sounding sequence can complete in a fraction of a second.
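
To make the idea concrete, here is a minimal sketch of a Go program that writes another Go program.
It is not how `go` `test` builds its harness: the template, the `generate` function, and the test names are all hypothetical,
and the sketch assumes only the standard `text/template` and `go/format` packages.

```go
package main

import (
	"bytes"
	"fmt"
	"go/format"
	"text/template"
)

// harnessTmpl is a hypothetical template for a tiny generated program.
// The names are inserted verbatim, so they must not contain quotes.
var harnessTmpl = template.Must(template.New("harness").Parse(`package main

import "fmt"

func main() {
{{range .}}fmt.Println("running {{.}}")
{{end}}}
`))

// generate renders the template for the given names and then formats
// the resulting source with go/format, as gofmt would.
func generate(names []string) ([]byte, error) {
	var buf bytes.Buffer
	if err := harnessTmpl.Execute(&buf, names); err != nil {
		return nil, err
	}
	return format.Source(buf.Bytes())
}

func main() {
	src, err := generate([]string{"TestParse", "TestPrint"})
	if err != nil {
		panic(err)
	}
	// The output is itself a complete Go program.
	fmt.Printf("%s", src)
}
```

Running the formatter over the output also acts as a cheap sanity check, since `format.Source` returns an error if the generated text does not parse.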

There are lots of other examples of programs that write programs.
[[https://godoc.org/golang.org/x/tools/cmd/goyacc][Yacc]], for instance, reads in a description of a grammar and writes out a program to parse that grammar.
The protocol buffer "compiler" reads an interface description and emits structure definitions,
methods, and other support code.
Configuration tools of all sorts work like this too, examining metadata or the environment
and emitting scaffolding customized to the local state.

Programs that write programs are therefore important elements in software engineering,
but programs like Yacc that produce source code need to be integrated into the build
process so their output can be compiled.
When an external build tool like Make is being used, this is usually easy to do.
But in Go, whose go tool gets all necessary build information from the Go source, there is a problem.
There is simply no mechanism to run Yacc from the go tool alone.

Until now, that is.

The [[https://blog.golang.org/go1.4][latest Go release]], 1.4,
includes a new command that makes it easier to run such tools.
It's called `go` `generate`, and it works by scanning for special comments in Go source code
that identify general commands to run.
It's important to understand that `go` `generate` is not part of `go` `build`.
It contains no dependency analysis and must be run explicitly before running `go` `build`.
It is intended to be used by the author of the Go package, not its clients.

The `go` `generate` command is easy to use.
As a warmup, here's how to use it to generate a Yacc grammar.

First, install Go's Yacc tool:

	go get golang.org/x/tools/cmd/goyacc

Say you have a Yacc input file called `gopher.y` that defines a grammar for your new language.
To produce the Go source file implementing the grammar,
you would normally invoke the command like this:

	goyacc -o gopher.go -p parser gopher.y

The `-o` option names the output file while `-p` specifies the package name.

To have `go` `generate` drive the process, in any one of the regular (non-generated) `.go` files
in the same directory, add this comment anywhere in the file:

	//go:generate goyacc -o gopher.go -p parser gopher.y

This text is just the command above prefixed by a special comment recognized by `go` `generate`.
The comment must start at the beginning of the line and have no spaces between the `//` and the `go:generate`.
After that marker, the rest of the line specifies a command for `go` `generate` to run.
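
The two placement rules can be illustrated with a rough sketch of a directive scanner.
This is not the `go` tool's implementation, which also handles quoting, variable expansion, and more;
the `directives` function and the sample source are hypothetical, and the marker is simplified to `//go:generate` followed by a single space.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// marker is the simplified directive prefix assumed by this sketch.
const marker = "//go:generate "

// directives returns the command text of every line in src that starts
// with the marker. Lines with leading whitespace, or with a space
// between the slashes and "go:generate", do not match.
func directives(src string) []string {
	var cmds []string
	sc := bufio.NewScanner(strings.NewReader(src))
	for sc.Scan() {
		line := sc.Text()
		if strings.HasPrefix(line, marker) {
			cmds = append(cmds, strings.TrimSpace(strings.TrimPrefix(line, marker)))
		}
	}
	return cmds
}

func main() {
	src := "package gopher\n" +
		"\n" +
		"//go:generate goyacc -o gopher.go -p parser gopher.y\n" +
		"// go:generate skipped: space after the slashes\n" +
		"\t//go:generate skipped: not at the start of the line\n"
	for _, cmd := range directives(src) {
		fmt.Println(cmd) // prints only the goyacc command
	}
}
```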

Now run it. Change to the source directory and run `go` `generate`, then `go` `build` and so on:

	$ cd $GOPATH/myrepo/gopher
	$ go generate
	$ go build
	$ go test

That's it.
Assuming there are no errors, the `go` `generate` command will invoke `goyacc` to create `gopher.go`,
at which point the directory holds the full set of Go source files, so we can build, test, and work normally.
Every time `gopher.y` is modified, just rerun `go` `generate` to regenerate the parser.

For more details about how `go` `generate` works, including options, environment variables,
and so on, see the [[https://golang.org/s/go1.4-generate][design document]].

Go generate does nothing that couldn't be done with Make or some other build mechanism,
but it comes with the `go` tool, so no extra installation is required, and it fits nicely into the Go ecosystem.
Just keep in mind that it is for package authors, not clients,
if only for the reason that the program it invokes might not be available on the target machine.
Also, if the containing package is intended for import by `go` `get`,
once the file is generated (and tested!) it must be checked into the
source code repository to be available to clients.

Now that we have it, let's use it for something new.
As a very different example of how `go` `generate` can help, there is a new program available in the
`golang.org/x/tools` repository called `stringer`.
It automatically writes string methods for sets of integer constants.
It's not part of the released distribution, but it's easy to install:

	$ go get golang.org/x/tools/cmd/stringer

Here's an example from the documentation for
[[https://godoc.org/golang.org/x/tools/cmd/stringer][`stringer`]].
Imagine we have some code that contains a set of integer constants defining different types of pills:

	package painkiller

	type Pill int

	const (
		Placebo Pill = iota
		Aspirin
		Ibuprofen
		Paracetamol
		Acetaminophen = Paracetamol
	)

For debugging, we'd like these constants to pretty-print themselves, which means we want a method with signature,

	func (p Pill) String() string

It's easy to write one by hand, perhaps like this:

	func (p Pill) String() string {
		switch p {
		case Placebo:
			return "Placebo"
		case Aspirin:
			return "Aspirin"
		case Ibuprofen:
			return "Ibuprofen"
		case Paracetamol: // == Acetaminophen
			return "Paracetamol"
		}
		return fmt.Sprintf("Pill(%d)", p)
	}

There are other ways to write this function, of course.
We could use a slice of strings indexed by Pill, or a map, or some other technique.
Whatever we do, we need to maintain it if we change the set of pills, and we need to make sure it's correct.
(The two names for paracetamol make this trickier than it might otherwise be.)
Plus the very question of which approach to take depends on the types and values:
signed or unsigned, dense or sparse, zero-based or not, and so on.
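
As an illustration of the slice-based alternative, here is one hand-written sketch.
The `pillNames` variable is a hypothetical name, and the code assumes the constants stay dense and zero-based;
it would need rethinking for sparse or negative values, which is exactly the maintenance burden described above.

```go
package main

import "fmt"

type Pill int

const (
	Placebo Pill = iota
	Aspirin
	Ibuprofen
	Paracetamol
	Acetaminophen = Paracetamol
)

// pillNames is indexed by Pill value. Acetaminophen shares a value with
// Paracetamol, so it needs (and can have) no entry of its own.
var pillNames = []string{"Placebo", "Aspirin", "Ibuprofen", "Paracetamol"}

// String returns the name for p, or a numeric fallback when p is out of range.
func (p Pill) String() string {
	if p < 0 || int(p) >= len(pillNames) {
		return fmt.Sprintf("Pill(%d)", p)
	}
	return pillNames[p]
}

func main() {
	fmt.Println(Aspirin, Acetaminophen, Pill(7))
}
```

The slice must be kept in step with the constant block by hand; nothing warns us if a new pill is added to one and not the other.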

The `stringer` program takes care of all these details.
Although it can be run in isolation, it is intended to be driven by `go` `generate`.
To use it, add a generate comment to the source, perhaps near the type definition:

	//go:generate stringer -type=Pill

This rule specifies that `go` `generate` should run the `stringer` tool to generate a `String` method for type `Pill`.
The output is automatically written to `pill_string.go` (a default we could override with the
`-output` flag).

Let's run it:

	$ go generate
	$ cat pill_string.go
	// Code generated by stringer -type Pill pill.go; DO NOT EDIT.

	package painkiller

	import "fmt"

	const _Pill_name = "PlaceboAspirinIbuprofenParacetamol"

	var _Pill_index = [...]uint8{0, 7, 14, 23, 34}

	func (i Pill) String() string {
		if i < 0 || i+1 >= Pill(len(_Pill_index)) {
			return fmt.Sprintf("Pill(%d)", i)
		}
		return _Pill_name[_Pill_index[i]:_Pill_index[i+1]]
	}
	$

Every time we change the definition of `Pill` or the constants, all we need to do is run

	$ go generate

to update the `String` method.
And of course if we've got multiple types set up this way in the same package,
that single command will update all their `String` methods.

There's no question the generated method is ugly.
That's OK, though, because humans don't need to work on it; machine-generated code is often ugly.
It's working hard to be efficient.
All the names are smashed together into a single string,
which saves memory (only one string header for all the names, even if there are zillions of them).
Then an array, `_Pill_index`, maps from value to name by a simple, efficient technique.
Note too that `_Pill_index` is an array (not a slice; one more header eliminated) of `uint8`,
the smallest integer sufficient to span the space of values.
If there were more values, or there were negative ones,
the generated type of `_Pill_index` might change to `uint16` or `int8`: whatever works best.
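
To see where the offsets in `_Pill_index` come from, here is a small sketch that rebuilds them from the list of names.
It is not `stringer`'s own code; `buildIndex` and `lookup` are hypothetical helpers, and the sketch uses a plain `[]int` rather than choosing the smallest integer type.

```go
package main

import (
	"fmt"
	"strings"
)

// buildIndex concatenates names into one string and records the offset
// at which each name starts, plus a final offset marking the end of the
// last name.
func buildIndex(names []string) (string, []int) {
	index := make([]int, 0, len(names)+1)
	var b strings.Builder
	for _, n := range names {
		index = append(index, b.Len())
		b.WriteString(n)
	}
	index = append(index, b.Len())
	return b.String(), index
}

// lookup slices the i'th name out of the concatenated string using two
// adjacent offsets. The caller must ensure 0 <= i < len(index)-1.
func lookup(all string, index []int, i int) string {
	return all[index[i]:index[i+1]]
}

func main() {
	all, index := buildIndex([]string{"Placebo", "Aspirin", "Ibuprofen", "Paracetamol"})
	fmt.Println(all)                   // PlaceboAspirinIbuprofenParacetamol
	fmt.Println(index)                 // [0 7 14 23 34]
	fmt.Println(lookup(all, index, 2)) // Ibuprofen
}
```

The extra trailing offset is what lets the lookup use `index[i+1]` for the last name without a special case, and it is why the generated bounds check compares `i+1` against the length of the index.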

The approach used by the methods printed by `stringer` varies according to the properties of the constant set.
For instance, if the constants are sparse, it might use a map.
Here's a trivial example based on a constant set representing powers of two:

	const _Power_name = "p0p1p2p3p4p5..."

	var _Power_map = map[Power]string{
		1:  _Power_name[0:2],
		2:  _Power_name[2:4],
		4:  _Power_name[4:6],
		8:  _Power_name[6:8],
		16: _Power_name[8:10],
		32: _Power_name[10:12],
		...,
	}

	func (i Power) String() string {
		if str, ok := _Power_map[i]; ok {
			return str
		}
		return fmt.Sprintf("Power(%d)", i)
	}
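
For readers who want to run the map-based approach, here is a self-contained sketch.
It is hand-written rather than `stringer` output, and it assumes a hypothetical `Power` type with only four constants, `p0` through `p3`;
it makes no claim about the entries elided above.

```go
package main

import "fmt"

type Power int

// A deliberately small, hypothetical constant set: 1, 2, 4, 8.
const (
	p0 Power = 1 << iota
	p1
	p2
	p3
)

// All names share one backing string, as in the fragment above.
const _Power_name = "p0p1p2p3"

// The values are sparse, so a map replaces the index array.
var _Power_map = map[Power]string{
	1: _Power_name[0:2],
	2: _Power_name[2:4],
	4: _Power_name[4:6],
	8: _Power_name[6:8],
}

// String returns the constant's name, or a numeric fallback for values
// that are not in the set.
func (i Power) String() string {
	if str, ok := _Power_map[i]; ok {
		return str
	}
	return fmt.Sprintf("Power(%d)", i)
}

func main() {
	fmt.Println(p2, Power(3)) // p2 Power(3)
}
```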

In short, generating the method automatically allows us to do a better job than we would expect a human to do.

There are lots of other uses of `go` `generate` already installed in the Go tree.
Examples include generating Unicode tables in the `unicode` package,
creating efficient methods for encoding and decoding arrays in `encoding/gob`,
producing time zone data in the `time` package, and so on.

Please use `go` `generate` creatively.
It's there to encourage experimentation.

And even if you don't, use the new `stringer` tool to write your `String` methods for your integer constants.
Let the machine do the work.