# Go's Declaration Syntax
7 Jul 2010
Tags: c, syntax, ethos
Summary: Why Go's declaration syntax doesn't look like, and is much simpler than, C's.
OldURL: /gos-declaration-syntax

Rob Pike

## Introduction

Newcomers to Go wonder why the declaration syntax is different from the
tradition established in the C family.
In this post we'll compare the two approaches and explain why Go's declarations look as they do.

## C syntax

First, let's talk about C syntax. C took an unusual and clever approach
to declaration syntax.
Instead of describing the types with special syntax,
one writes an expression involving the item being declared,
and states what type that expression will have. Thus

    int x;

declares x to be an int: the expression 'x' will have type int.
In general, to figure out how to write the type of a new variable,
write an expression involving that variable that evaluates to a basic type,
then put the basic type on the left and the expression on the right.

Thus, the declarations

    int *p;
    int a[3];

state that p is a pointer to int because '\*p' has type int,
and that a is an array of ints because a[3] (ignoring the particular index value,
which is punned to be the size of the array) has type int.

What about functions? Originally, C's function declarations wrote the types
of the arguments outside the parens, like this:

    int main(argc, argv)
        int argc;
        char *argv[];
    { /* ... */ }

Again, we see that main is a function because the expression main(argc,
argv) returns an int.
In modern notation we'd write

    int main(int argc, char *argv[]) { /* ... */ }

but the basic structure is the same.

This is a clever syntactic idea that works well for simple types but can get confusing fast.
The famous example is declaring a function pointer.
Follow the rules and you get this:

    int (*fp)(int a, int b);

Here, fp is a pointer to a function because if you write the expression (\*fp)(a,
b) you'll call a function that returns int.
What if one of fp's arguments is itself a function?

    int (*fp)(int (*ff)(int x, int y), int b)

That's starting to get hard to read.

Of course, we can leave out the name of the parameters when we declare a function, so main can be declared

    int main(int, char *[])

Recall that argv is declared like this,

    char *argv[]

so you drop the name from the middle of its declaration to construct its type.
It's not obvious, though, that you declare something of type char \*[] by
putting its name in the middle.

And look what happens to fp's declaration if you don't name the parameters:

    int (*fp)(int (*)(int, int), int)

Not only is it not obvious where to put the name inside

    int (*)(int, int)

it's not exactly clear that it's a function pointer declaration at all.
And what if the return type is a function pointer?

    int (*(*fp)(int (*)(int, int), int))(int, int)

It's hard even to see that this declaration is about fp.

You can construct more elaborate examples but these should illustrate some
of the difficulties that C's declaration syntax can introduce.

There's one more point that needs to be made, though.
Because type and declaration syntax are the same,
it can be difficult to parse expressions with types in the middle.
This is why, for instance, C casts always parenthesize the type, as in

    (int)M_PI

## Go syntax

Languages outside the C family usually use a distinct type syntax in declarations.
Although it's a separate point, the name usually comes first,
often followed by a colon.
Thus our examples above become something like (in a fictional but illustrative language)

    x: int
    p: pointer to int
    a: array[3] of int

These declarations are clear, if verbose - you just read them left to right.
Go takes its cue from here, but in the interests of brevity it drops the
colon and removes some of the keywords:

    x int
    p *int
    a [3]int

There is no direct correspondence between the look of [3]int and how to
use a in an expression.
(We'll come back to pointers in the next section.) You gain clarity at the
cost of a separate syntax.
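As a minimal sketch, here are the three declarations written as complete `var` statements and then used in expressions. The helper name `declare` and the values stored are assumptions made up for this illustration.

```go
package main

import "fmt"

// declare is a hypothetical helper: it declares one variable of each
// kind shown above, uses each in an expression, and returns the results.
func declare() (int, int, [3]int) {
	var x int    // x is an int
	var p *int   // p is a pointer to int
	var a [3]int // a is an array of 3 ints

	x = 7
	p = &x    // p now points at x
	a[1] = *p // index on the right of a, star on the left of p
	return x, *p, a
}

func main() {
	x, deref, a := declare()
	fmt.Printf("%T %T %T\n", x, &x, a) // int *int [3]int
	fmt.Println(x, deref, a)           // 7 7 [0 7 0]
}
```

The `%T` verb prints each type in the same left-to-right form used to declare it.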

Now consider functions. Let's transcribe the declaration for main as it would read in Go,
although the real main function in Go takes no arguments:

    func main(argc int, argv []string) int

Superficially that's not much different from C,
other than the change from `char` arrays to strings,
but it reads well from left to right:

function main takes an int and a slice of strings and returns an int.

Drop the parameter names and it's just as clear - they're always first so there's no confusion.

    func main(int, []string) int

One merit of this left-to-right style is how well it works as the types
become more complex.
Here's a declaration of a function variable (analogous to a function pointer in C):

    f func(func(int,int) int, int) int

Or if f returns a function:

    f func(func(int,int) int, int) func(int, int) int

It still reads clearly, from left to right,
and it's always obvious which name is being declared - the name comes first.
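To see those two function types in use, here is a small sketch. The functions `apply` and `choose` and the variables `f`, `g` and `add` are names invented for the illustration; only the types come from the declarations above.

```go
package main

import "fmt"

// apply has type func(func(int, int) int, int) int:
// it takes a two-argument function and an int, and returns an int.
func apply(op func(int, int) int, n int) int {
	return op(n, n)
}

// choose has type func(func(int, int) int, int) func(int, int) int:
// the same parameters, but the result is itself a function.
func choose(op func(int, int) int, n int) func(int, int) int {
	return func(a, b int) int { return op(a, b) + n }
}

func main() {
	add := func(a, b int) int { return a + b }

	// In each declaration the name comes first and the type follows.
	var f func(func(int, int) int, int) int = apply
	var g func(func(int, int) int, int) func(int, int) int = choose

	fmt.Println(f(add, 3))        // add(3, 3) = 6
	fmt.Println(g(add, 10)(1, 2)) // add(1, 2) + 10 = 13
}
```

Even in the declaration of `g`, the parameter list and the result type can be picked out by reading once from left to right.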

The distinction between type and expression syntax makes it easy to write and invoke closures in Go:

    sum := func(a, b int) int { return a+b } (3, 4)
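The one-liner above can be wrapped into a runnable sketch. The names `sumNow` and `counter` are hypothetical. `counter` is added only to show a function literal capturing a variable, which is what makes it a closure.

```go
package main

import "fmt"

// sumNow writes a function literal and calls it in the same expression.
func sumNow(x, y int) int {
	return func(a, b int) int { return a + b }(x, y)
}

// counter returns a closure that captures n and increments it on each call.
func counter() func() int {
	n := 0
	return func() int {
		n++
		return n
	}
}

func main() {
	fmt.Println(sumNow(3, 4)) // 7

	next := counter()
	next()
	fmt.Println(next()) // 2
}
```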

## Pointers

Pointers are the exception that proves the rule.
Notice that in arrays and slices, for instance,
Go's type syntax puts the brackets on the left of the type but the expression
syntax puts them on the right of the expression:

    var a []int
    x = a[1]

For familiarity, Go's pointers use the \* notation from C,
but we could not bring ourselves to make a similar reversal for pointer types.
Thus pointers work like this

    var p *int
    x = *p

We couldn't say

    var p *int
    x = p*

because that postfix \* would be confused with multiplication. We could have used the Pascal ^, for example:

    var p ^int
    x = p^

and perhaps we should have (and chosen another operator for xor),
because the prefix asterisk on both types and expressions complicates things
in a number of ways.
For instance, although one can write

    []rune("hi")

as a conversion, one must parenthesize the type if it starts with a \*:

    (*int)(nil)

Had we been willing to give up \* as pointer syntax, those parentheses would be unnecessary.

So Go's pointer syntax is tied to the familiar C form,
but those ties mean that we cannot break completely from using parentheses
to disambiguate types and expressions in the grammar.
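A short sketch makes the ambiguity concrete. The types `celsius` and `intPtr` and the three helper functions are assumptions invented for this example. The point is that `*T(p)` is read as `*(T(p))`, a conversion followed by a dereference, so converting to a pointer type needs the parentheses.

```go
package main

import "fmt"

// celsius and intPtr are hypothetical types used only in this sketch.
type celsius float64
type intPtr *int

// nilPointer converts nil to the type *int.
func nilPointer() *int {
	return (*int)(nil)
}

// asCelsius converts a *float64 to a *celsius. Written without the
// parentheses, *celsius(p) would mean *(celsius(p)), which does not
// compile here because a pointer cannot be converted to a float type.
func asCelsius(p *float64) *celsius {
	return (*celsius)(p)
}

// deref uses the unparenthesized form on purpose: *intPtr(p) is
// *(intPtr(p)), so the result is the int that p points to.
func deref(p *int) int {
	return *intPtr(p)
}

func main() {
	f := 21.5
	n := 42
	fmt.Println(nilPointer() == nil) // true
	fmt.Println(*asCelsius(&f))      // 21.5
	fmt.Println(deref(&n))           // 42

	// A type that starts with [ needs no parentheses in a conversion.
	fmt.Println([]rune("hi")) // [104 105]
}
```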

Overall, though, we believe Go's type syntax is easier to understand than C's, especially when things get complicated.

## Notes

Go's declarations read left to right. It's been pointed out that C's read in a spiral!
See [The "Clockwise/Spiral Rule"](http://c-faq.com/decl/spiral.anderson.html) by David Anderson.