| <!--{ |
| "Title": "Go's Declaration Syntax" |
| }--> |
| |
| <p> |
| Newcomers to Go wonder why the declaration syntax is different from the |
| tradition established in the C family. In this post we'll compare the |
| two approaches and explain why Go's declarations look as they do. |
| </p> |
| |
| <p> |
| <b>C syntax</b> |
| </p> |
| |
| <p> |
| First, let's talk about C syntax. C took an unusual and clever approach |
| to declaration syntax. Instead of describing the types with special |
| syntax, one writes an expression involving the item being declared, and |
| states what type that expression will have. Thus |
| </p> |
| |
| <pre> |
| int x; |
| </pre> |
| |
| <p> |
| declares x to be an int: the expression 'x' will have type int. In |
| general, to figure out how to write the type of a new variable, write an |
| expression involving that variable that evaluates to a basic type, then |
| put the basic type on the left and the expression on the right. |
| </p> |
| |
| <p> |
| Thus, the declarations |
| </p> |
| |
| <pre> |
| int *p; |
| int a[3]; |
| </pre> |
| |
| <p> |
| state that p is a pointer to int because '*p' has type int, and that a |
| is an array of ints because a[3] (ignoring the particular index value, |
| which is punned to be the size of the array) has type int. |
| </p> |
| |
| <p> |
| What about functions? Originally, C's function declarations wrote the |
| types of the arguments outside the parens, like this: |
| </p> |
| |
| <pre> |
| int main(argc, argv) |
| int argc; |
| char *argv[]; |
| { /* ... */ } |
| </pre> |
| |
| <p> |
| Again, we see that main is a function because the expression main(argc, |
| argv) returns an int. In modern notation we'd write |
| </p> |
| |
| <pre> |
| int main(int argc, char *argv[]) { /* ... */ } |
| </pre> |
| |
| <p> |
| but the basic structure is the same. |
| </p> |
| |
| <p> |
| This is a clever syntactic idea that works well for simple types but can |
| get confusing fast. The famous example is declaring a function pointer. |
| Follow the rules and you get this: |
| </p> |
| |
| <pre> |
| int (*fp)(int a, int b); |
| </pre> |
| |
| <p> |
| Here, fp is a pointer to a function because if you write the expression |
| (*fp)(a, b) you'll call a function that returns int. What if one of fp's |
| arguments is itself a function? |
| </p> |
| |
| <pre> |
| int (*fp)(int (*ff)(int x, int y), int b) |
| </pre> |
| |
| <p> |
| That's starting to get hard to read. |
| </p> |
| |
| <p> |
| Of course, we can leave out the names of the parameters when we declare a |
| function, so main can be declared |
| </p> |
| |
| <pre> |
| int main(int, char *[]) |
| </pre> |
| |
| <p> |
| Recall that argv is declared like this, |
| </p> |
| |
| <pre> |
| char *argv[] |
| </pre> |
| |
| <p> |
| so you drop the name from the <em>middle</em> of its declaration to construct |
| its type. It's not obvious, though, that you declare something of type |
| char *[] by putting its name in the middle. |
| </p> |
| |
| <p> |
| And look what happens to fp's declaration if you don't name the |
| parameters: |
| </p> |
| |
| <pre> |
| int (*fp)(int (*)(int, int), int) |
| </pre> |
| |
| <p> |
| Not only is it not obvious where to put the name inside |
| </p> |
| |
| <pre> |
| int (*)(int, int) |
| </pre> |
| |
| <p> |
| it's not exactly clear that it's a function pointer declaration at all. |
| And what if the return type is a function pointer? |
| </p> |
| |
| <pre> |
| int (*(*fp)(int (*)(int, int), int))(int, int) |
| </pre> |
| |
| <p> |
| It's hard even to see that this declaration is about fp. |
| </p> |
| |
| <p> |
| You can construct more elaborate examples but these should illustrate |
| some of the difficulties that C's declaration syntax can introduce. |
| </p> |
| |
| <p> |
| There's one more point that needs to be made, though. Because type and |
| declaration syntax are the same, it can be difficult to parse |
| expressions with types in the middle. This is why, for instance, C casts |
| always parenthesize the type, as in |
| </p> |
| |
| <pre> |
| (int)M_PI |
| </pre> |
| |
| <p> |
| <b>Go syntax</b> |
| </p> |
| |
| <p> |
| Languages outside the C family usually use a distinct type syntax in |
| declarations. Although it's a separate point, the name usually comes |
| first, often followed by a colon. Thus our examples above become |
| something like (in a fictional but illustrative language) |
| </p> |
| |
| <pre> |
| x: int |
| p: pointer to int |
| a: array[3] of int |
| </pre> |
| |
| <p> |
| These declarations are clear, if verbose: you just read them left to |
| right. Go takes its cue from here, but in the interests of brevity it |
| drops the colon and removes some of the keywords: |
| </p> |
| |
| <pre> |
| x int |
| p *int |
| a [3]int |
| </pre> |
| |
| <p> |
| There is no direct correspondence between the look of [3]int and how to |
| use a in an expression. (We'll come back to pointers in the next |
| section.) You gain clarity at the cost of a separate syntax. |
| </p> |
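| |
| <p> |
| As a minimal sketch, the three declarations above can be placed in a small |
| program. The function name describe and the values assigned are |
| illustrative assumptions, not part of the original examples; the %T verb |
| simply prints each variable's type as Go spells it. |
| </p> |

```go
package main

import "fmt"

// describe is a hypothetical helper: it declares x, p, and a exactly as in
// the text, uses them once, and reports their types.
func describe() string {
	var x int    // x is an int
	var p *int   // p is a pointer to int
	var a [3]int // a is an array of 3 ints

	x = 7
	p = &x    // p now points at x
	a[0] = *p // the expression syntax differs from the type syntax

	return fmt.Sprintf("%T %T %T", x, p, a)
}

func main() {
	fmt.Println(describe()) // int *int [3]int
}
```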
| |
| <p> |
| Now consider functions. Let's transcribe the declaration for main, even |
| though the main function in Go takes no arguments: |
| </p> |
| |
| <pre> |
| func main(argc int, argv []string) int |
| </pre> |
| |
| <p> |
| Superficially that's not much different from C, but it reads well from |
| left to right: |
| </p> |
| |
| <p> |
| <em>function main takes an int and a slice of strings and returns an int.</em> |
| </p> |
| |
| <p> |
| Drop the parameter names and it's just as clear: the names always come |
| first, so there's no confusion. |
| </p> |
| |
| <pre> |
| func main(int, []string) int |
| </pre> |
| |
| <p> |
| One value of this left-to-right style is how well it works as the types |
| become more complex. Here's a declaration of a function variable |
| (analogous to a function pointer in C): |
| </p> |
| |
| <pre> |
| f func(func(int,int) int, int) int |
| </pre> |
| |
| <p> |
| Or if f returns a function: |
| </p> |
| |
| <pre> |
| f func(func(int,int) int, int) func(int, int) int |
| </pre> |
| |
| <p> |
| It still reads clearly, from left to right, and it's always obvious |
| which name is being declared: the name comes first. |
| </p> |
| |
| <p> |
| The distinction between type and expression syntax makes it easy to |
| write and invoke closures in Go: |
| </p> |
| |
| <pre> |
| sum := func(a, b int) int { return a+b } (3, 4) |
| </pre> |
| |
| <p> |
| <b>Pointers</b> |
| </p> |
| |
| <p> |
| Pointers are the exception that proves the rule. Notice that in arrays |
| and slices, for instance, Go's type syntax puts the brackets on the left |
| of the type but the expression syntax puts them on the right of the |
| expression: |
| </p> |
| |
| <pre> |
| var a []int |
| x = a[1] |
| </pre> |
| |
| <p> |
| For familiarity, Go's pointers use the * notation from C, but we could |
| not bring ourselves to make a similar reversal for pointer types. Thus |
| pointers work like this |
| </p> |
| |
| <pre> |
| var p *int |
| x = *p |
| </pre> |
| |
| <p> |
| We couldn't say |
| </p> |
| |
| <pre> |
| var p *int |
| x = p* |
| </pre> |
| |
| <p> |
| because that postfix * would be confused with multiplication. We could have |
| used the Pascal ^, for example: |
| </p> |
| |
| <pre> |
| var p ^int |
| x = p^ |
| </pre> |
| |
| <p> |
| and perhaps we should have (and chosen another operator for xor), |
| because the prefix asterisk on both types and expressions complicates |
| things in a number of ways. For instance, although one can write |
| </p> |
| |
| <pre> |
| []rune("hi") |
| </pre> |
| |
| <p> |
| as a conversion, one must parenthesize the type if it starts with a *: |
| </p> |
| |
| <pre> |
| (*int)(nil) |
| </pre> |
| |
| <p> |
| Had we been willing to give up * as pointer syntax, those parentheses |
| would be unnecessary. |
| </p> |
| |
| <p> |
| So Go's pointer syntax is tied to the familiar C form, but those ties |
| mean that we cannot break completely from using parentheses to |
| disambiguate types and expressions in the grammar. |
| </p> |
| |
| <p> |
| Overall, though, we believe Go's type syntax is easier to understand |
| than C's, especially when things get complicated. |
| </p> |
| |
| <p> |
| <b>Notes</b> |
| </p> |
| |
| <p> |
| Go's declarations read left to right. It's been pointed out that C's |
| read in a spiral! See <a href="http://c-faq.com/decl/spiral.anderson.html"> |
| The "Clockwise/Spiral Rule"</a> by David Anderson. |
| </p> |