# Go's Declaration Syntax
7 Jul 2010
Tags: c, syntax, ethos
Summary: Why Go's declaration syntax doesn't look like, and is much simpler than, C's.
OldURL: /gos-declaration-syntax

Rob Pike

## Introduction

Newcomers to Go wonder why the declaration syntax is different from the
tradition established in the C family.
In this post we'll compare the two approaches and explain why Go's declarations look as they do.

## C syntax

First, let's talk about C syntax. C took an unusual and clever approach
to declaration syntax.
Instead of describing the types with special syntax,
one writes an expression involving the item being declared,
and states what type that expression will have. Thus

    int x;

declares x to be an int: the expression 'x' will have type int.
In general, to figure out how to write the type of a new variable,
write an expression involving that variable that evaluates to a basic type,
then put the basic type on the left and the expression on the right.

Thus, the declarations

    int *p;
    int a[3];

state that p is a pointer to int because '\*p' has type int,
and that a is an array of ints because a[3] (ignoring the particular index value,
which is punned to be the size of the array) has type int.

What about functions? Originally, C's function declarations wrote the types
of the arguments outside the parens, like this:

    int main(argc, argv)
        int argc;
        char *argv[];
    { /* ... */ }

Again, we see that main is a function because the expression main(argc,
argv) returns an int.
In modern notation we'd write

    int main(int argc, char *argv[]) { /* ... */ }

but the basic structure is the same.

This is a clever syntactic idea that works well for simple types but can get confusing fast.
The famous example is declaring a function pointer.
Follow the rules and you get this:

    int (*fp)(int a, int b);

Here, fp is a pointer to a function because if you write the expression (\*fp)(a,
b) you'll call a function that returns int.
What if one of fp's arguments is itself a function?

    int (*fp)(int (*ff)(int x, int y), int b)

That's starting to get hard to read.

Of course, we can leave out the names of the parameters when we declare a function, so main can be declared

    int main(int, char *[])

Recall that argv is declared like this,

    char *argv[]

so you drop the name from the middle of its declaration to construct its type.
It's not obvious, though, that you declare something of type char \*[] by
putting its name in the middle.

And look what happens to fp's declaration if you don't name the parameters:

    int (*fp)(int (*)(int, int), int)

Not only is it not obvious where to put the name inside

    int (*)(int, int)

it's not exactly clear that it's a function pointer declaration at all.
And what if the return type is a function pointer?

    int (*(*fp)(int (*)(int, int), int))(int, int)

It's hard even to see that this declaration is about fp.

You can construct more elaborate examples but these should illustrate some
of the difficulties that C's declaration syntax can introduce.

There's one more point that needs to be made, though.
Because type and declaration syntax are the same,
it can be difficult to parse expressions with types in the middle.
This is why, for instance, C casts always parenthesize the type, as in

    (int)M_PI

## Go syntax

Languages outside the C family usually use a distinct type syntax in declarations.
Although it's a separate point, the name usually comes first,
often followed by a colon.
Thus our examples above become something like (in a fictional but illustrative language)

    x: int
    p: pointer to int
    a: array[3] of int

These declarations are clear, if verbose - you just read them left to right.
Go takes its cue from here, but in the interests of brevity it drops the
colon and removes some of the keywords:

    x int
    p *int
    a [3]int

There is no direct correspondence between the look of [3]int and how to
use a in an expression.
(We'll come back to pointers in the next section.) You gain clarity at the
cost of a separate syntax.
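
As a minimal sketch, the three declarations above can be placed in a complete program by adding the `var` keyword. The helper name `declare` and the values assigned are illustrative assumptions, not part of the original post.

```go
package main

import "fmt"

// declare is a hypothetical helper that declares the three variables
// from the text, gives each a value, and returns them.
func declare() (int, *int, [3]int) {
	var x int    // x is an int
	var p *int   // p is a pointer to int
	var a [3]int // a is an array of three ints

	x = 7
	p = &x    // p now points at x
	a[2] = *p // in expressions the index goes on the right, the * on the left
	return x, p, a
}

func main() {
	x, p, a := declare()
	fmt.Println(x, *p, a)
}
```

Each declaration reads left to right as "name, then type", while the expressions that use `p` and `a` follow their own, separate syntax.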

Now consider functions. Let's transcribe the declaration for main as it would read in Go,
although the real main function in Go takes no arguments:

    func main(argc int, argv []string) int

Superficially that's not much different from C,
other than the change from `char` arrays to strings,
but it reads well from left to right:

    function main takes an int and a slice of strings and returns an int.

Drop the parameter names and it's just as clear - they're always first so there's no confusion.

    func main(int, []string) int
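
Because Go's real `main` takes no arguments, the sketch below uses a hypothetical function `run` with the transcribed signature, and a variable `entry` whose type is the same signature with the parameter names dropped. Both names and the function body are assumptions made for illustration.

```go
package main

import "fmt"

// run is a hypothetical stand-in for the C-style main in the text:
// it takes an int and a slice of strings and returns an int.
func run(argc int, argv []string) int {
	if argc != len(argv) {
		return 1
	}
	return 0
}

// entry is declared with the function type from the text,
// written without parameter names.
var entry func(int, []string) int = run

func main() {
	args := []string{"prog", "-v"}
	fmt.Println(entry(len(args), args))
}
```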

One merit of this left-to-right style is how well it works as the types
become more complex.
Here's a declaration of a function variable (analogous to a function pointer in C):

    f func(func(int,int) int, int) int

Or if f returns a function:

    f func(func(int,int) int, int) func(int, int) int

It still reads clearly, from left to right,
and it's always obvious which name is being declared - the name comes first.
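
To make the two function types above concrete, here is a small sketch that assigns ordinary functions to variables of those types. The functions `add`, `mul`, `applyTwice`, and `choose` are hypothetical examples, not taken from the post.

```go
package main

import "fmt"

func add(a, b int) int { return a + b }
func mul(a, b int) int { return a * b }

// applyTwice has the first type in the text:
// func(func(int, int) int, int) int
func applyTwice(op func(int, int) int, n int) int {
	return op(op(n, n), n)
}

// choose has the second type: it returns a func(int, int) int.
func choose(op func(int, int) int, n int) func(int, int) int {
	if n > 0 {
		return op
	}
	return add
}

func main() {
	var f func(func(int, int) int, int) int = applyTwice
	var g func(func(int, int) int, int) func(int, int) int = choose
	fmt.Println(f(add, 2), g(mul, 1)(3, 4))
}
```

In both `var` lines the declared name comes first and the rest of the line is the type, however long it grows.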

The distinction between type and expression syntax makes it easy to write and invoke closures in Go:

    sum := func(a, b int) int { return a+b } (3, 4)
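
A short sketch of the same idea in a full program: one function literal is called immediately, as in the line above, and another is returned as a closure that captures a local variable. The names `sumNow` and `counter` are assumptions chosen for this example.

```go
package main

import "fmt"

// sumNow declares a function literal and calls it immediately.
func sumNow() int {
	sum := func(a, b int) int { return a + b }(3, 4)
	return sum
}

// counter returns a closure that captures n and increments it on each call.
func counter() func() int {
	n := 0
	return func() int {
		n++
		return n
	}
}

func main() {
	next := counter()
	fmt.Println(sumNow(), next(), next())
}
```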
				161
Russ Cox	af5018f	2020-03-09 23:54:35 -0400	[diff] [blame]	162	## Pointers
Andrew Gerrand	db9a09f	2013-03-08 11:17:09 +1100	[diff] [blame]	163
Russ Cox	482079d	2020-03-09 22:11:04 -0400	[diff] [blame]	164	Pointers are the exception that proves the rule.
				165	Notice that in arrays and slices, for instance,
				166	Go's type syntax puts the brackets on the left of the type but the expression
				167	syntax puts them on the right of the expression:
Andrew Gerrand	db9a09f	2013-03-08 11:17:09 +1100	[diff] [blame]	168
				169	var a []int
				170	x = a[1]
				171
Russ Cox	af5018f	2020-03-09 23:54:35 -0400	[diff] [blame]	172	For familiarity, Go's pointers use the \* notation from C,
Russ Cox	482079d	2020-03-09 22:11:04 -0400	[diff] [blame]	173	but we could not bring ourselves to make a similar reversal for pointer types.
				174	Thus pointers work like this
Andrew Gerrand	db9a09f	2013-03-08 11:17:09 +1100	[diff] [blame]	175
				176	var p *int
				177	x = *p
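
The two pairings can be seen side by side in a small program. This is only a sketch; the helper names `second` and `deref` are hypothetical.

```go
package main

import "fmt"

// second returns element 1 of a: the brackets come before the element
// type in []int but after the operand in a[1].
func second(a []int) int {
	return a[1]
}

// deref returns the value p points to: the * is a prefix in both the
// type *int and the expression *p.
func deref(p *int) int {
	return *p
}

func main() {
	var a []int = []int{10, 20, 30}
	n := 42
	var p *int = &n
	fmt.Println(second(a), deref(p))
}
```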

We couldn't say

    var p *int
    x = p*

because that postfix \* would be confused with multiplication. We could have used the Pascal ^, for example:

    var p ^int
    x = p^

and perhaps we should have (and chosen another operator for xor),
because the prefix asterisk on both types and expressions complicates things
in a number of ways.
For instance, although one can write

    []rune("hi")

as a conversion, one must parenthesize the type if it starts with a \*:

    (*int)(nil)

Had we been willing to give up \* as pointer syntax, those parentheses would be unnecessary.
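
The following sketch exercises both conversions. In current Go a string converts to `[]rune` or `[]byte`, so the example uses `[]rune`; the function names `runes` and `nilIntPtr` are assumptions made for illustration.

```go
package main

import "fmt"

// runes converts a string using an unparenthesized type: []rune(s).
func runes(s string) []rune {
	return []rune(s)
}

// nilIntPtr needs parentheses around *int. Without them, *int(nil)
// would parse as a dereference of the conversion int(nil).
func nilIntPtr() *int {
	return (*int)(nil)
}

func main() {
	fmt.Println(len(runes("hi")), nilIntPtr() == nil)
}
```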

So Go's pointer syntax is tied to the familiar C form,
but those ties mean that we cannot break completely from using parentheses
to disambiguate types and expressions in the grammar.

Overall, though, we believe Go's type syntax is easier to understand than C's, especially when things get complicated.

## Notes

Go's declarations read left to right. It's been pointed out that C's read in a spiral!
See [The "Clockwise/Spiral Rule"](http://c-faq.com/decl/spiral.anderson.html) by David Anderson.
Russ Cox	af5018f	2020-03-09 23:54:35 -0400	[diff] [blame]	208	## Notes
Andrew Gerrand	db9a09f	2013-03-08 11:17:09 +1100	[diff] [blame]	209
Russ Cox	482079d	2020-03-09 22:11:04 -0400	[diff] [blame]	210	Go's declarations read left to right. It's been pointed out that C's read in a spiral!
Russ Cox	af5018f	2020-03-09 23:54:35 -0400	[diff] [blame]	211	See [ The "Clockwise/Spiral Rule"](http://c-faq.com/decl/spiral.anderson.html) by David Anderson.