
 Let's Go
 ----

 Rob Pike

 ----
 (September 14, 2008)


 This document is a tutorial introduction to the basics of the Go systems programming
 language, intended for programmers familiar with C or C++. It is not a comprehensive
 guide to the language; at the moment the document closest to that is the draft
 specification:

 	/doc/go_spec.html

 To check out the compiler and tools and be ready to run Go programs, see

 	/doc/go_setup.html

 The presentation proceeds through a series of modest programs to illustrate
 key features of the language.  All the programs work (at time of writing) and are
 checked in at

 	/doc/progs

 Program snippets are annotated with the line number in the original file; for
 cleanliness, blank lines remain blank.

 Hello, World
 ----

 Let's start in the usual way:

 --PROG progs/helloworld.go

 Every Go source file declares which package it's part of using a "package" statement.
 The "main" package's "main" function is where the program starts running (after
 any initialization).

 Function declarations are introduced with the "func" keyword.

 Notice that string constants can contain Unicode characters, encoded in UTF-8.
 Go is defined to accept UTF-8 input.  Strings are arrays of bytes, usually used
 to store Unicode strings represented in UTF-8.

 The built-in function "print()" has been used during the early stages of
 development of the language but is not guaranteed to last.  Here's a better version of the
 program that doesn't depend on "print()":

 --PROG progs/helloworld2.go

 This version imports the ''os'' package to access its "Stdout" variable, of type
 "*OS.FD".  The "import" statement is a declaration: it names the identifier ("OS")
 that will be used to access members of the package imported from the file (&quot;os&quot;),
 found in the current directory or in a standard location.
 Given "OS.Stdout" we can use its "WriteString" method to print the string.

 The comment convention is the same as in C++:

 	/* ... */
 	// ...

 Echo
 ----

 Next up, here's a version of the Unix utility "echo(1)":

 --PROG progs/echo.go

 It's still fairly small but it's doing a number of new things.  In the last example,
 we saw "func" introducing a function.  The keywords "var", "const", and "type"
 (not used yet) also introduce declarations, as does "import".
 Notice that we can group declarations of the same sort into
 parenthesized, semicolon-separated lists if we want, as on lines 3-6 and 10-13.
 But it's not necessary to do so; we could have said

 	const Space = " "
 	const Newline = "\n"

 Semicolons aren't needed here; in fact, semicolons are unnecessary after any
 top-level declaration, even though they are needed as separators <i>within</i>
 a parenthesized list of declarations.

 Having imported the "Flag" package, line 8 creates a global variable to hold
 the value of echo's -n flag.  (The nil hides a nice feature not needed here;
 see the source in "src/lib/flag.go" for details.)

 In "main.main", we parse the arguments (line 16) and then create a local
 string variable we will use to build the output.

 The declaration statement has the form

 	var s string = "";

 This is the "var" keyword, followed by the name of the variable, followed by
 its type, followed by an equals sign and an initial value for the variable.

 Go tries to be terse, and this declaration could be shortened.  Since the
 string constant is of type string, we don't have to tell the compiler that.
 We could write

 	var s = "";

 or we could go even shorter and write the idiom

 	s := "";

 The := operator is used a lot in Go to represent an initializing declaration.
 (For those who know Limbo, its := construct is the same, but notice
 that Go has no colon after the name in a full "var" declaration.)
 And there's one in the "for" clause on the next line:

 --PROG  progs/echo.go /for/

 The "Flag" package has parsed the arguments and left the non-flag arguments
 in a list that can be iterated over in the obvious way.

 The Go "for" statement differs from that of C in a number of ways.  First,
 it's the only looping construct; there is no "while" or "do".  Second,
 there are no parentheses on the clause, but the braces on the body
 are mandatory.  (The same applies to the "if" statement.) Later examples
 will show some other ways "for" can be written.

 The body of the loop builds up the string "s" by appending (using +=)
 the arguments and separating spaces. After the loop, if the "-n" flag is not
 set, it appends a newline, and then writes the result.
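
 As a rough sketch of the loop just described, here is the same idea written
 in present-day Go, whose syntax differs in places from the dialect this
 tutorial uses.  The helper name "buildLine" and the lower-case constant names
 are our own assumptions, not taken from "progs/echo.go".

```go
package main

import (
	"flag"
	"os"
)

const (
	space   = " "
	newline = "\n"
)

// buildLine joins args with single spaces and, unless omitNewline is set,
// appends a trailing newline.
func buildLine(args []string, omitNewline bool) string {
	s := ""
	for i := 0; i < len(args); i++ {
		if i > 0 {
			s += space
		}
		s += args[i]
	}
	if !omitNewline {
		s += newline
	}
	return s
}

func main() {
	// The -n flag suppresses the final newline, as in echo(1).
	n := flag.Bool("n", false, "don't print final newline")
	flag.Parse()
	os.Stdout.WriteString(buildLine(flag.Args(), *n))
}
```

 Keeping the string building in its own function makes the separator and
 newline handling easy to check without running the whole program.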

 Notice that "main.main" is a niladic function with no return type.
 It's defined that way.  Falling off the end of "main.main" means
 ''success''; if you want to signal an erroneous return, use

 	sys.exit(1)

 The "sys" package is built in and contains some essentials for getting
 started; for instance, "sys.argc()" and "sys.argv(int)" are used by the
 "Flag" package to access the arguments.

 An Interlude about Types
 ----

 Go has some familiar types such as "int" and "float", which represent
 values of the ''appropriate'' size for the machine. It also defines
 specifically-sized types such as "int8", "float64", and so on, plus
 unsigned integer types such as "uint", "uint32", etc.  And then there
 is a "byte" synonym for "uint8", which is the element type for
 strings.

 Speaking of "string", that's a built-in type as well.  Strings are
 <i>immutable values</i> -- they are not just arrays of "byte" values.
 Once you've built a string <i>value</i>, you can't change it, although
 of course you can change a string <i>variable</i> simply by
 reassigning it.  This snippet from "strings.go" is legal code:

 --PROG progs/strings.go /hello/ /ciao/

 However the following statements are illegal because they would modify
 a "string" value:

 	s[0] = 'x';
 	(*p)[1] = 'y';

 In C++ terms, Go strings are a bit like "const strings", while pointers
 to strings are analogous to "const string" references.
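
 To make the distinction between a string value and a string variable
 concrete, here is a small sketch in present-day Go.  The helper
 "replaceFirst" is hypothetical; it shows the usual workaround of copying the
 bytes, changing the copy, and building a new string.

```go
package main

import "fmt"

// replaceFirst returns a copy of s whose first byte is replaced by c.
// The original string value is left untouched.
func replaceFirst(s string, c byte) string {
	if len(s) == 0 {
		return s
	}
	// Converting to []byte copies the bytes; the copy is mutable.
	b := []byte(s)
	b[0] = c
	return string(b)
}

func main() {
	s := "hello"
	p := &s
	// Legal: this reassigns the variable s through the pointer;
	// it does not modify the old string value.
	*p = "ciao"
	fmt.Println(s)                    // prints ciao
	fmt.Println(replaceFirst(s, 'x')) // prints xiao
}
```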

 Yes, there are pointers.  However, Go simplifies their use a little;
 read on.

 Arrays are declared like this:

 	var array_of_int [10]int;

 Arrays, like strings, are values, but they are mutable. This differs
 from C, in which "array_of_int" would be usable as a pointer to "int".
 In Go, since arrays are values, it's meaningful (and useful) to talk
 about pointers to arrays.

 The size of the array is part of its type; however, one can declare
 an <i>open array</i> variable, to which one can assign any array value
 with the same element type.
 (At the moment, only <i>pointers</i> to open arrays are implemented.)
 Thus one can write this function (from "sum.go"):

 --PROG progs/sum.go /sum/ /^}/

 and invoke it like this:

 --PROG progs/sum.go /1,2,3/

 Note how the return type ("int") is defined for "sum()" by stating it
 after the parameter list.  Also observe that although the argument
 is a pointer to an array, we can index it directly ("a[i]" not "(*a)[i]").
 The expression "[]int{1,2,3}" -- a type followed by a brace-bounded expression
 -- is a constructor for a value, in this case an array of "int". We pass it
 to "sum()" by taking its address.
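
 For comparison, a minimal sketch of the same function in present-day Go
 follows.  It assumes the modern slice type "[]int" in place of the pointer to
 an open array described above, so no address is taken at the call site.

```go
package main

import "fmt"

// sum adds up the elements of a. The parameter is a slice, which accepts
// a sequence of any length with element type int.
func sum(a []int) int {
	s := 0
	for i := 0; i < len(a); i++ {
		s += a[i]
	}
	return s
}

func main() {
	fmt.Println(sum([]int{1, 2, 3})) // prints 6
}
```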

 The built-in function "len()" appeared there too - it works on strings,
 arrays, and maps, which can be built like this:

 	m := map[string] int {"one":1 , "two":2}

 At least for now, maps are <i>always</i> pointers, so in this example
 "m" has type "*map[string]int".  This may change.

 You can also create a map (or anything else) with the built-in "new()"
 function:

 	m := new(map[string] int)

 The "new()" function always returns a pointer, an address for the object
 it creates.
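
 Here is a short sketch of map construction and "len()" in present-day Go.
 Note one difference from the text above: current Go creates an empty map
 with the built-in "make()" rather than "new()", and the map variable is not
 written as a pointer.  The function "wordLengths" is purely illustrative.

```go
package main

import "fmt"

// wordLengths builds a map from each word to its length in bytes.
func wordLengths(words []string) map[string]int {
	m := make(map[string]int)
	for _, w := range words {
		m[w] = len(w)
	}
	return m
}

func main() {
	m := map[string]int{"one": 1, "two": 2}
	fmt.Println(len(m), m["two"]) // prints 2 2

	lengths := wordLengths([]string{"go", "gopher"})
	fmt.Println(len(lengths), lengths["gopher"]) // prints 2 6
}
```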

 An Interlude about Constants
 ----

 Although integers come in lots of sizes in Go, integer constants do not.
 There are no constants like "0ll" or "0x0UL".   Instead, integer
 constants are evaluated as ideal, large-precision values that
 can overflow only when they are assigned to an integer variable with
 too little precision to represent the value.

 	const hard_eight = (1 << 100) >> 97  // legal

 There are nuances that deserve redirection to the legalese of the
 language specification but here are some illustrative examples:

 	var a uint64 = 0  // a has type uint64, value 0
 	a := uint64(0)    // equivalent; uses a "conversion"
 	i := 0x1234       // i gets default type: int
 	var j int = 1e6   // legal - 1000000 is representable in an int
 	x := 1.5          // a float
 	i3div2 := 3/2     // integer division - result is 1
 	f3div2 := 3./2.   // floating point division - result is 1.5

 Conversions only work for simple cases such as converting ints of one
 sign or size to another, and between ints and floats, plus a few other
 simple cases.  There are no automatic conversions of any kind in Go,
 other than that of making constants have concrete size and type when
 assigned to a variable.
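
 The rules above can be tried out with a small program.  This is a sketch in
 present-day Go; the names "hardEight" and "half" are ours, and "half" is
 there only to show that an int must be converted explicitly before it can
 take part in floating-point arithmetic.

```go
package main

import "fmt"

// The intermediate value 1<<100 does not fit in any integer type, but the
// constant expression is evaluated exactly and the result, 8, does fit.
const hardEight = (1 << 100) >> 97

// half converts n explicitly; there is no automatic int-to-float conversion.
func half(n int) float64 {
	return float64(n) / 2
}

func main() {
	var a uint64 = 0
	b := uint64(0)
	i := 0x1234
	var j int = 1e6
	x := 1.5
	i3div2 := 3 / 2
	f3div2 := 3. / 2.
	fmt.Println(hardEight, a == b, i, j, x, i3div2, f3div2, half(3))
	// prints 8 true 4660 1000000 1.5 1 1.5 1.5
}
```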

 An I/O Package
 ----

 Next we'll look at a simple package for doing file I/O with the usual
 sort of open/close/read/write interface.  Here's the start of "fd.go":

 --PROG progs/fd.go /package/ /^}/

 The first line declares the name of the package -- "fd" for ''file descriptor'' --
 and then we import the low-level, external "syscall" package, which provides
 a primitive interface to the underlying operating system's calls.

 Next is a type definition: the "type" keyword introduces a type declaration,
 in this case a data structure called "FD".
 To make things a little more interesting, our "FD" includes the name of the file
 that the file descriptor refers to.  The "export" keyword makes the declared
 structure visible to users of the package.

 Now we can write what is often called a factory:

 --PROG progs/fd.go /NewFD/ /^}/

 This returns a pointer to a new "FD" structure with the file descriptor and name
 filled in.  We can use it to construct some familiar, exported variables of type "*FD":

 --PROG progs/fd.go /export.var/ /^.$/

 The "NewFD" function was not exported because it's internal. The proper factory
 to use is "Open":

 --PROG progs/fd.go /func.Open/ /^}/

 There are a number of new things in these few lines.  First, "Open" returns
 multiple values, an "FD" and an "errno" (Unix error number).  We declare the
 multi-value return as a parenthesized list of declarations.  "Syscall.open"
 also has a multi-value return, which we can grab with the multi-variable
 declaration on line 27; it declares "r" and "e" to hold the two values,
 both of type "int64" (although you'd have to look at the "syscall" package
 to see that).  Finally, line 28 returns two values: a pointer to the new "FD"
 and the return code.  If "Syscall.open" failed, the file descriptor "r" will
 be negative and "NewFD" will return "nil".
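
 The multi-value return pattern can be sketched in present-day Go as follows.
 This is not the code in "fd.go": it assumes the modern "os.Open", which
 returns a file and an "error" value rather than an error number, and the type
 "fileInfo" is a hypothetical stand-in for "FD".

```go
package main

import (
	"fmt"
	"os"
)

// fileInfo is a hypothetical stand-in for the tutorial's FD type.
type fileInfo struct {
	name string
	file *os.File
}

// open returns two values: a pointer to a new fileInfo and an error.
// On failure the pointer is nil.
func open(name string) (fi *fileInfo, err error) {
	// os.Open itself returns two values, captured in one declaration.
	f, e := os.Open(name)
	if e != nil {
		return nil, e
	}
	return &fileInfo{name: name, file: f}, nil
}

func main() {
	fi, err := open("/does/not/exist")
	if err != nil {
		fmt.Println("can't open file:", err)
		return
	}
	defer fi.file.Close()
	fmt.Println("opened", fi.name)
}
```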

 Now that we can build "FDs", we can write methods to use them. To declare
 a method of a type, we define a function to have an explicit receiver
 of that type, placed in parentheses before the function name. Here are
 some methods for "FD", each of which declares a receiver variable "fd".

 --PROG progs/fd.go /Close/ END

 There is no implicit "this" and the receiver variable must be used to access
 members of the structure.  Methods are not declared within
 the "struct" declaration itself.  The "struct" declaration defines only data members.
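
 As an illustration of receivers, here is a cut-down sketch in present-day
 Go.  The field names and the single "Name" method are assumptions made for
 the example; the real type in "fd.go" has more methods and talks to the
 operating system.

```go
package main

import "fmt"

// FD is a simplified, hypothetical version of the type discussed above.
// The struct declaration lists only data members.
type FD struct {
	fildes int    // file descriptor number
	name   string // file name
}

// newFD is an unexported factory; it returns nil for a negative descriptor.
func newFD(fd int, name string) *FD {
	if fd < 0 {
		return nil
	}
	return &FD{fildes: fd, name: name}
}

// Name is a method. The receiver fd is declared in parentheses before the
// method name and must be used explicitly to reach the fields.
func (fd *FD) Name() string {
	return fd.name
}

func main() {
	stdout := newFD(1, "/dev/stdout")
	fmt.Println(stdout.Name()) // prints /dev/stdout
}
```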

 Finally, we can use our new package:

 --PROG progs/helloworld3.go

 and run the program:

 	% helloworld3
 	hello, world
 	can't open file; errno=2
 	%

 Rotting cats
 ----

 Building on the FD package, here's a simple version of the Unix utility "cat(1)", "progs/cat.go":

 --PROG progs/cat.go

 By now this should be easy to follow, but the "switch" statement introduces some
 new features.  Like a "for" loop, an "if" or "switch" can include an
 initialization statement.  The "switch" on line 12 uses one to create variables
 "nr" and "er" to hold the return values from "fd.Read()".  (The "if" on line 19
 has the same idea.)  The "switch" statement is general: it evaluates the cases
 from top to bottom looking for the first case that matches the value; the
 case expressions don't need to be constants or even integers, as long as
 they all have the same type.

 Since the "switch" value is just "true", we could leave it off -- as is also true
 in a "for" statement, a missing value means "true".  In fact, such a "switch"
 is a form of "if-else" chain.
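
 Both forms of "switch" mentioned here can be seen in a small sketch, written
 in present-day Go.  The functions "classify" and "parity" are made up for
 the illustration and do not appear in "progs/cat.go".

```go
package main

import "fmt"

// classify uses a switch with no value, which behaves like an if-else
// chain: cases are tried top to bottom and the first true one wins.
func classify(n int) string {
	switch {
	case n < 0:
		return "negative"
	case n == 0:
		return "zero"
	}
	return "positive"
}

// parity uses a switch with an initialization statement; r exists only
// inside the switch.
func parity(n int) string {
	switch r := n % 2; r {
	case 0:
		return "even"
	default:
		return "odd"
	}
}

func main() {
	fmt.Println(classify(-3), classify(0), classify(7)) // prints negative zero positive
	fmt.Println(parity(4), parity(5))                   // prints even odd
}
```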

 Line 19 calls "Write()" by slicing (a pointer to) the array, creating a
 <i>reference slice</i>.

 Now let's make a variant of "cat" that optionally does "rot13" on its input.
 It's easy to do by just processing the bytes, but instead we will exploit
 Go's notion of an <i>interface</i>.

 The "cat()" subroutine uses only two methods of "fd": "Read()" and "Name()",
 so let's start by defining an interface that has exactly those two methods.
 Here is code from "progs/cat_rot13.go":

 --PROG progs/cat_rot13.go /type.Reader/ /^}/

 Any type that implements the two methods of "Reader" -- regardless of whatever
 other methods the type may also contain -- is said to <i>implement</i> the
 interface.  Since "FD.FD" implements these methods, it implements the
 "Reader" interface.  We could tweak the "cat" subroutine to accept a "Reader"
 instead of a "*FD.FD" and it would work just fine, but let's embellish a little
 first by writing a second type that implements "Reader", one that wraps an
 existing "Reader" and does "rot13" on the data. To do this, we just define
 the type and implement the methods and with no other bookkeeping,
 we have a second implementation of the "Reader" interface.

 --PROG progs/cat_rot13.go /type.Rot13/ /end.of.Rot13/

 (The "rot13" function called on line 38 is trivial and not worth reproducing.)
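
 For readers who want to experiment without the "FD" package, here is a
 self-contained sketch of the same wrapping idea in present-day Go.  All the
 names are hypothetical: "reader" mirrors the two-method interface, and
 "stringReader" is an in-memory source we invented so the example needs no
 files.  Unlike a real file read, it signals the end of input simply by
 returning zero bytes.

```go
package main

import "fmt"

// reader is a hypothetical interface with the two methods named in the text.
type reader interface {
	Read(b []byte) (n int, err error)
	Name() string
}

// stringReader is an illustrative in-memory implementation of reader.
type stringReader struct {
	data string
	pos  int
}

func (s *stringReader) Read(b []byte) (int, error) {
	n := copy(b, s.data[s.pos:])
	s.pos += n
	return n, nil
}

func (s *stringReader) Name() string { return "<string>" }

// rot13 rotates ASCII letters by 13 places and leaves other bytes alone.
func rot13(c byte) byte {
	switch {
	case 'a' <= c && c <= 'z':
		return 'a' + (c-'a'+13)%26
	case 'A' <= c && c <= 'Z':
		return 'A' + (c-'A'+13)%26
	}
	return c
}

// rotate13 wraps another reader and applies rot13 to whatever it reads.
type rotate13 struct {
	source reader
}

func (r *rotate13) Read(b []byte) (int, error) {
	n, err := r.source.Read(b)
	for i := 0; i < n; i++ {
		b[i] = rot13(b[i])
	}
	return n, err
}

func (r *rotate13) Name() string { return r.source.Name() }

// readAll drains r until a read returns no bytes.
func readAll(r reader) string {
	out := ""
	buf := make([]byte, 8)
	for {
		n, _ := r.Read(buf)
		if n == 0 {
			return out
		}
		out += string(buf[:n])
	}
}

func main() {
	var r reader = &stringReader{data: "abcdefghijklmnopqrstuvwxyz"}
	r = &rotate13{r} // wrap the original reader
	fmt.Println(readAll(r)) // prints nopqrstuvwxyzabcdefghijklm
}
```

 Note that "readAll" never learns which implementation it was given; it sees
 only the interface.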

 To use the new feature, we define a flag:

 --PROG progs/cat_rot13.go /rot13_flag/

 and use it from within a mostly unchanged "cat()" function:

 --PROG progs/cat_rot13.go /func.cat/ /^}/

 (We could also do the wrapping in "main" and leave "cat()" mostly alone, except
 for changing the type of the argument.)
 Lines 53 and 54 set it all up: If the "rot13" flag is true, wrap the "Reader"
 we received into a "Rot13" and proceed.  Note that the interface variables
 are values, not pointers: the argument is of type "Reader", not "*Reader",
 even though under the covers it holds a pointer to a "struct".

 Here it is in action:

 <pre>
 	% echo abcdefghijklmnopqrstuvwxyz | ./cat
 	abcdefghijklmnopqrstuvwxyz
 	% echo abcdefghijklmnopqrstuvwxyz | ./cat --rot13
 	nopqrstuvwxyzabcdefghijklm
 	%
 </pre>

 Fans of dependency injection may take cheer from how easily interfaces
 allow us to substitute the implementation of a file descriptor.

 Interfaces are a distinctive feature of Go.  An interface is implemented by a
 type if the type implements all the methods declared in the interface.
 This means that a type may implement an arbitrary number of different interfaces.
 There is no type hierarchy; things can be much more <i>ad hoc</i>,
 as we saw with "rot13".  "FD.FD" implements "Reader"; it could also
 implement a "Writer", or any other interface built from its methods that
 fits the current situation. Consider the <i>empty interface</i>

 <pre>
 	type Empty interface {}
 </pre>

 <i>Every</i> type implements the empty interface, which makes it
 useful for things like containers.
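
 A minimal sketch of such a container, in present-day Go, might look like
 this.  The function "describe" is hypothetical; it accepts a slice whose
 elements have the empty interface type and reports the dynamic type of each.

```go
package main

import "fmt"

// describe accepts values of any type, because every type implements the
// empty interface, and returns the name of each value's dynamic type.
func describe(items []interface{}) []string {
	out := make([]string, 0, len(items))
	for _, it := range items {
		out = append(out, fmt.Sprintf("%T", it))
	}
	return out
}

func main() {
	fmt.Println(describe([]interface{}{1, "two", 3.0})) // prints [int string float64]
}
```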

 Sorting
 ----

 As another example of interfaces, consider this simple sort algorithm,
 taken from "progs/sort.go":

 --PROG progs/sort.go /func.Sort/ /^}/

 The code needs only three methods, which we wrap into "SortInterface":

 --PROG progs/sort.go /interface/ /^}/

 We can apply "Sort" to any type that implements "len", "less", and "swap".
 The "sort" package includes the necessary methods to allow sorting of
 arrays of integers, strings, etc.; here's the code for arrays of "int":

 --PROG progs/sort.go /type.*IntArray/ /swap/

 And now a routine to test it out, from "progs/sortmain.go".  This
 uses a function in the "sort" package, omitted here for brevity,
 to test that the result is sorted.

 --PROG progs/sortmain.go /func.ints/ /^}/

 If we have a new type we want to be able to sort, all we need to do is
 to implement the three methods for that type, like this:

 --PROG progs/sortmain.go /type.Day/ /swap/
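
 To show the whole arrangement in one place, here is a sketch in present-day
 Go.  It is not the code from "progs/sort.go": the interface name "sortable",
 the capitalized method names, the insertion sort, and the "byLength" type
 are all assumptions chosen for the illustration.

```go
package main

import "fmt"

// sortable lists the three methods a type must provide to be sorted.
type sortable interface {
	Len() int
	Less(i, j int) bool
	Swap(i, j int)
}

// insertionSort is an illustrative algorithm that needs only those methods.
func insertionSort(data sortable) {
	for i := 1; i < data.Len(); i++ {
		for j := i; j > 0 && data.Less(j, j-1); j-- {
			data.Swap(j, j-1)
		}
	}
}

// byLength orders strings by their length in bytes.
type byLength []string

func (a byLength) Len() int           { return len(a) }
func (a byLength) Less(i, j int) bool { return len(a[i]) < len(a[j]) }
func (a byLength) Swap(i, j int)      { a[i], a[j] = a[j], a[i] }

func main() {
	words := byLength{"gopher", "go", "tour"}
	insertionSort(words)
	fmt.Println(words) // prints [go tour gopher]
}
```

 The sort routine never mentions strings; a new ordering needs only a new
 type with its own three methods.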

 Prime numbers
 ----

 Now we come to processes and communication -- concurrent programming.
 It's a big subject so to be brief we assume some familiarity with the topic.

 A classic program in the style is the prime sieve of Eratosthenes.
 It works by taking a stream of all the natural numbers, and introducing
 a sequence of filters, one for each prime, to winnow the multiples of
 that prime.  At each step we have a sequence of filters of the primes
 so far, and the next number to pop out is the next prime, which triggers
 the creation of the next filter in the chain.

 Here's a flow diagram; each box represents a filter element whose
 creation is triggered by the first number that flowed from the
 elements before it.

 <br>

 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<img src='sieve.gif'>

 <br>

 To create a stream of integers, we use a Go <i>channel</i>, which,
 borrowing from CSP's descendants, represents a communications
 channel that can connect two concurrent computations.
 In Go, channel variables are
 always pointers to channels -- it's the object they point to that
 does the communication.

 Here is the first function in "progs/sieve.go":

 --PROG progs/sieve.go /Send/ /^}/

 The function "Generate" sends the sequence 2, 3, 4, 5, ... to its
 argument channel, "ch", using the binary communications operator "&lt;-".
 Channels block, so if there's no recipient for the value on "ch",
 the send operation will wait until one becomes available.

 The "Filter" function has three arguments: an input channel, an output
 channel, and a prime number.  It copies values from the input to the
 output, discarding anything divisible by the prime.  The unary communications
 operator "&lt;-" (receive) retrieves the next value on the channel.

 --PROG progs/sieve.go /Copy/ /^}/

 The generator and filters execute concurrently.  Go has
 its own model of process/threads/light-weight processes/coroutines,
 so to avoid notational confusion we'll call concurrently executing
 computations in Go <i>goroutines</i>.  To start a goroutine,
 invoke the function, prefixing the call with the keyword "go";
 this starts the function running in parallel with the current
 computation but in the same address space:

 	go sum(huge_array); // calculate sum in the background

 If you want to know when the calculation is done, pass a channel
 on which it can report back:

 	ch := new(chan int);
 	go sum(huge_array, ch);
 	// ... do something else for a while
 	result := <-ch;  // wait for, and retrieve, result
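
 A complete, runnable version of that fragment might look like the following
 sketch in present-day Go, where channels are created with "make()" and are
 not written as pointers.  The names "sumTo" and "sumInBackground" are ours.

```go
package main

import "fmt"

// sumTo adds up a and reports the result on ch when it is done.
func sumTo(a []int, ch chan int) {
	s := 0
	for _, v := range a {
		s += v
	}
	ch <- s
}

// sumInBackground starts the goroutine, then waits for its answer.
func sumInBackground(a []int) int {
	ch := make(chan int)
	go sumTo(a, ch)
	// ... other work could be done here while the sum is computed
	return <-ch // wait for, and retrieve, the result
}

func main() {
	fmt.Println(sumInBackground([]int{1, 2, 3, 4})) // prints 10
}
```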

 Back to our prime sieve.  Here's how the sieve pipeline is stitched
 together:

 --PROG progs/sieve.go /func.main/ /^}/

 Line 23 creates the initial channel to pass to "Generate", which it
 then starts up.  As each prime pops out of the channel, a new "Filter"
 is added to the pipeline and <i>its</i> output becomes the new value
 of "ch".
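
 Since the pipeline is easier to follow when all three pieces are visible
 together, here is a sketch of the sieve in present-day Go.  It follows the
 description above but is not a copy of "progs/sieve.go": the lower-case
 function names, the directional channel types, and the "firstPrimes" helper
 that stops after n primes are assumptions made for the example.

```go
package main

import "fmt"

// generate sends the sequence 2, 3, 4, ... on ch.
func generate(ch chan<- int) {
	for i := 2; ; i++ {
		ch <- i
	}
}

// filter copies values from in to out, discarding multiples of prime.
func filter(in <-chan int, out chan<- int, prime int) {
	for {
		if i := <-in; i%prime != 0 {
			out <- i
		}
	}
}

// firstPrimes returns the first n primes. Each prime that pops out adds a
// new filter, whose output becomes the new value of ch.
func firstPrimes(n int) []int {
	primes := make([]int, 0, n)
	ch := make(chan int)
	go generate(ch)
	for len(primes) < n {
		prime := <-ch
		primes = append(primes, prime)
		ch1 := make(chan int)
		go filter(ch, ch1, prime)
		ch = ch1
	}
	return primes
}

func main() {
	fmt.Println(firstPrimes(10)) // prints [2 3 5 7 11 13 17 19 23 29]
}
```

 The goroutines are left blocked when "firstPrimes" returns; that is harmless
 in a short program, which exits when "main" does.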

 The sieve program can be tweaked to use a pattern common
 in this style of programming.  Here is a variant version
 of "Generate", from "progs/sieve1.go":

 --PROG progs/sieve1.go /func.Generate/ /^}/

 This version does all the setup internally. It creates the output
 channel, launches a goroutine using a function literal, and
 returns the channel to the caller.  It is a factory for concurrent
 execution, starting the goroutine and returning its connection.
 The same change can be made to "Filter":

 --PROG progs/sieve1.go /func.Filter/ /^}/

 The "Sieve" function's main loop becomes simpler and clearer as a
 result, and while we're at it let's turn it into a factory too:

 --PROG progs/sieve1.go /func.Sieve/ /^}/

 Now "main"'s interface to the prime sieve is a channel of primes:

 --PROG progs/sieve1.go /func.main/ /^}/
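
 The factory pattern can be sketched in present-day Go as follows.  This is
 our own rendering, not the contents of "progs/sieve1.go"; the lower-case
 names and the "take" helper, which reads a fixed number of values, are
 assumptions.

```go
package main

import "fmt"

// generate creates the channel, starts the sending goroutine with a
// function literal, and returns the receiving end to the caller.
func generate() <-chan int {
	ch := make(chan int)
	go func() {
		for i := 2; ; i++ {
			ch <- i
		}
	}()
	return ch
}

// filter returns a channel carrying the values of in that are not
// multiples of prime.
func filter(in <-chan int, prime int) <-chan int {
	out := make(chan int)
	go func() {
		for {
			if i := <-in; i%prime != 0 {
				out <- i
			}
		}
	}()
	return out
}

// sieve returns a channel of primes.
func sieve() <-chan int {
	out := make(chan int)
	go func() {
		ch := generate()
		for {
			prime := <-ch
			out <- prime
			ch = filter(ch, prime)
		}
	}()
	return out
}

// take reads the first n values from ch.
func take(ch <-chan int, n int) []int {
	vals := make([]int, n)
	for i := range vals {
		vals[i] = <-ch
	}
	return vals
}

func main() {
	fmt.Println(take(sieve(), 5)) // prints [2 3 5 7 11]
}
```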

 Multiplexing
 ----

 With channels, it's possible to serve multiple independent client goroutines without
 writing an actual multiplexer.  The trick is to send the server a channel in the message,
 which it will then use to reply to the original sender.
 A realistic client-server program is a lot of code, so here is a very simple substitute
 to illustrate the idea.  It starts by defining a "Request" type, which embeds a channel
 that will be used for the reply.

 --PROG progs/server.go /type.Request/ /^}/

 The server will be trivial: it will do simple binary operations on integers.  Here's the
 code that invokes the operation and responds to the request:

 --PROG progs/server.go /type.BinOp/ /^}/

 The "Server" routine loops forever, receiving requests and, to avoid blocking due to
 a long-running operation, starting a goroutine to do the actual work.

 --PROG progs/server.go /func.Server/ /^}/

 We construct a server in a familiar way, starting it up and returning a channel to
 connect to it:

 --PROG progs/server.go /func.StartServer/ /^}/

 Here's a simple test.  It starts a server with an addition operator, and sends out
 lots of requests but doesn't wait for the reply.  Only after all the requests are sent
 does it check the results.

 --PROG progs/server.go /func.main/ /^}/
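
 Here is a compact sketch of the whole arrangement in present-day Go.  It
 follows the structure described in this section but is not the text of
 "progs/server.go"; the lower-case names and the "addAll" driver, which takes
 the place of the test in "main", are assumptions.

```go
package main

import "fmt"

// request carries two operands and the channel on which to send the reply.
type request struct {
	a, b   int
	replyc chan int
}

// binOp is a binary operation on integers.
type binOp func(a, b int) int

// run invokes the operation and replies on the request's own channel.
func run(op binOp, req *request) {
	req.replyc <- op(req.a, req.b)
}

// server loops forever, starting a goroutine per request so that a slow
// operation does not hold up the loop.
func server(op binOp, service <-chan *request) {
	for {
		req := <-service
		go run(op, req)
	}
}

// startServer launches the server and returns the channel used to reach it.
func startServer(op binOp) chan<- *request {
	service := make(chan *request)
	go server(op, service)
	return service
}

// addAll sends every request first and only then collects the replies.
func addAll(pairs [][2]int) []int {
	adder := startServer(func(a, b int) int { return a + b })
	reqs := make([]*request, len(pairs))
	for i, p := range pairs {
		reqs[i] = &request{p[0], p[1], make(chan int)}
		adder <- reqs[i]
	}
	results := make([]int, len(pairs))
	for i, req := range reqs {
		results[i] = <-req.replyc
	}
	return results
}

func main() {
	fmt.Println(addAll([][2]int{{1, 2}, {3, 4}, {10, 20}})) // prints [3 7 30]
}
```

 Because each request carries its own reply channel, the replies cannot be
 mixed up even though they are computed concurrently.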

 One annoyance with this program is that it doesn't exit cleanly; when "main" returns
 there are a number of lingering goroutines blocked on communication.  To solve this,
 we provide a second, "quit" channel to the server:

 --PROG progs/server1.go /func.StartServer/ /^}/

 It passes the quit channel to the "Server" function, which uses it like this:

 --PROG progs/server1.go /func.Server/ /^}/

 Inside "Server", a "select" statement chooses which of the multiple communications
 listed by its cases can proceed.  If all are blocked, it waits until one can proceed; if
 multiple can proceed, it chooses one at random.  In this instance, the "select" allows
 the server to honor requests until it receives a quit message, at which point it
 returns, terminating its execution.
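
 The quit-channel idea can be tried in isolation with the following sketch in
 present-day Go.  It is a simplified stand-in, not the code in
 "progs/server1.go": the "worker" doubles each number it receives instead of
 serving requests, and the "done" channel, which reports how many values were
 handled, is our own addition to make the clean shutdown observable.

```go
package main

import "fmt"

// worker doubles each value it receives until something arrives on quit.
// Before returning it reports how many values it served on done.
func worker(in <-chan int, out chan<- int, quit <-chan bool, done chan<- int) {
	served := 0
	for {
		select {
		case v := <-in:
			out <- v * 2
			served++
		case <-quit:
			done <- served
			return
		}
	}
}

// runWorker feeds vals through a worker, then strobes the quit channel
// and waits for the worker to finish.
func runWorker(vals []int) (results []int, served int) {
	in, out := make(chan int), make(chan int)
	quit, done := make(chan bool), make(chan int)
	go worker(in, out, quit, done)
	for _, v := range vals {
		in <- v
		results = append(results, <-out)
	}
	quit <- true
	return results, <-done
}

func main() {
	fmt.Println(runWorker([]int{1, 2, 3})) // prints [2 4 6] 3
}
```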


 All that's left is to strobe the "quit" channel
 at the end of main:

 --PROG progs/server1.go /adder,.quit/
 ...
 --PROG progs/server1.go /quit....true/

 There's a lot more to Go programming and concurrent programming in general but this
 quick tour should give you some of the basics.