The Go Programming Language Specification (DRAFT)
----
Robert Griesemer, Rob Pike, Ken Thompson
----
(November 17, 2008)
This document is a semi-formal specification of the Go systems
programming language.
<font color=red>
This document is not ready for external review; it is under active development.
Any part may change substantially as design progresses.
</font>
<!--
Timeline (9/5/08):
- threads: 1 month
- reflection code: 2 months
- proto buf support: 3 months
- GC: 6 months
- debugger
- Jan 1, 2009: enough support to write interesting programs
Missing:
[ ] partial export of structs, methods
[ ] range statement: to be defined more reasonably
[ ] packages of multiple files
[ ] Helper syntax for composite types: allow names/indices for maps/arrays,
remove need for type in elements of composites
Todo's:
[ ] clarification on interface types, rules
[ ] clarify slice rules
[ ] clarify tuples
[ ] need to talk about precise int/floats clearly
[ ] iant suggests to use abstract/precise int for len(), cap() - good idea
(issue: what happens in len() + const - what is the type?)
[ ] need to be specific on (unsigned) integer operations: one must be able
to rely on wrap-around on overflow
[ ] what are the permissible ranges for the indices in slices? The spec
doesn't correspond to the implementation. The spec is wrong when it
comes to the first index i: it should allow (at least) the range 0 <= i <= len(a).
also: document different semantics for strings and arrays (strings cannot be grown).
Open issues:
[ ] semantics of type decl and where methods are attached
what about: type MyInt int (does it produce a new (incompatible) int)?
[ ] convert should not be used for composite literals anymore,
in fact, convert() should go away
[ ] if statement: else syntax must be fixed
[ ] old-style export decls (still needed, but ideally should go away)
[ ] like to have assert() in the language, w/ option to disable code gen for it
[ ] composite types should uniformly create an instance instead of a pointer
[ ] semantics of statements
[ ] need for type switch? (or use type guard with ok in tuple assignment?)
[ ] do we need anything on package vs file names?
[ ] type switch or some form of type test needed
[ ] what is the meaning of typeof()
[ ] at the moment: type T S; strips any methods of S. It probably shouldn't.
[ ] 6g allows: interface { f F } where F is a function type. fine, but then we should
also allow: func f F {}, where F is a function type.
[ ] provide composite literal notation to address array indices: []int{ 0: x1, 1: x2, ... }
and struct field names (both seem easy to do).
[ ] reopening & and func issue: Seems inconsistent as both &func(){} and func(){} are
permitted. Suggestion: func literals are pointers. We need to use & for all other
functions. This would be in consistency with the declaration of function pointer
variables and the use of '&' to convert methods into function pointers.
[ ] Conversions: can we say: "type T int; T(3.0)" ?
We could allow converting structurally equivalent types into each other this way.
May play together with "type T1 T2" where we give another type name to T2.
[ ] Is . import implemented / do we still need it?
[ ] Do we allow empty statements? If so, do we allow empty statements after a label?
and if so, does a label followed by an empty statement (a semicolon) still denote
a for loop that is following, and can break L be used inside it?
[ ] comparison of non-basic types: what do we allow? what do we allow in interfaces
what about maps (require ==, copy and hash)
maybe: no maps with non-basic type keys, and no interface comparison unless
with nil
[ ] consider syntactic notation for composite literals to make them parseable w/o type information
(require ()'s in control clauses)
[ ] global var decls: "var a, b, c int = 0, 0, 0" is ok, but "var a, b, c = 0, 0, 0" is not
(seems inconsistent with "var a = 0", and ":=" notation)
[ ] const decls: "const a, b = 1, 2" is not allowed - why not? Should be symmetric to vars.
Decisions in need of integration into the doc:
[ ] pair assignment is required to get map, and receive ok.
[ ] len() returns an int, new(array_type, n) n must be an int
[ ] passing a "..." arg to another "..." parameter doesn't wrap the argument again
(so "..." args can be passed down easily)
Closed:
[x] new(arraytype, n1, n2): spec only talks about length, not capacity
(should only use new(arraytype, n) - this will allow later
extension to multi-dim arrays w/o breaking the language) - documented
[x] should we have a shorter list of alias types? (byte, int, uint, float) - done
[x] reflection support
[x] syntax for var args
[x] Do composite literals create a new literal each time (gri thinks yes) (Russ is putting in a change
to this effect, essentially)
[x] comparison operators: can we compare interfaces?
[x] can we add methods to types defined in another package? (probably not)
[x] optional semicolons: too complicated and unclear
[x] anonymous types are written using a type name, which can be a qualified identifier.
this might be a problem when referring to such a field using the type name.
[x] nil and interfaces - can we test for nil, what does it mean, etc.
[x] talk about underflow/overflow of 2's complement numbers (defined vs not defined).
[x] change wording on array composite literals: the types are always fixed arrays
for array composites
[x] meaning of nil
[x] remove "any"
[x] methods for all types
[x] should binary <- be at lowest precedence level? when is a send/receive non-blocking? (NO - 9/19/08)
[x] func literal like a composite type - should probably require the '&' to get address (NO)
[x] & needed to get a function pointer from a function? (NO - there is the "func" keyword - 9/19/08)
-->
Contents
----
Introduction
Notation
Source code representation
Characters
Letters and digits
Vocabulary
Identifiers
Numeric literals
Character and string literals
Operators and delimiters
Reserved words
Declarations and scope rules
Predeclared identifiers
Exported declarations
Const declarations
Type declarations
Variable declarations
Export declarations
Types
Basic types
Arithmetic types
Booleans
Strings
Array types
Struct types
Pointer types
Map types
Channel types
Function types
Interface types
Type equality
Expressions
Operands
Constants
Qualified identifiers
Iota
Composite Literals
Function Literals
Primary expressions
Selectors
Indexes
Slices
Type guards
Calls
Parameter passing
Operators
Arithmetic operators
Comparison operators
Logical operators
Address operators
Communication operators
Constant expressions
Statements
Label declarations
Expression statements
IncDec statements
Assignments
If statements
Switch statements
For statements
Range statements
Go statements
Select statements
Return statements
Break statements
Continue statements
Label declaration
Goto statements
Function declarations
Method declarations
Predeclared functions
Length and capacity
Conversions
Allocation
Packages
Program initialization and execution
----
Introduction
----
Notation
----
The syntax is specified using Parameterized Extended Backus-Naur Form (PEBNF).
Specifically, productions are expressions constructed from terms and the
following operators:
- | separates alternatives (least binding strength)
- () groups
- [] specifies an option (0 or 1 times)
- {} specifies repetition (0 to n times)
The syntax of PEBNF can be expressed in itself:
Production = production_name [ Parameters ] "=" Expression .
Parameters = "<" production_name { "," production_name } ">" .
Expression = Alternative { "|" Alternative } .
Alternative = Term { Term } .
Term = production_name [ Arguments ] | token [ "..." token ] | Group | Option | Repetition .
Arguments = "<" Expression { "," Expression } ">" .
Group = "(" Expression ")" .
Option = "[" Expression "]" .
Repetition = "{" Expression "}" .
Lower-case production names are used to identify productions that cannot
be broken by white space or comments; they are usually tokens. Other
production names are in CamelCase.
Tokens (lexical symbols) are enclosed in double quotes '''' (the
double quote symbol is written as ''"'').
The form "a ... b" represents the set of characters from "a" through "b" as
alternatives.
Productions can be parameterized. To get the actual production the parameter is
substituted with the argument provided where the production name is used. For
instance, there are various forms of semicolon-separated lists in the grammar.
The parameterized production for such lists is:
List<P> = P { ";" P } [ ";" ] .
In this case, P stands for the actual list element.
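As an illustration only, the List<P> production can be turned into a small recognizer. The sketch below is written in present-day Go rather than the language of this draft; the function name parseList and the choice of "any token other than a semicolon" as the element P are assumptions made for the example.

```go
package main

import "fmt"

// parseList is a hypothetical recognizer for
//
//	List<P> = P { ";" P } [ ";" ] .
//
// where P is taken to be any token other than ";".
// It returns the elements found and whether the whole input matched.
func parseList(tokens []string) ([]string, bool) {
	i := 0
	// The first P is mandatory.
	if i >= len(tokens) || tokens[i] == ";" {
		return nil, false
	}
	elems := []string{tokens[i]}
	i++
	// { ";" P }: consume a separator only if an element follows it.
	for i+1 < len(tokens) && tokens[i] == ";" && tokens[i+1] != ";" {
		elems = append(elems, tokens[i+1])
		i += 2
	}
	// [ ";" ]: an optional trailing semicolon.
	if i < len(tokens) && tokens[i] == ";" {
		i++
	}
	// The input matches only if every token was consumed.
	return elems, i == len(tokens)
}

func main() {
	elems, ok := parseList([]string{"x", ";", "y", ";"})
	fmt.Println(elems, ok)
}
```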
Where possible, recursive productions are used to express evaluation order
and operator precedence syntactically (for instance for expressions).
A production may be referenced from various places in this document
but is usually defined close to its first use. Productions and code
examples are indented.
Source code representation
----
Source code is Unicode text encoded in UTF-8.
Tokenization follows the usual rules. Source text is case-sensitive.
White space is blanks, newlines, carriage returns, or tabs.
Comments are // to end of line or /* */ without nesting and are treated as white space.
Some Unicode characters (e.g., the character U+00E4) may be representable in
two forms, as a single code point or as two code points. For simplicity of
implementation, Go treats these as distinct characters.
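The following sketch, in present-day Go, shows the consequence for U+00E4: the precomposed and decomposed spellings are different code point sequences and therefore compare unequal. The names precomposed, decomposed, and sameText are chosen for the example.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

const (
	precomposed = "\u00e4"  // one code point: U+00E4
	decomposed  = "a\u0308" // two code points: U+0061 followed by U+0308
)

// sameText reports whether a and b are the same sequence of bytes.
// No Unicode normalization is applied.
func sameText(a, b string) bool { return a == b }

func main() {
	fmt.Println(sameText(precomposed, decomposed))                     // false
	fmt.Println(utf8.RuneCountInString(precomposed), len(precomposed)) // 1 2
	fmt.Println(utf8.RuneCountInString(decomposed), len(decomposed))   // 2 3
}
```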
Characters
----
In the grammar the term
utf8_char
denotes an arbitrary Unicode code point encoded in UTF-8. Similarly,
non_ascii
denotes the subset of "utf8_char" code points with values >= 128.
Letters and digits
----
letter = "A" ... "Z" | "a" ... "z" | "_" | non_ascii .
decimal_digit = "0" ... "9" .
octal_digit = "0" ... "7" .
hex_digit = "0" ... "9" | "A" ... "F" | "a" ... "f" .
All non-ASCII code points are considered letters; digits are always ASCII.
Vocabulary
----
Tokens make up the vocabulary of the Go language. They consist of
identifiers, numbers, strings, operators, and delimiters.
Identifiers
----
An identifier is a name for a program entity such as a variable, a
type, a function, etc.
identifier = letter { letter | decimal_digit } .
a
_x
ThisIsVariable9
αβ
Some identifiers are predeclared (§Declarations).
Numeric literals
----
An integer literal represents a mathematically ideal integer constant
of arbitrary precision, or 'ideal int'.
int_lit = decimal_int | octal_int | hex_int .
decimal_int = ( "1" ... "9" ) { decimal_digit } .
octal_int = "0" { octal_digit } .
hex_int = "0" ( "x" | "X" ) hex_digit { hex_digit } .
42
0600
0xBadFace
170141183460469231731687303715884105727
A floating point literal represents a mathematically ideal floating point
constant of arbitrary precision, or 'ideal float'.
float_lit =
decimals "." [ decimals ] [ exponent ] |
decimals exponent |
"." decimals [ exponent ] .
decimals = decimal_digit { decimal_digit } .
exponent = ( "e" | "E" ) [ "+" | "-" ] decimals .
0.
2.71828
1.e+0
6.67428e-11
1E6
.25
.12345E+5
Numeric literals are unsigned. A negative constant is formed by
applying the unary prefix operator "-" (§Arithmetic operators).
An 'ideal number' is either an 'ideal int' or an 'ideal float'.
An ideal number (or an arithmetic expression formed solely from ideal
numbers) is required to fit a particular size only when it is bound to a
variable or used in an expression or constant of fixed-size integers or
floats. In other words, ideal numbers and arithmetic upon them are not
subject to overflow; only their use in assignments or expressions
involving fixed-size numbers may cause overflow, and thus
an error (§Expressions).
Implementation restriction: A compiler may implement ideal numbers
by choosing a "sufficiently large" internal representation of such
numbers.
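A minimal sketch of this behavior, assuming present-day Go (where ideal numbers are called untyped constants): the large literal from the examples above cannot be stored in any fixed-size integer, yet arithmetic on it is exact and the small result can be bound to an int8. The names big, top, and Top are hypothetical.

```go
package main

import "fmt"

// big is 2^127 - 1; it does not fit in any fixed-size integer type.
const big = 170141183460469231731687303715884105727

// top is evaluated exactly at compile time and equals 127.
const top = big >> 120

// Top binds the constant to a fixed-size type; only here must it fit.
func Top() int8 { return top }

func main() {
	fmt.Println(Top()) // 127
}
```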
Character and string literals
----
Character and string literals are almost the same as in C, with the
following differences:
- The encoding is UTF-8
- `` strings exist; they do not interpret backslashes
- Octal character escapes are always 3 digits ("\077" not "\77")
- Hexadecimal character escapes are always 2 digits ("\x07" not "\x7")
The rules are:
char_lit = "'" ( unicode_value | byte_value ) "'" .
unicode_value = utf8_char | little_u_value | big_u_value | escaped_char .
byte_value = octal_byte_value | hex_byte_value .
octal_byte_value = "\" octal_digit octal_digit octal_digit .
hex_byte_value = "\" "x" hex_digit hex_digit .
little_u_value = "\" "u" hex_digit hex_digit hex_digit hex_digit .
big_u_value =
"\" "U" hex_digit hex_digit hex_digit hex_digit
hex_digit hex_digit hex_digit hex_digit .
escaped_char = "\" ( "a" | "b" | "f" | "n" | "r" | "t" | "v" | "\" | "'" | """ ) .
A unicode_value takes one of four forms:
* The UTF-8 encoding of a Unicode code point. Since Go source
text is in UTF-8, this is the obvious translation from input
text into Unicode characters.
* The usual list of C backslash escapes: "\n", "\t", etc.
Within a character or string literal, only the corresponding quote character
is a legal escape (this is not explicitly reflected in the above syntax).
* A `little u' value, such as "\u12AB". This represents the Unicode
code point with the corresponding hexadecimal value. It always
has exactly 4 hexadecimal digits.
* A `big U' value, such as "\U00101234". This represents the
Unicode code point with the corresponding hexadecimal value.
It always has exactly 8 hexadecimal digits.
Some values that can be represented this way are illegal because they
are not valid Unicode code points. These include values above
0x10FFFF and surrogate halves.
An octal_byte_value contains three octal digits. A hex_byte_value
contains two hexadecimal digits. (Note: This differs from C but is
simpler.)
It is erroneous for an octal_byte_value to represent a value larger than 255.
(By construction, a hex_byte_value cannot.)
A character literal is a form of unsigned integer constant. Its value
is that of the Unicode code point represented by the text between the
quotes.
'a'
'ä'
'本'
'\t'
'\000'
'\007'
'\377'
'\x07'
'\xff'
'\u12e4'
'\U00101234'
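The values denoted by several of these literals can be checked with a short program. This is a sketch in present-day Go, where a character literal has the integer type rune; the table name codePoints and the function allMatch are made up for the example.

```go
package main

import "fmt"

// codePoints pairs character literals from the list above with the
// integer value each one is expected to denote.
var codePoints = []struct {
	lit  rune
	want int32
}{
	{'a', 0x61},
	{'ä', 0xe4},
	{'本', 0x672c},
	{'\t', 9},
	{'\007', 7},
	{'\377', 255},
	{'\x07', 7},
	{'\xff', 255},
	{'\u12e4', 0x12e4},
	{'\U00101234', 0x101234},
}

// allMatch reports whether every literal has its expected value.
func allMatch() bool {
	for _, c := range codePoints {
		if c.lit != c.want {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(allMatch()) // true
}
```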
String literals come in two forms: double-quoted and back-quoted.
Double-quoted strings have the usual properties; back-quoted strings
do not interpret backslashes at all.
string_lit = raw_string_lit | interpreted_string_lit .
raw_string_lit = "`" { utf8_char } "`" .
interpreted_string_lit = """ { unicode_value | byte_value } """ .
A string literal has type "string" (§Strings). Its value is constructed
by taking the byte values formed by the successive elements of the
literal. For byte_values, these are the literal bytes; for
unicode_values, these are the bytes of the UTF-8 encoding of the
corresponding Unicode code points. Note that
"\u00FF"
and
"\xFF"
are
different strings: the first contains the two-byte UTF-8 expansion of
the value 255, while the second contains a single byte of value 255.
The same rules apply to raw string literals, except the contents are
uninterpreted UTF-8.
`abc`
`\n`
"hello, world\n"
"\n"
""
"Hello, world!\n"
"日本語"
"\u65e5本\U00008a9e"
"\xff\u00FF"
These examples all represent the same string:
"日本語" // UTF-8 input text
`日本語` // UTF-8 input text as a raw literal
"\u65e5\u672c\u8a9e" // The explicit Unicode code points
"\U000065e5\U0000672c\U00008a9e" // The explicit Unicode code points
"\xe6\x97\xa5\xe6\x9c\xac\xe8\xaa\x9e" // The explicit UTF-8 bytes
Adjacent strings separated only by white space (including comments)
are concatenated into a single string. The following two lines
represent the same string:
"Alea iacta est."
"Alea" /* The die */ `iacta est` /* is cast */ "."
The language does not canonicalize Unicode text or evaluate combining
forms. The text of source code is passed uninterpreted.
If the source code represents a character as two code points, such as
a combining form involving an accent and a letter, the result will be
an error if placed in a character literal (it is not a single code
point), and will appear as two code points if placed in a string
literal.
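The string literal rules of this section can be exercised as follows. The sketch uses present-day Go; the names forms and allEqual are assumptions for the example. It checks that the five spellings of "日本語" denote the same string and that "\u00FF" and "\xFF" differ in length.

```go
package main

import "fmt"

// forms lists five spellings of the same three-character string.
var forms = []string{
	"日本語",                                // UTF-8 input text
	`日本語`,                                // raw literal
	"\u65e5\u672c\u8a9e",                    // little u values
	"\U000065e5\U0000672c\U00008a9e",        // big U values
	"\xe6\x97\xa5\xe6\x9c\xac\xe8\xaa\x9e", // explicit UTF-8 bytes
}

// allEqual reports whether every string in ss equals the first one.
func allEqual(ss []string) bool {
	for _, s := range ss {
		if s != ss[0] {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(allEqual(forms))            // true
	fmt.Println(len("\u00FF"), len("\xFF")) // 2 1
}
```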
Operators and delimiters
----
The following special character sequences serve as operators or delimiters:
+ & += &= && == != ( )
- | -= |= || < <= [ ]
* ^ *= ^= <- > >= { }
/ << /= <<= ++ = := , ;
% >> %= >>= -- ! ... . :
Reserved words
----
The following words are reserved and must not be used as identifiers:
break default func interface select
case else go map struct
chan export goto package switch
const fallthrough if range type
continue for import return var
Declarations and scope rules
----
A declaration ``binds'' an identifier to a language entity (such as
a package, constant, type, struct field, variable, parameter, result,
function, method) and specifies properties of that entity such as its type.
Declaration =
[ "export" | "package" ]
( ConstDecl | TypeDecl | VarDecl | FunctionDecl | MethodDecl ) .
Except for function, method and abbreviated variable declarations (using ":="),
all declarations follow the same pattern. There is either a single declaration
of the form P, or an optional semicolon-separated list of declarations of the
form P surrounded by parentheses:
Decl<P> = P | "(" [ List<P> ] ")" .
List<P> = P { ";" P } [ ";" ] .
Every identifier in a program must be declared; some identifiers, such as "int"
and "true", are predeclared (§Predeclared identifiers).
The ``scope'' of an identifier is the extent of source text within which the
identifier denotes the bound entity. No identifier may be declared twice in a
single scope. Go is lexically scoped: An identifier denotes the entity it is
bound to only within the scope of the identifier.
For instance, for a variable named "x", the scope of identifier "x" is the
extent of source text within which "x" denotes that particular variable.
It is illegal to declare another identifier "x" within the same scope.
The scope of an identifier depends on the entity declared. The scope for
an identifier always excludes scopes redeclaring the identifier in nested
blocks. An identifier declared in a nested block is said to ``shadow'' the
same identifier declared in an outer block.
1. The scope of predeclared identifiers is the entire source file.
2. The scope of an identifier denoting a type, function or package
extends textually from the point of the identifier in the declaration
to the end of the innermost surrounding block.
3. The scope of a constant or variable extends textually from
after the declaration to the end of the innermost surrounding
block.
4. The scope of a parameter or result identifier is the body of the
corresponding function.
5. The scope of a field or method identifier is selectors for the
corresponding type containing the field or method (§Selectors).
6. The scope of a label is the body of the innermost surrounding
function and does not intersect with any non-label scope. Thus,
each function has its own private label scope.
An entity is said to be ``local'' to its scope. Declarations in the package
scope are ``global'' declarations.
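The scope and shadowing rules can be observed in a short program. This is a sketch in present-day Go, whose scoping matches the rules above for variables as far as this example goes; the variable x and the function shadowDemo are hypothetical.

```go
package main

import "fmt"

// x is a global declaration in the package scope.
var x = "package"

// shadowDemo records which x is visible at four points in its body.
func shadowDemo() []string {
	// The local x below is not yet in scope, so this is the global x.
	seen := []string{x}
	x := "function" // shadows the global x from here on
	seen = append(seen, x)
	{
		x := "nested" // shadows the function-level x inside this block
		seen = append(seen, x)
	}
	// The nested declaration has ended; the function-level x is visible again.
	seen = append(seen, x)
	return seen
}

func main() {
	fmt.Println(shadowDemo()) // [package function nested function]
}
```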
Predeclared identifiers
----
The following identifiers are predeclared:
All basic types:
bool, byte, uint8, uint16, uint32, uint64, int8, int16, int32, int64,
float32, float64, float80, string
A set of platform-specific convenience types:
uint, int, float, uintptr
The predeclared constants:
true, false, iota, nil
The predeclared functions (note: this list is likely to change):
cap(), convert(), len(), new(), panic(), panicln(), print(), println(), typeof(), ...
Exported declarations
----
Global declarations optionally may be marked for ``export'', thus making the
declared identifier accessible outside the current source file. Another source
file may then import the package (§Packages) and access exported identifiers
via qualified identifiers (§Qualified identifiers). Local declarations can
never be marked for export.
There are two kinds of exports: If a declaration in a package P is marked with
the keyword "export", the declared identifier is accessible in any file
importing P; this is called ``unrestricted export''. If a declaration is
marked with the keyword "package", the declared identifier is only accessible
in files belonging to the same package P; this is called ``package-restricted''
export.
If the identifier represents a type, it must be a complete type (§Types) and
the type structure is exported as well. In particular, if the declaration
defines a "struct" or "interface" type, all structure fields and all structure
and interface methods are exported also.
export const pi float = 3.14159265
export func Parse(source string);
package type Node *struct { val int; next *Node }
TODO: Eventually we need to be able to restrict visibility of fields and methods.
(gri) The default should be that no struct fields or methods are automatically exported.
Export should be identifier-based: an identifier is either exported or not, and thus
visible or not in importing package.
Const declarations
----
A constant declaration binds an identifier to the value of a constant
expression (§Constant expressions).
ConstDecl = "const" Decl<ConstSpec> .
ConstSpec = identifier [ CompleteType ] [ "=" Expression ] .
const pi float = 3.14159265
const e = 2.718281828
const (
one int = 1;
two = 2
)
The constant expression may be omitted, in which case the expression is
the last expression used after the reserved word "const". If no such expression
exists, the constant expression cannot be omitted.
Together with the "iota" constant generator (§Iota),
implicit repetition permits light-weight declaration of enumerated
values:
const (
Sunday = iota;
Monday;
Tuesday;
Wednesday;
Thursday;
Friday;
Partyday;
)
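For reference, the values produced by this declaration can be printed. The sketch below is present-day Go, in which the semicolons are optional; the helper dayNumbers is an assumption for the example.

```go
package main

import "fmt"

// Each omitted expression repeats "iota", which counts up from 0.
const (
	Sunday = iota
	Monday
	Tuesday
	Wednesday
	Thursday
	Friday
	Partyday
)

// dayNumbers returns the enumerated values in declaration order.
func dayNumbers() []int {
	return []int{Sunday, Monday, Tuesday, Wednesday, Thursday, Friday, Partyday}
}

func main() {
	fmt.Println(dayNumbers()) // [0 1 2 3 4 5 6]
}
```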
The initializing expression of a constant may contain only other
constants. This is illegal:
var i int = 10;
const c = i; // error
The initializing expression for a numeric constant is evaluated
using the principles described in the section on numeric literals:
constants are mathematical values given a size only upon assignment
to a variable. Intermediate values, and the constants themselves,
may require precision significantly larger than any concrete type
in the language. Thus the following is legal:
const Huge = 1 << 100;
var Four int8 = Huge >> 98;
A given numeric constant expression is, however, defined to be
either an integer or a floating point value, depending on the syntax
of the literals it comprises (123 vs. 1.0e4). This is because the
nature of the arithmetic operations depends on the type of the
values; for example, 3/2 is an integer division yielding 1, while
3./2. is a floating point division yielding 1.5. Thus
const x = 3./2. + 3/2;
yields a floating point constant of value 2.5 (1.5 + 1); its
constituent expressions are evaluated using different rules for
division.
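Both examples from this section can be checked together. The sketch assumes present-day Go, where these rules survive as untyped constant arithmetic; the accessor X is hypothetical and only binds the constant to a float64 for printing.

```go
package main

import "fmt"

// Huge needs 101 bits, more than any fixed-size integer type provides.
const Huge = 1 << 100

// Four is legal because the constant expression Huge >> 98 equals 4.
var Four int8 = Huge >> 98

// 3./2. is a floating point division (1.5); 3/2 is an integer division (1).
const x = 3./2. + 3/2

// X binds x to a fixed-size floating point type.
func X() float64 { return x }

func main() {
	fmt.Println(Four, X()) // 4 2.5
}
```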
If the type is specified, the resulting constant has the named type.
If the type is missing from the constant declaration, the constant
represents a value of arbitrary precision, either integer or floating
point, determined by the type of the initializing expression. Such
a constant may be assigned to any variable that can represent its
value accurately, regardless of type. For instance, 3 can be
assigned to any int variable but also to any floating point variable,
while 1e12 can be assigned to a float32, float64, or even int64.
It is erroneous to assign a value with a non-zero fractional
part to an integer, or if the assignment would overflow or
underflow.
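A sketch of these assignment rules in present-day Go follows. The names three, trillion, i, f, and n are chosen for the example, and float64 stands in for the floating point types of this draft.

```go
package main

import "fmt"

const three = 3       // integer constant without a type
const trillion = 1e12 // floating point constant with a zero fractional part

var (
	i int     = three    // an integer constant fits an int variable
	f float64 = three    // the same constant fits a floating point variable
	n int64   = trillion // 1e12 is a whole number within the range of int64
)

// The following would be rejected at compile time, because the value has
// a non-zero fractional part:
//
//	var bad int = 2.5

func main() {
	fmt.Println(i, f, n) // 3 3 1000000000000
}
```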
Type declarations
----
A type declaration specifies a new type and binds an identifier to it.
The identifier is called the ``type name''; it denotes the type.
TypeDecl = "type" Decl<TypeSpec> .
TypeSpec = identifier Type .
A struct or interface type may be forward-declared (§Struct types,
§Interface types). A forward-declared type is incomplete (§Types)
until it is fully declared. The full declaration must follow
within the same block containing the forward declaration.
type IntArray [16] int
type (
Point struct { x, y float };
Polar Point
)
type TreeNode struct {
left, right *TreeNode;
value Point;
}
type Comparable interface {
cmp(Comparable) int
}
Variable declarations
----
A variable declaration creates a variable, binds an identifier to it and
gives it a type. It may optionally give the variable an initial value.
The variable type must be a complete type (§Types).
In some forms of declaration the type of the initial value defines the type
of the variable.
VarDecl = "var" Decl<VarSpec> .
VarSpec = IdentifierList ( CompleteType [ "=" ExpressionList ] | "=" ExpressionList ) .
IdentifierList = identifier { "," identifier } .
ExpressionList = Expression { "," Expression } .
var i int
var u, v, w float
var k = 0
var x, y float = -1.0, -2.0
var (
i int;
u, v = 2.0, 3.0
)
If the expression list is present, it must have the same number of elements
as there are variables in the variable specification.
If the variable type is omitted, an initialization expression (or expression
list) must be present, and the variable type is the type of the expression
value (in case of a list of variables, the variables assume the types of the
corresponding expression values).
If the variable type is omitted, and the corresponding initialization expression
is a constant expression of ideal int or ideal float type, the type
of the variable is "int" or "float" respectively:
var i = 0 // i has int type
var f = 3.1415 // f has float type
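The inferred types can be inspected at run time. This sketch uses present-day Go, where the default floating point type is named float64 rather than float; the function typeNames is an assumption for the example.

```go
package main

import "fmt"

var i = 0      // no type given: defaults to int
var f = 3.1415 // no type given: defaults to the floating point default type

// typeNames returns the names of the types inferred for i and f.
func typeNames() (string, string) {
	return fmt.Sprintf("%T", i), fmt.Sprintf("%T", f)
}

func main() {
	a, b := typeNames()
	fmt.Println(a, b) // int float64
}
```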
The syntax
SimpleVarDecl = identifier ":=" Expression .
is shorthand for
var identifier = Expression.
i := 0
f := func() int { return 7; }
ch := new(chan int);
Also, in some contexts such as "if", "for", or "switch" statements,
this construct can be used to declare local temporary variables.
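A brief sketch of such temporaries in present-day Go follows; the functions classify and sumTo are hypothetical. In each case the variable introduced with ":=" exists only for the duration of the statement.

```go
package main

import "fmt"

// classify declares r in the "if" statement; r is not visible afterwards.
func classify(n int) string {
	if r := n % 2; r == 0 {
		return "even"
	}
	return "odd"
}

// sumTo declares the loop counter i in the "for" statement.
func sumTo(n int) int {
	total := 0
	for i := 1; i <= n; i++ {
		total += i
	}
	return total
}

func main() {
	fmt.Println(classify(4), classify(7), sumTo(4)) // even odd 10
}
```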
Export declarations
----
TODO:
1) rephrase this section (much of it covered by Exported declarations)
2) rethink need for this kind of export
Global identifiers may be exported, thus making the
exported identifier visible outside the package. Another package may
then import the identifier to use it.
Export declarations must only appear at the global level of a
source file and can name only globally-visible identifiers.
That is, one can export global functions, types, and so on but not
local variables or structure fields.
Exporting an identifier makes the identifier visible externally to the
package. If the identifier represents a type, it must be a complete
type (§Types) and the type structure is
exported as well. The exported identifiers may appear later in the
source than the export directive itself, but it is an error to specify
an identifier not declared anywhere in the source file containing the
export directive.
ExportDecl = [ "package" ] "export" ExportIdentifier { "," ExportIdentifier } .
ExportIdentifier = QualifiedIdent .
export sin, cos
export math.abs
Types
----
A type specifies the set of values that variables of that type may assume
and the operators that are applicable.
A type may be specified by a type name (§Type declarations)
or a type literal.
Type = TypeName | TypeLit .
TypeName = QualifiedIdent.
TypeLit =
ArrayType | StructType | PointerType | FunctionType |
ChannelType | MapType | InterfaceType .
There are basic types and composite types. Basic types are predeclared and
denoted by their type names.
Composite types are arrays, maps, channels, structures, functions, pointers,
and interfaces. They are constructed from other (basic or composite) types
and denoted by their type names or by type literals.
Types may be ``complete'' or ``incomplete''. Basic, pointer, function and
interface types are always complete (although their components, such
as the base type of a pointer type, may be incomplete). All other types are
complete when they are fully declared. Incomplete types are subject to
usage restrictions; for instance the type of a variable must be complete
where the variable is declared.
CompleteType = Type .
The ``interface'' of a type is the set of methods bound to it
(§Method declarations). The interface of a pointer type is the interface
of the pointer base type (§Pointer types). All types have an interface;
if they have no methods associated with them, their interface is
called the ``empty'' interface.
TODO: Since methods are added one at a time, the interface of a type may
be different at different points in the source text. Thus, static checking
may give different results than dynamic checking, which is problematic.
Need to resolve.
The ``static type'' (or simply ``type'') of a variable is the type defined by
the variable's declaration. The ``dynamic type'' of a variable is the actual
type of the value stored in a variable at run-time. Except for variables of
interface type, the dynamic type of a variable is always its static type.
Variables of interface type may hold values with different dynamic types
during execution. However, the dynamic type of such a variable is always
compatible with the static type of the interface variable (§Interface types).
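The distinction can be illustrated in present-day Go. In this sketch the interface Shape, the types Square and Circle, and the function dynamicTypes are all made up for the example, and the standard reflect package is used to name the dynamic type; this draft does not define that package.

```go
package main

import (
	"fmt"
	"reflect"
)

// Shape is the static type of the variable s below.
type Shape interface {
	Area() float64
}

type Square struct{ side float64 }

func (q Square) Area() float64 { return q.side * q.side }

type Circle struct{ radius float64 }

func (c Circle) Area() float64 { return 3.14159 * c.radius * c.radius }

// dynamicTypes stores two different values in one interface variable and
// records the name of the dynamic type after each assignment.
func dynamicTypes() []string {
	var s Shape // static type: Shape
	var names []string
	s = Square{2}
	names = append(names, reflect.TypeOf(s).Name())
	s = Circle{1}
	names = append(names, reflect.TypeOf(s).Name())
	return names
}

func main() {
	fmt.Println(dynamicTypes()) // [Square Circle]
}
```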
Basic types
----
Go defines a number of basic types, referred to by their predeclared
type names. These include traditional arithmetic types, booleans,
and strings.
Arithmetic types
----
The following list enumerates all platform-independent numeric types:
byte same as uint8 (for convenience)
uint8 the set of all unsigned 8-bit integers
uint16 the set of all unsigned 16-bit integers
uint32 the set of all unsigned 32-bit integers
uint64 the set of all unsigned 64-bit integers
int8 the set of all signed 8-bit integers, in 2's complement
int16 the set of all signed 16-bit integers, in 2's complement
int32 the set of all signed 32-bit integers, in 2's complement
int64 the set of all signed 64-bit integers, in 2's complement
float32 the set of all valid IEEE-754 32-bit floating point numbers
float64 the set of all valid IEEE-754 64-bit floating point numbers
float80 the set of all valid IEEE-754 80-bit floating point numbers
Additionally, Go declares a set of platform-specific numeric types for
convenience:
uint at least 32 bits, at most the size of the largest uint type
int at least 32 bits, at most the size of the largest int type
float at least 32 bits, at most the size of the largest float type
uintptr smallest uint type large enough to store the uninterpreted
bits of a pointer value
For instance, int might have the same size as int32 on a 32-bit
architecture, or int64 on a 64-bit architecture.
Except for byte, which is an alias for uint8, all numeric types
are different from each other to avoid portability issues. Conversions
are required when different numeric types are mixed in an expression or assignment.
For instance, int32 and int are not the same type even though they may have
the same size on a particular platform.
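The need for explicit conversions looks as follows in present-day Go, where the same rule holds. The functions sum and byteIsUint8 are hypothetical names for this sketch.

```go
package main

import "fmt"

// sum mixes an int32 with an int. Writing "a + b" directly would be a
// compile-time error (mismatched types), so a is converted explicitly.
func sum(a int32, b int) int {
	return int(a) + b
}

// byteIsUint8 assigns a byte to a uint8 without a conversion, which is
// legal because byte is only another name for uint8.
func byteIsUint8() bool {
	var b byte = 200
	var u uint8 = b
	return u == 200
}

func main() {
	fmt.Println(sum(2, 3), byteIsUint8()) // 5 true
}
```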
Booleans
----
The type "bool" comprises the truth values true and false, which are
available through the two predeclared constants, "true" and "false".
Strings
----
The string type represents the set of string values (strings).
Strings behave like arrays of bytes, with the following properties:
- They are immutable: after creation, it is not possible to change the
contents of a string.
- No internal pointers: it is illegal to create a pointer to an inner
element of a string.
- They can be indexed: given string "s1", "s1[i]" is a byte value.
- They can be concatenated: given strings "s1" and "s2", "s1 + s2" is a value
combining the elements of "s1" and "s2" in sequence.
- Known length: the length of a string "s1" can be obtained by calling
"len(s1)". The length of a string is the number
of bytes within. Unlike in C, there is no terminal NUL byte.
- Creation 1: a string can be created from an integer value by a conversion;
the result is a string containing the UTF-8 encoding of that code point
(§Conversions).
"string('x')" yields "x"; "string(0x1234)" yields the equivalent of "\u1234"
- Creation 2: a string can be created from an array of integer values (maybe
just array of bytes) by a conversion (§Conversions):
a [3]byte; a[0] = 'a'; a[1] = 'b'; a[2] = 'c'; string(a) == "abc";
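The string properties listed above can be tried out in present-day Go, with two differences from this draft that the sketch has to assume: an integer is converted via the rune type, and a fixed array of bytes must be sliced (a[:]) before conversion. The functions fromCodePoint and fromBytes are hypothetical.

```go
package main

import "fmt"

// fromCodePoint yields the UTF-8 encoding of code point c (Creation 1).
func fromCodePoint(c rune) string { return string(c) }

// fromBytes builds a string from an array of bytes (Creation 2).
func fromBytes(a [3]byte) string { return string(a[:]) }

func main() {
	s1, s2 := "日本", "語"
	fmt.Println(len(s1))      // 6: the length counts bytes, not characters
	fmt.Println(s1[0])        // 230: indexing yields a byte value (0xe6)
	fmt.Println(s1 + s2)      // 日本語
	fmt.Println(fromCodePoint('x'), fromCodePoint(0x1234) == "\u1234")
	fmt.Println(fromBytes([3]byte{'a', 'b', 'c'})) // abc
}
```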
Array types
----
An array is a composite type consisting of a number of elements all of the same
type, called the element type. The number of elements of an array is called its
length; it is never negative (it may be zero). The elements of an array are
designated by indices which are integers between 0 and the length - 1.
An array type specifies the array element type and an optional array
length which must be a compile-time constant expression of a (signed or
unsigned) int type. If present, the array length and its value are part of
the array type. The element type must be a complete type (§Types).
If the length is present in the declaration, the array is called
``fixed array''; if the length is absent, the array is called ``open array''.
ArrayType = "[" [ ArrayLength ] "]" ElementType .
ArrayLength = Expression .
ElementType = CompleteType .
The length of an array "a" can be discovered using the built-in function
len(a)
If "a" is a fixed array, the length is known at compile-time and "len(a)" can
be evaluated to a compile-time constant. If "a" is an open array, then "len(a)"
will only be known at run-time.
The amount of space actually allocated to hold the array data may be larger
than the current array length; this maximum array length is called the array
capacity. The capacity of an array "a" can be discovered using the built-in
function
cap(a)
and the following relationship between "len()" and "cap()" holds:
0 <= len(a) <= cap(a)
Allocation: An open array may only be used as a function parameter type, or
as element type of a pointer type. There are no other variables
(besides parameters), struct or map fields of open array type; they must be
pointers to open arrays. For instance, an open array may have a fixed array
element type, but a fixed array must not have an open array element type
(though it may have a pointer to an open array). Thus, for now, there are
only ``one-dimensional'' open arrays.
The following are legal array types:
[32] byte
[2*N] struct { x, y int32 }
[1000]*[] float64
[] int
[][1024] byte
Variables of fixed arrays may be declared statically:
var a [32] byte
var m [1000]*[] float64
Fixed and open arrays may be allocated dynamically via the built-in function
"new()" which takes an array type and zero or one array lengths as parameters,
depending on the number of open arrays in the type:
new([32] byte) // *[32] byte
new([]int, 100); // *[100] int
new([][1024] byte, 4); // *[4][1024] byte
Assignment compatibility: Fixed arrays are assignment compatible to variables
of the same type, or to open arrays with the same element type. Open arrays
may only be assigned to other open arrays with the same element type.
For the variables:
var fa, fb [32] int
var fc [64] int
var pa, pb *[] int
var pc *[][32] int
the following assignments are legal, and cause the respective array elements
to be copied:
fa = fb;
pa = pb;
*pa = *pb;
fa = *pc[7];
*pa = fa;
*pb = fc;
*pa = *pc[11];
The following assignments are illegal:
fa = *pa; // cannot assign open array to fixed array
*pc[7] = *pa; // cannot assign open array to fixed array
fa = fc; // different fixed array types
*pa = *pc; // different element types of open arrays
Array indexing: Given a (pointer to an) array variable "a", an array element
is specified with an array index operation:
a[i]
This selects the array element at index "i". "i" must be within array bounds,
that is "0 <= i < len(a)".
Array slicing: Given a (pointer to an) array variable "a", a sub-array is
specified with an array slice operation:
a[i : j]
This selects the sub-array consisting of the elements "a[i]" through "a[j - 1]"
(exclusive "a[j]"). "i" must be within array bounds, and "j" must satisfy
"i <= j <= cap(a)". The length of the new slice is "j - i". The capacity of
the slice is "cap(a) - i"; thus if "i" is 0, the array capacity does not change
as a result of a slice operation. An array slice is always an open array.
Note that a slice operation does not ``crop'' the underlying array, it only
provides a new ``view'' to an array. If the capacity of an array is larger
than its length, slicing can be used to ``grow'' an array:
// allocate an open array of bytes with length i and capacity 100
i := 10;
a := new([] byte, 100) [0 : i];
// grow the array by n bytes, with i + n <= 100
a = a[0 : i + n];
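The length and capacity rules for slicing can be checked with a small sketch. It is written in present-day Go, where the allocation is spelled make([]byte, 100) rather than new([] byte, 100) as in this draft; the helper grow is hypothetical.

```go
package main

import "fmt"

// grow extends a by n elements without reallocating.
// It assumes len(a)+n <= cap(a); otherwise the slice expression panics.
func grow(a []byte, n int) []byte {
	return a[0 : len(a)+n]
}

func main() {
	i := 10
	a := make([]byte, 100)[0:i] // length i, capacity 100
	fmt.Println(len(a), cap(a))

	a = grow(a, 5) // same underlying storage, longer view
	fmt.Println(len(a), cap(a))

	b := a[4:8]                 // a view starting at index 4
	fmt.Println(len(b), cap(b)) // length j - i, capacity cap(a) - i
}
```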
TODO: Expand on details of slicing and assignment, especially between pointers
to arrays and arrays.
Struct types
----
A struct is a composite type consisting of a fixed number of elements,
called fields, with possibly different types. A struct type declares
an identifier and type for each field. Within a struct type no field
identifier may be declared twice and all field types must be complete
types (§Types).
StructType = "struct" [ "{" [ List<FieldDecl> ] "}" ] .
FieldDecl = (IdentifierList CompleteType | TypeName) [ Tag ] .
Tag = string_lit .
// An empty struct.
struct {}
// A struct with 5 fields.
struct {
x, y int;
u float;
a *[]int;
f *();
}
A struct may contain ``anonymous fields'', which are declared with a type
but no explicit field identifier. An anonymous field type must be specified as
a type name "T", or as a pointer to a type name ``*T'', and T itself may not be
a pointer or interface type. The unqualified type acts as the field identifier.
// A struct with four anonymous fields of type T1, *T2, P.T3 and *P.T4
struct {
T1; // the field name is T1
*T2; // the field name is T2
P.T3; // the field name is the unqualified type name T3
*P.T4; // the field name is the unqualified type name T4
x, y int;
}
The unqualified type name of an anonymous field must not conflict with the
field identifier (or unqualified type name for an anonymous field) of any
other field within the struct. The following declaration is illegal:
struct {
T; // conflicts with anonymous field *T and *P.T
*T; // conflicts with anonymous field T and *P.T
*P.T; // conflicts with anonymous field T and *T
}
Fields and methods (§Method declarations) of an anonymous field become directly
accessible as fields and methods of the struct without the need to provide the
type name of the respective anonymous field (§Selectors).
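As a rough illustration of how anonymous fields behave, the sketch below uses present-day Go syntax. The types Point, Label, and Pin are hypothetical and do not appear in this document.

```go
package main

import "fmt"

// Point and Label are hypothetical types used only for this illustration.
type Point struct{ x, y int }

// Sum is a method of Point; it becomes accessible through Pin as well.
func (p Point) Sum() int { return p.x + p.y }

type Label struct{ text string }

// Pin has two anonymous fields; their type names act as field names.
type Pin struct {
	Point
	*Label
	id int
}

func main() {
	pin := Pin{Point{1, 2}, &Label{"home"}, 7}
	fmt.Println(pin.x, pin.Point.x) // promoted field and its explicit form
	fmt.Println(pin.Sum())          // promoted method
	fmt.Println(pin.text)           // reached through the *Label pointer
}
```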
A field declaration may be followed by an optional string literal tag which
becomes an ``attribute'' for all the identifiers in the corresponding
field declaration. The tags are available via the reflection library but
are ignored otherwise. A tag may contain arbitrary application-specific
information.
// A struct corresponding to the EventIdMessage protocol buffer.
// The tag strings contain the protocol buffer field tags.
struct {
time_usec uint64 "1";
server_ip uint32 "2";
process_id uint32 "3";
}
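One way to read such tags through reflection is sketched below. It relies on the present-day "reflect" package, which postdates this draft; the type name EventIdMessage and the helper fieldTags are chosen for illustration only.

```go
package main

import (
	"fmt"
	"reflect"
)

// EventIdMessage mirrors the struct above; the type name is illustrative.
type EventIdMessage struct {
	time_usec  uint64 "1"
	server_ip  uint32 "2"
	process_id uint32 "3"
}

// fieldTags returns the tag string of every field of the struct value v,
// in declaration order. It assumes v is a struct.
func fieldTags(v interface{}) []string {
	t := reflect.TypeOf(v)
	tags := make([]string, 0, t.NumField())
	for i := 0; i < t.NumField(); i++ {
		tags = append(tags, string(t.Field(i).Tag))
	}
	return tags
}

func main() {
	fmt.Println(fieldTags(EventIdMessage{}))
}
```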
Forward declaration:
A struct type consisting of only the reserved word "struct" may be used in
a type declaration; it declares an incomplete struct type (§Type declarations).
This allows the construction of mutually recursive types such as:
type S2 struct // forward declaration of S2
type S1 struct { s2 *S2 }
type S2 struct { s1 *S1 }
Assignment compatibility: Structs are assignment compatible to variables of
equal type only.
Pointer types
----
A pointer type denotes the set of all pointers to variables of a given
type, called the ``base type'' of the pointer, and the value "nil".
PointerType = "*" BaseType .
BaseType = Type .
*int
*map[string] *chan
The pointer base type may be denoted by an identifier referring to an
incomplete type (§Types), possibly declared via a forward declaration.
This allows the construction of recursive and mutually recursive types
such as:
type S struct { s *S }
type S2 struct // forward declaration of S2
type S1 struct { s2 *S2 }
type S2 struct { s1 *S1 }
Assignment compatibility: A pointer is assignment compatible to a variable
of pointer type, only if both types are equal.
Pointer arithmetic of any kind is not permitted.
Map types
----
A map is a composite type consisting of a variable number of entries
called (key, value) pairs. For a given map, the keys and values must
each be of a specific complete type (§Types) called the key and value type,
respectively. Upon creation, a map is empty and values may be added and removed
during execution. The number of entries in a map is called its length.
MapType = "map" "[" KeyType "]" ValueType .
KeyType = CompleteType .
ValueType = CompleteType .
map [string] int
map [struct { pid int; name string }] *chan Buffer
map [string] any
The length of a map "m" can be discovered using the built-in function
len(m)
Allocation: A map may only be used as a base type of a pointer type.
There are no variables, parameters, array, struct, or map fields of
map type, only of pointers to maps.
Assignment compatibility: A pointer to a map type is assignment
compatible to a variable of pointer to map type only if both types
are equal.
Channel types
----
A channel provides a mechanism for two concurrently executing functions
to synchronize execution and exchange values of a specified type. This
type must be a complete type (§Types).
Upon creation, a channel can be used both to send and to receive.
By conversion or assignment, a channel may be constrained only to send or
to receive. This constraint is called a channel's ``direction''; either
bi-directional (unconstrained), send, or receive.
ChannelType = Channel | SendChannel | RecvChannel .
Channel = "chan" ValueType .
SendChannel = "chan" "<-" ValueType .
RecvChannel = "<-" "chan" ValueType .
chan T // can send and receive values of type T
chan <- float // can only be used to send floats
<-chan int // can receive only ints
Channel variables always have type pointer to channel.
It is an error to attempt to use a channel value and in
particular to dereference a channel pointer.
var ch *chan int;
ch = new(chan int); // new returns type *chan int
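Channel directions can be sketched as follows. This is present-day Go, which differs from the draft: channels are allocated with make and are not handled through pointers, and the close and range operations used here are not described in this section. The functions produce and sum are hypothetical.

```go
package main

import "fmt"

// produce can only send on out.
func produce(out chan<- int, n int) {
	for i := 0; i < n; i++ {
		out <- i
	}
	close(out)
}

// sum can only receive from in; it adds values until in is closed.
func sum(in <-chan int) int {
	total := 0
	for v := range in {
		total += v
	}
	return total
}

func main() {
	ch := make(chan int) // bi-directional channel
	go produce(ch, 5)    // passing ch constrains it to the send direction
	fmt.Println(sum(ch)) // passing ch constrains it to the receive direction
}
```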
TODO(gri): Do we need the channel conversion? It's enough to just keep
the assignment rule.
Function types
----
A function type denotes the set of all functions with the same parameter
and result types.
FunctionType = "(" [ ParameterList ] ")" [ Result ] .
ParameterList = ParameterDecl { "," ParameterDecl } .
ParameterDecl = [ IdentifierList ] ( Type | "..." ) .
Result = Type | "(" ParameterList ")" .
In ParameterList, the parameter names (IdentifierList) either must all be
present, or all be absent. If the parameters are named, each name stands
for one parameter of the specified type. If the parameters are unnamed, each
type stands for one parameter of that type.
For the last incoming parameter only, instead of a parameter type one
may write "...". The ellipsis indicates that the last parameter stands
for an arbitrary number of additional arguments of any type (including
no additional arguments). If the parameters are named, the identifier
list immediately preceding "..." must contain only one identifier (the
name of the last parameter).
()
(x int)
() int
(string, float, ...)
(a, b int, z float) bool
(a, b int, z float) (bool)
(a, b int, z float, opt ...) (success bool)
(int, int, float) (float, *[]int)
A variable can hold only a pointer to a function, not a function value.
In particular, v := func() {} creates a variable of type *(). To call the
function referenced by v, one writes v(). It is illegal to dereference a
function pointer.
Assignment compatibility: A function pointer can be assigned to a function
(pointer) variable only if both function types are equal.
Interface types
----
Type interfaces may be specified explicitly by interface types.
An interface type denotes the set of all types that implement at least
the set of methods specified by the interface type, and the value "nil".
InterfaceType = "interface" [ "{" [ List<MethodSpec> ] "}" ] .
MethodSpec = IdentifierList FunctionType .
// A basic file interface.
interface {
Read, Write (b Buffer) bool;
Close ();
}
Any type (including interface types) whose interface has, possibly as a
subset, the complete set of methods of an interface I is said to implement
interface I. For instance, if two types S1 and S2 have the methods
func (p T) Read(b Buffer) bool { return ... }
func (p T) Write(b Buffer) bool { return ... }
func (p T) Close() { ... }
(where T stands for either S1 or S2) then the File interface is
implemented by both S1 and S2, regardless of what other methods
S1 and S2 may have or share.
All types implement the empty interface:
interface {}
In general, a type implements an arbitrary number of interfaces.
For instance, consider the interface
type Lock interface {
lock, unlock ();
}
If S1 and S2 also implement
func (p T) lock() { ... }
func (p T) unlock() { ... }
they implement the Lock interface as well as the File interface.
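A sketch of one type implementing both interfaces follows. It uses present-day Go syntax, in which each method is listed separately in an interface; Buffer is a stand-in for the type mentioned above, and the method bodies are placeholders.

```go
package main

import "fmt"

// Buffer is a stand-in for the Buffer type mentioned in the text.
type Buffer []byte

type File interface {
	Read(b Buffer) bool
	Write(b Buffer) bool
	Close()
}

type Lock interface {
	lock()
	unlock()
}

// S1 is a hypothetical type that has all five methods.
type S1 struct{ locked bool }

func (p *S1) Read(b Buffer) bool  { return len(b) > 0 }
func (p *S1) Write(b Buffer) bool { return len(b) > 0 }
func (p *S1) Close()              {}
func (p *S1) lock()               { p.locked = true }
func (p *S1) unlock()             { p.locked = false }

func main() {
	s := &S1{}
	var f File = s        // no declaration of intent is needed
	var l Lock = s        // the same value also satisfies Lock
	var e interface{} = s // every type implements the empty interface

	l.lock()
	fmt.Println(f.Read(Buffer("x")), s.locked, e != nil)
}
```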
Forward declaration:
An interface type consisting of only the reserved word "interface" may be used in
a type declaration; it declares an incomplete interface type (§Type declarations).
This allows the construction of mutually recursive types such as:
type T2 interface
type T1 interface {
foo(T2) int;
}
type T2 interface {
bar(T1) int;
}
Assignment compatibility: A value can be assigned to an interface variable
if the static type of the value implements the interface or if the value is "nil".
Type equality
----
Types may be ``different'', ``structurally equal'', or ``identical''.
Go is a type-safe language; generally different types cannot be mixed
in binary operations, and values cannot be assigned to variables of different
types. However, values may be assigned to variables of structurally
equal types. Finally, type guards succeed only if the dynamic type
is identical to or implements the type tested against (§Type guards).
Structural type equality (equality for short) is defined by these rules:
Two type names denote equal types if the types in the corresponding declarations
are equal. Two type literals specify equal types if they have the same
literal structure and corresponding components have equal types. Loosely
speaking, two types are equal if their values have the same layout in memory.
More precisely:
- Two array types are equal if they have equal element types and if they
are either fixed arrays with the same array length, or they are open
arrays.
- Two struct types are equal if they have the same number of fields in the
same order, corresponding fields are either both named or both anonymous,
and corresponding field types are equal. Note that field names
do not have to match.
- Two pointer types are equal if they have equal base types.
- Two function types are equal if they have the same number of parameters
and result values and if corresponding parameter and result types are
equal (a "..." parameter is equal to another "..." parameter).
Note that parameter and result names do not have to match.
- Two channel types are equal if they have equal value types and
the same direction.
- Two map types are equal if they have equal key and value types.
- Two interface types are equal if they have the same set of methods
with the same names and equal function types. Note that the order
of the methods in the respective type declarations is irrelevant.
Type identity is defined by these rules:
Two type names denote identical types if they originate in the same
type declaration. Two type literals specify identical types if they have the
same literal structure and corresponding components have identical types.
More precisely:
- Two array types are identical if they have identical element types and if
they are either fixed arrays with the same array length, or they are open
arrays.
- Two struct types are identical if they have the same number of fields in
the same order, corresponding fields either have both the same name or
are both anonymous, and corresponding field types are identical.
- Two pointer types are identical if they have identical base types.
- Two function types are identical if they have the same number of
parameters and result values both with the same (or absent) names, and
if corresponding parameter and result types are identical (a "..."
parameter is identical to another "..." parameter with the same name).
- Two channel types are identical if they have identical value types and
the same direction.
- Two map types are identical if they have identical key and value types.
- Two interface types are identical if they have the same set of methods
with the same names and identical function types. Note that the order
of the methods in the respective type declarations is irrelevant.
Note that the type denoted by a type name is identical only to the type literal
in the type name's declaration.
Finally, two types are different if they are not structurally equal.
(By definition, they cannot be identical, either).
For instance, given the declarations
type (
T0 []string;
T1 []string;
T2 struct { a, b int };
T3 struct { a, c int };
T4 *(int, float) *T0;
T5 *(x int, y float) *[]string
)
these are some types that are equal
T0 and T0
T0 and []string
T2 and T3
T4 and T5
T3 and struct { a int; int }
and these are some types that are identical
T0 and T0
[]int and []int
struct { a, b *T5 } and struct { a, b *T5 }
As an example, "T0" and "T1" are equal but not identical because they have
different declarations.
Expressions
----
An expression specifies the computation of a value via the application of
operators and function invocations on operands. An expression has a value and
a type.
The type of a constant expression may be an ideal number. The type of such expressions
is implicitly converted into the 'expected numeric type' required for the expression.
The conversion is legal if the (ideal) expression value is a member of the
set represented by the expected numeric type. In all other cases, and specifically
if the expected type is not a numeric type, the expression is erroneous.
For instance, if the expected numeric type is a uint32, any ideal number
which fits into a uint32 without loss of precision can be legally converted.
Thus, the values 991, 42.0, and 1e9 are ok, but -1, 3.14, or 1e100 are not.
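The uint32 example can be reproduced with the sketch below, written in present-day Go, where ideal numbers are called untyped constants. The constant big and the function asUint32 are made up for illustration; the illegal cases are left as comments because they do not compile.

```go
package main

import "fmt"

const big = 1e9 // an untyped ("ideal") constant

// asUint32 returns the ideal constant big converted to uint32.
func asUint32() uint32 {
	var u uint32 = big // legal: 1e9 is an integral value that fits in uint32
	return u
}

func main() {
	var a uint32 = 991
	var b uint32 = 42.0 // legal: the value is integral
	fmt.Println(a, b, asUint32())

	// Each of the following is rejected at compile time:
	// var c uint32 = -1
	// var d uint32 = 3.14
	// var e uint32 = 1e100
}
```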
<!--
TODO(gri) This may be overly constraining. What about "len(a) + c" where
c is an ideal number? Is len(a) of type int, or of an ideal number? Probably
should be ideal number, because for fixed arrays, it is a constant.
-->
If an exceptional condition occurs during the evaluation of an expression
(that is, if the result is not mathematically defined or not in the range
of representable values for its type), the behavior is undefined. For
instance, the behavior of integer under- or overflow is not defined.
Operands
----
Operands denote the elementary values in an expression.
Operand = Literal | QualifiedIdent | "(" Expression ")" .
Literal = BasicLit | CompositeLit | FunctionLit .
BasicLit = int_lit | float_lit | char_lit | string_lit .
Constants
----
An operand is called ``constant'' if it is a literal of a basic type
(including the predeclared constants "true" and "false", and the values
denoted by "iota"), the predeclared constant "nil", or a parenthesized
constant expression (§Constant expressions). Constants have values that
are known at compile-time.
Qualified identifiers
----
A qualified identifier is an identifier qualified by a package name.
TODO(gri) expand this section.
QualifiedIdent = { PackageName "." } identifier .
PackageName = identifier .
Iota
----
Within a declaration, the predeclared operand "iota"
represents successive elements of an integer sequence.
It is reset to zero whenever the reserved word "const"
introduces a new declaration and increments as each identifier
is declared. For instance, "iota" can be used to construct
a set of related constants:
const (
enum0 = iota; // sets enum0 to 0, etc.
enum1 = iota;
enum2 = iota
)
const (
a = 1 << iota; // sets a to 1 (iota has been reset)
b = 1 << iota; // sets b to 2
c = 1 << iota; // sets c to 4
)
const x = iota; // sets x to 0
const y = iota; // sets y to 0
Since the expression in constant declarations repeats implicitly
if omitted, the first two examples above can be abbreviated:
const (
enum0 = iota; // sets enum0 to 0, etc.
enum1;
enum2
)
const (
a = 1 << iota; // sets a to 1 (iota has been reset)
b; // sets b to 2
c; // sets c to 4
)
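The declarations above can be checked with a short program. This sketch uses present-day Go syntax (no semicolons between declarations); the constant names are taken from the examples.

```go
package main

import "fmt"

const (
	enum0 = iota
	enum1 // the expression "iota" repeats implicitly
	enum2
)

const (
	a = 1 << iota // iota has been reset by the new "const"
	b
	c
)

const x = iota
const y = iota

func main() {
	fmt.Println(enum0, enum1, enum2) // 0 1 2
	fmt.Println(a, b, c)             // 1 2 4
	fmt.Println(x, y)                // 0 0
}
```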
Composite Literals
----
Literals for composite data structures consist of the type of the value
followed by a braced expression list for array and structure literals,
or a list of expression pairs for map literals.
CompositeLit = LiteralType "{" [ ( ExpressionList | ExprPairList ) [ "," ] ] "}" .
LiteralType = TypeName | ArrayType | MapType | StructType .
ExprPairList = ExprPair { "," ExprPair } .
ExprPair = Expression ":" Expression .
If LiteralType is a TypeName, the denoted type must be an array, map, or
structure. The types of the expressions must match the respective key, element,
and field types of the literal type; there is no automatic type conversion.
Composite literals are values of the type specified by LiteralType; that is,
a new value is created every time the literal is evaluated. To get
a pointer to the literal, the address operator "&" must be used.
Implementation restriction: Currently, map literals are pointers to maps.
Given
type Rat struct { num, den int };
type Num struct { r Rat; f float; s string };
one can write
pi := Num{Rat{22, 7}, 3.14159, "pi"};
Array literals are always fixed arrays: If no array length is specified in
LiteralType, the array length is the number of elements provided in the composite
literal. Otherwise the array length is the length specified in LiteralType.
In the latter case, fewer elements than the array length may be provided in the
literal, and the missing elements are set to the appropriate zero value for
the array element type. It is an error to provide more elements than specified
in LiteralType.
buffer := [10]string{}; // len(buffer) == 10
primes := [6]int{2, 3, 5, 7, 11, 13}; // len(primes) == 6
weekenddays := &[]string{"sat", "sun"}; // len(weekenddays) == 2
Map literals are similar except the elements of the expression list are
key-value pairs separated by a colon:
m := &map[string]int{"good": 0, "bad": 1, "indifferent": 7};
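The literals above translate into present-day Go roughly as follows. This is a sketch under stated assumptions: the draft's "float" is spelled float64, an array whose length is taken from its elements is written with [...], and map literals are plain values rather than pointers. The function approxPi is hypothetical.

```go
package main

import "fmt"

type Rat struct{ num, den int }

type Num struct {
	r Rat
	f float64 // the draft's "float" type is spelled float64 here
	s string
}

// approxPi builds the nested struct literal from the text.
func approxPi() Num {
	return Num{Rat{22, 7}, 3.14159, "pi"}
}

func main() {
	pi := approxPi()
	buffer := [10]string{} // missing elements are zero values
	// The length of this array is the number of elements provided.
	weekenddays := &[...]string{"sat", "sun"}
	m := map[string]int{"good": 0, "bad": 1, "indifferent": 7}

	fmt.Println(pi.r.num, pi.s, len(buffer), len(weekenddays), len(m))
}
```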
TODO: Consider adding helper syntax for nested composites
(avoids repeating types but complicates the spec needlessly.)
Function Literals
----
A function literal represents an anonymous function. It consists of a
specification of the function type and the function body. The parameter
and result types of the function type must all be complete types (§Types).
FunctionLit = "func" FunctionType Block .
Block = "{" [ StatementList ] "}" .
The type of a function literal is a pointer to the function type.
func (a, b int, z float) bool { return a*b < int(z); }
A function literal can be assigned to a variable of the
corresponding function pointer type, or invoked directly.
f := func(x, y int) int { return x + y; }
func(ch *chan int) { ch <- ACK; } (reply_chan)
Implementation restriction: A function literal can reference only
its parameters, global variables, and variables declared within the
function literal.
Primary expressions
----
PrimaryExpr =
Operand |
PrimaryExpr Selector |
PrimaryExpr Index |
PrimaryExpr Slice |
PrimaryExpr TypeGuard |
PrimaryExpr Call .
Selector = "." identifier .
Index = "[" Expression "]" .
Slice = "[" Expression ":" Expression "]" .
TypeGuard = "." "(" Type ")" .
Call = "(" [ ExpressionList ] ")" .
x
2
(s + ".txt")
f(3.1415, true)
Point(1, 2)
new([]int, 100)
m["foo"]
s[i : j + 1]
obj.color
Math.sin
f.p[i].x()
Selectors
----
A primary expression of the form
x.f
denotes the field or method f of the value denoted by x (or of *x if
x is of pointer type). The identifier f is called the (field or method)
``selector''.
A selector f may denote a field f declared in a type T, or it may refer
to a field f declared in a nested anonymous field of T. Analogously,
f may denote a method f of T, or it may refer to a method f of the type
of a nested anonymous field of T. The number of anonymous fields traversed
to get to the field or method is called its ``depth'' in T.
More precisely, the depth of a field or method f declared in T is zero.
The depth of a field or method f declared anywhere inside
an anonymous field A declared in T is the depth of f in A plus one.
The following rules apply to selectors:
1) For a value x of type T or *T where T is not an interface type,
x.f denotes the field or method at the shallowest depth in T where there
is such an f. The type of x.f is the type of the field or method f.
If there is not exactly one f with shallowest depth, the selector
expression is illegal.
2) For a variable x of type I or *I where I is an interface type,
x.f denotes the actual method with name f of the value assigned
to x if there is such a method. The type of x.f is the type
of the method f. If no value or nil was assigned to x, x.f is illegal.
3) In all other cases, x.f is illegal.
Thus, selectors automatically dereference pointers as necessary. For instance,
for an x of type *T where T declares an f, x.f is a shortcut for (*x).f.
Furthermore, for an x of type T containing an anonymous field A declared as *A
inside T, and where A contains a field f, x.f is a shortcut for (*x.A).f
(assuming that the selector is legal in the first place).
The following examples illustrate selector use in more detail. Given the
declarations:
type T0 struct {
x int;
}
func (recv *T0) M0()
type T1 struct {
y int;
}
func (recv T1) M1()
type T2 struct {
z int;
T1;
*T0;
}
func (recv *T2) M2()
var p *T2; // with p != nil and p.T0 != nil
one can write:
p.z // (*p).z
p.y // ((*p).T1).y
p.x // (*(*p).T0).x
p.M2 // (*p).M2
p.M1 // ((*p).T1).M1
p.M0 // ((*p).T0).M0
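The selector examples can be run as the following sketch in present-day Go. The method bodies and int results are additions for illustration (the declarations above have neither), and the constructor newT2 is hypothetical.

```go
package main

import "fmt"

type T0 struct{ x int }

func (recv *T0) M0() int { return recv.x }

type T1 struct{ y int }

func (recv T1) M1() int { return recv.y }

type T2 struct {
	z int
	T1
	*T0
}

func (recv *T2) M2() int { return recv.z }

// newT2 returns a *T2 whose anonymous *T0 field is non-nil.
func newT2() *T2 {
	return &T2{z: 1, T1: T1{y: 2}, T0: &T0{x: 3}}
}

func main() {
	p := newT2()
	// Each shorthand selector equals its fully written-out form.
	fmt.Println(p.z == (*p).z, p.y == (*p).T1.y, p.x == (*(*p).T0).x)
	fmt.Println(p.M2(), p.M1(), p.M0())
}
```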
TODO: Specify what happens to receivers.
Indexes
----
A primary expression of the form
a[x]
denotes the element of the array or map a selected by x. The value x is called the
``array index'' or ``map key'', respectively. The following
rules apply:
For a of type A or *A where A is an array type (§Array types):
- x must be an integer value and 0 <= x < len(a)
- a[x] is the array element at index x and the type of a[x]
is the element type of A
For a of type *M, where M is a map type (§Map types):
- x must be of the same type as the key type of M
and the map must contain an entry with key x
- a[x] is the map value with key x and the type of a[x]
is the value type of M
Otherwise a[x] is illegal.
TODO: Need to expand map rules for assignments of the form v, ok = m[k].
Slices
----
Strings and arrays can be ``sliced'' to construct substrings or subarrays.
The index expressions in the slice select which elements appear in the
result. The result has indexes starting at 0 and length equal to the difference
in the index values in the slice. After
a := []int{1, 2, 3, 4}
slice := a[1:3]
The array ``slice'' has length two and elements
slice[0] == 2
slice[1] == 3
The index values in the slice must be in bounds for the original
array (or string) and the slice length must be non-negative.
Slices are new arrays (or strings) storing copies of the elements, so
changes to the elements of the slice do not affect the original.
In the example, a subsequent assignment to element 0,
slice[0] = 5
would have no effect on ``a''.
Type guards
----
For an expression "x" and a type "T", the primary expression
x.(T)
asserts that the value stored in "x" is an element of type "T" (§Types).
The notation ".(T)" is called a ``type guard'', and "x.(T)" is called
a ``guarded expression''. The type of "x" must be an interface type.
More precisely, if "T" is not an interface type, the expression asserts
that the dynamic type of "x" is identical to the type "T" (§Types).
If "T" is an interface type, the expression asserts that the dynamic type
of "x" implements the interface "T" (§Interface types). Because it can be
verified statically, a type guard in which the static type of "x" implements
the interface "T" is illegal. The type guard is said to succeed if the
assertion holds.
If the type guard succeeds, the value of the guarded expression is the value
stored in "x" and its type is "T". If the type guard fails, a run-time
exception occurs. In other words, even though the dynamic type of "x"
is only known at run-time, the type of the guarded expression "x.(T)" is
known to be "T" in a correct program.
As a special form, if a guarded expression is used in an assignment
v, ok = x.(T)
v, ok := x.(T)
the result of the guarded expression is a pair of values with types "(T, bool)".
If the type guard succeeds, the expression returns the pair "(x.(T), true)";
that is, the value stored in "x" (of type "T") is assigned to "v", and "ok"
is set to true. If the type guard fails, the value in "v" is set to the initial
value for the type of "v" (§Program initialization and execution), and "ok" is
set to false. No run-time exception occurs in this case.
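Both forms of the guarded expression are sketched below in present-day Go, where the construct is called a type assertion. The interface Stringer, the type Celsius, and the function classify are hypothetical names introduced for this illustration.

```go
package main

import "fmt"

// Stringer is a hypothetical one-method interface for this illustration.
type Stringer interface{ String() string }

type Celsius float64

func (c Celsius) String() string { return fmt.Sprintf("%.1f C", float64(c)) }

// classify reports what it can learn about x through guarded expressions.
func classify(x interface{}) string {
	if n, ok := x.(int); ok { // succeeds only if the dynamic type is int
		return fmt.Sprint("int ", n)
	}
	if s, ok := x.(Stringer); ok { // succeeds if the dynamic type implements Stringer
		return "Stringer " + s.String()
	}
	return "unknown"
}

func main() {
	fmt.Println(classify(42))
	fmt.Println(classify(Celsius(21.5)))
	fmt.Println(classify("text"))

	var x interface{} = "text"
	v, ok := x.(int) // fails: v is the zero value of int and ok is false
	fmt.Println(v, ok)
}
```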
TODO add examples
Calls
----
Given a function pointer, one writes
p()
to call the function.
A method is called using the notation
receiver.method()
where receiver is a value of the receiver type of the method.
For instance, given a *Point variable pt, one may call
pt.Scale(3.5)
The type of a method is the type of a function with the receiver as first
argument. For instance, the method "Scale" has type
(p *Point, factor float)
However, a function declared this way is not a method.
There is no distinct method type and there are no method literals.
Parameter passing
----
TODO expand this section (right now only "..." parameters are covered).
Inside a function, the type of the "..." parameter is the empty interface
"interface {}". The dynamic type of the parameter - that is, the type of
the value stored in the parameter - is of the form (in pseudo-
notation)
*struct {
arg(0) typeof(arg(0));
arg(1) typeof(arg(1));
arg(2) typeof(arg(2));
...
arg(n-1) typeof(arg(n-1));
}
where the "arg(i)"'s correspond to the actual arguments passed in place
of the "..." parameter (the parameter and type names are for illustration
only). Reflection code may be used to access the struct value and its fields.
Thus, arguments provided in place of a "..." parameter are wrapped into
a corresponding struct, and a pointer to the struct is passed to the
function instead of the actual arguments.
For instance, consider the function
func f(x int, s string, f_extra ...)
and the call
f(42, "foo", 3.14, true, &[]int{1, 2, 3})
Upon invocation, the arguments "3.14", "true", and "&[]int{1, 2, 3}"
are wrapped into a struct and the pointer to the struct is passed to f.
In f the type of parameter "f_extra" is "interface{}".
The dynamic type of "f_extra" is the type of the value assigned
to it upon invocation (the field names "arg0", "arg1", "arg2" are made
up for illustration only, they are not accessible via reflection):
*struct {
arg0 float;
arg1 bool;
arg2 *[3]int;
}
The values of the fields "arg0", "arg1", and "arg2" are "3.14", "true",
and "&[]int{1, 2, 3}".
As a special case, if a function passes a "..." parameter as the argument
for a "..." parameter of a function, the parameter is not wrapped again into
a struct. Instead it is passed along unchanged. For instance, the function
f may call a function g with declaration
func g(x int, g_extra ...)
as
g(x, f_extra);
Inside g, the value stored in g_extra is the same as the value stored
in f_extra.
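Passing a "..." parameter along can be sketched as follows. Note the assumptions: this is present-day Go, where the parameter is written "...interface{}", its value is a slice rather than the pointer to a struct described above, and forwarding is spelled with a trailing "...". The function names follow the text; the int results are added so the behavior can be observed.

```go
package main

import "fmt"

// g reports how many extra arguments it received.
func g(x int, gExtra ...interface{}) int {
	return len(gExtra)
}

// f forwards its extra arguments to g unchanged.
func f(x int, s string, fExtra ...interface{}) int {
	return g(x, fExtra...) // passed along without being wrapped again
}

func main() {
	fmt.Println(f(42, "foo", 3.14, true, &[...]int{1, 2, 3}))
	fmt.Println(f(42, "foo")) // no additional arguments
}
```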
Operators
----
Operators combine operands into expressions.
Expression = UnaryExpr | Expression binary_op UnaryExpr .
UnaryExpr = PrimaryExpr | unary_op UnaryExpr .
binary_op = log_op | com_op | rel_op | add_op | mul_op .
log_op = "||" | "&&" .
com_op = "<-" .
rel_op = "==" | "!=" | "<" | "<=" | ">" | ">=" .
add_op = "+" | "-" | "|" | "^" .
mul_op = "*" | "/" | "%" | "<<" | ">>" | "&" .
unary_op = "+" | "-" | "!" | "^" | "*" | "&" | "<-" .
The operand types in binary operations must be equal, with the following exceptions:
- If one operand has numeric type and the other operand is
an ideal number, the ideal number is converted to match the type of
the other operand (§Expressions).
- If both operands are ideal numbers, the conversion is to ideal floats
if one of the operands is an ideal float (relevant for "/" and "%").
- The right operand in a shift operation must always be an unsigned int
(or an ideal number that can be safely converted into an unsigned int)
(§Arithmetic operators).
Unary operators have the highest precedence. They are evaluated from
right to left. Note that "++" and "--" are outside the unary operator
hierarchy (they are statements) and they apply to the operand on the left.
Specifically, "*p++" means "(*p)++" in Go (as opposed to "*(p++)" in C).
There are six precedence levels for binary operators:
multiplication operators bind strongest, followed by addition
operators, comparison operators, communication operators,
"&&" (logical and), and finally "||" (logical or) with the
lowest precedence:
Precedence Operator
6 * / % << >> &
5 + - | ^
4 == != < <= > >=
3 <-
2 &&
1 ||
Binary operators of the same precedence associate from left to right.
For instance, "x / y / z" stands for "(x / y) / z".
Examples
+x
23 + 3*x[i]
x <= f()
^a >> b
f() || g()
x == y + 1 && <-chan_ptr > 0
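A small sketch can check a few of these rules. It is written for a current Go toolchain, which is an assumption: there "<-" is no longer a binary operator, so only the arithmetic and unary levels are exercised. The helper names are hypothetical.

```go
package main

import "fmt"

// precedenceDemo returns pairs of expressions that must be equal if the
// precedence table is read correctly.
func precedenceDemo(x, y, z int) [][2]int {
	return [][2]int{
		{x / y / z, (x / y) / z}, // same level: left to right
		{x + y*z, x + (y * z)},   // "*" binds tighter than "+"
		{^x >> 1, (^x) >> 1},     // unary operators bind tightest
		{x | y&z, x | (y & z)},   // "&" is a multiplication operator
	}
}

// incrementThroughPointer shows that "*p++" means "(*p)++".
func incrementThroughPointer() int {
	v := 41
	p := &v
	*p++ // the pointee is incremented, not the pointer
	return v
}

func main() {
	for _, pair := range precedenceDemo(100, 7, 3) {
		fmt.Println(pair[0] == pair[1])
	}
	fmt.Println(incrementThroughPointer())
}
```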
Arithmetic operators
----
Arithmetic operators apply to numeric types and yield a result of the same
type as the first operand. The four standard arithmetic operators ("+", "-",
"*", "/") apply to both integer and floating point types, while "+" also applies
to strings and arrays; all other arithmetic operators apply to integer types only.
+ sum integers, floats, strings, arrays
- difference integers, floats
* product integers, floats
/ quotient integers, floats
% remainder integers
& bitwise and integers
| bitwise or integers
^ bitwise xor integers
<< left shift integer << unsigned integer
>> right shift integer >> unsigned integer
Strings and arrays can be concatenated using the "+" operator
(or via the "+=" assignment):
s := "hi" + string(c)
a += []int{5, 6, 7}
String and array addition creates a new array or string by copying the
elements.
For integer values, "/" and "%" satisfy the following relationship:
(a / b) * b + a % b == a
and
(a / b) is "truncated towards zero".
Examples:
x y x / y x % y
5 3 1 2
-5 3 -1 -2
5 -3 -1 2
-5 -3 1 -2
Note that if the dividend is positive and the divisor is a constant power of 2,
the division may be replaced by a right shift, and computing the remainder may
be replaced by a bitwise "and" operation:
x x / 4 x % 4 x >> 2 x & 3
11 2 3 2 3
-11 -2 -3 -3 1
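The two tables above can be reproduced with a short program. This is a sketch for a current Go toolchain (an assumption); divMod is a hypothetical helper.

```go
package main

import "fmt"

// divMod returns the truncated quotient and the remainder of a / b.
func divMod(a, b int) (q, r int) {
	return a / b, a % b
}

func main() {
	for _, p := range [][2]int{{5, 3}, {-5, 3}, {5, -3}, {-5, -3}} {
		q, r := divMod(p[0], p[1])
		// The last column checks (a / b) * b + a % b == a.
		fmt.Println(p[0], p[1], q, r, q*p[1]+r == p[0])
	}
	// For a negative dividend, shift and mask differ from "/" and "%".
	x := -11
	fmt.Println(x/4, x%4, x>>2, x&3)
}
```

The second table row is why the replacement is limited to positive dividends: division truncates towards zero, while an arithmetic right shift rounds towards negative infinity.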
The shift operators shift the left operand by the shift count specified by the
right operand. They implement arithmetic shifts if the left operand is a signed
integer, and logical shifts if it is an unsigned integer. The shift count must
be an unsigned integer. There is no upper limit on the shift count. It is
as if the left operand is shifted "n" times by 1 for a shift count of "n".
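The difference between arithmetic and logical shifts, and the absence of an upper limit on the count, can be observed as follows. This is a sketch for a current Go toolchain (an assumption); the function names are hypothetical.

```go
package main

import "fmt"

// shiftRightSigned performs an arithmetic shift: the sign bit is replicated.
func shiftRightSigned(x int8, n uint) int8 { return x >> n }

// shiftRightUnsigned performs a logical shift: zeros are shifted in.
func shiftRightUnsigned(x uint8, n uint) uint8 { return x >> n }

func main() {
	fmt.Println(shiftRightSigned(-8, 1))     // -4
	fmt.Println(shiftRightUnsigned(0xF8, 1)) // 124
	// A count larger than the operand width is allowed.
	fmt.Println(shiftRightSigned(-1, 100), shiftRightUnsigned(0xFF, 100))
}
```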
The unary operators "+", "-", and "^" are defined as follows:
+x is 0 + x
-x negation is 0 - x
^x bitwise complement is -1 ^ x
Comparison operators
----
Comparison operators yield a boolean result. All comparison operators apply
to strings and numeric types. The operators "==" and "!=" also apply to
boolean values, pointer and interface types (including the value "nil").
== equal
!= not equal
< less
<= less or equal
> greater
>= greater or equal
Strings are compared byte-wise (lexically).
Pointers are equal if they point to the same value.
Interfaces are equal if both their dynamic types and values are equal.
For a value "v" of interface type, "v == nil" is true only if the predeclared
constant "nil" is assigned explicitly to "v" (§Assignments), or "v" has not
been modified since creation (§Program initialization and execution).
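The rule for comparing an interface value with "nil" can be illustrated as follows. This is a sketch for a current Go toolchain (an assumption); isNil is a hypothetical helper.

```go
package main

import "fmt"

// isNil reports whether the interface value v compares equal to nil.
func isNil(v interface{}) bool { return v == nil }

func main() {
	var v interface{}
	fmt.Println(isNil(v)) // true: v has not been modified since creation

	var p *int // a nil pointer
	v = p
	fmt.Println(isNil(v)) // false: v now has dynamic type *int

	v = nil
	fmt.Println(isNil(v)) // true: nil was assigned explicitly
}
```

An interface holding a nil pointer is not itself "nil", because its dynamic type has been set.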
TODO: Should we allow general comparison via interfaces? Problematic.
Logical operators
----
Logical operators apply to boolean operands and yield a boolean result.
The right operand is evaluated conditionally.
&& conditional and p && q is "if p then q else false"
|| conditional or p || q is "if p then true else q"
! not !p is "not p"
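Conditional evaluation of the right operand can be made visible by counting evaluations. This is a sketch for a current Go toolchain (an assumption); touch and evaluations are hypothetical helpers.

```go
package main

import "fmt"

var calls int

// touch records that it was evaluated and returns v.
func touch(v bool) bool {
	calls++
	return v
}

// evaluations returns how many operands were evaluated
// for p && q and for p || q.
func evaluations(p, q bool) (andCount, orCount int) {
	calls = 0
	_ = touch(p) && touch(q)
	andCount = calls
	calls = 0
	_ = touch(p) || touch(q)
	orCount = calls
	return andCount, orCount
}

func main() {
	fmt.Println(evaluations(false, true)) // 1 2
	fmt.Println(evaluations(true, false)) // 2 1
}
```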
Address operators
----
TODO: Need to talk about unary "*", clean up section below.
Given a function f, declared as
func f(a int) int;
taking the address of f with the expression
&f
creates a pointer to the function that may be stored in a value of type pointer
to function:
var fp *(a int) int = &f;
The function pointer may be invoked with the usual syntax; no explicit
indirection is required:
fp(7)
Methods are a form of function, and the address of a method has the type
pointer to function. Consider the type T with method M:
type T struct {
a int;
}
func (tp *T) M(a int) int;
var t *T;
To construct the address of method M, one writes
&t.M
using the variable t (not the type T). The expression is a pointer to a
function, with type
*(t *T, a int) int
and may be invoked only as a function, not a method:
var f *(t *T, a int) int;
f = &t.M;
x := f(t, 7);
Note that one does not write t.f(7); taking the address of a method demotes
it to a function.
In general, given type T with method M and variable t of type *T,
the method invocation
t.M(args)
is equivalent to the function call
(&t.M)(t, args)
If T is an interface type, the expression &t.M does not determine which
underlying type's M is called until the point of the call itself. Thus given
T1 and T2, both implementing interface I with method M, the sequence
var t1 *T1;
var t2 *T2;
var i I = t1;
m := &i.M;
m(t2);
will invoke t2.M() even though m was constructed with an expression involving
t1.
Communication operators
----
The syntax presented above covers communication operations. This
section describes their form and function.
Here the term "channel" means "variable of type *chan".
A channel is created by allocating it:
ch := new(chan int)
An optional argument to new() specifies a buffer size for an
asynchronous channel; if absent or zero, the channel is synchronous:
sync_chan := new(chan int)
buffered_chan := new(chan int, 10)
The send operation uses the binary operator "<-", which operates on
a channel and a value (expression):
ch <- 3
In this form, the send operation is an (expression) statement that
sends the value on the channel. Both the channel and the expression
are evaluated before communication begins. Communication blocks
until the send can proceed, at which point the value is transmitted
on the channel.
If the send operation appears in an expression context, the value
of the expression is a boolean and the operation is non-blocking.
The value of the boolean reports true if the communication succeeded,
false if it did not. These two examples are equivalent:
ok := ch <- 3;
if ok { print("sent") } else { print("not sent") }
if ch <- 3 { print("sent") } else { print("not sent") }
In other words, if the program tests the value of a send operation,
the send is non-blocking and the value of the expression is the
success of the operation. If the program does not test the value,
the operation blocks until it succeeds.
TODO: Adjust the above depending on how we rule on the ok semantics.
For instance, does the sent expression get evaluated if ok is false?
The receive operation uses the prefix unary operator "<-".
The value of the expression is the value received:
<-ch
The expression blocks until a value is available, which then can
be assigned to a variable or used like any other expression:
v1 := <-ch
v2 = <-ch
f(<-ch)
If the receive expression does not save the value, the value is
discarded:
<-strobe // wait until clock pulse
If a receive expression is used in a tuple assignment of the form
x, ok = <-ch; // or: x, ok := <-ch
the receive operation becomes non-blocking, and the boolean variable
"ok" will be set to "true" if the receive operation succeeded, and set
to "false" otherwise.
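The non-blocking forms can be approximated in a current Go toolchain, under a clearly stated assumption: there a send has no boolean value, and the two-value receive reports whether the channel has been closed rather than making the receive non-blocking. The sketch therefore uses a select statement with a default case. trySend and tryRecv are hypothetical helpers.

```go
package main

import "fmt"

// trySend attempts a send without blocking and reports whether it succeeded.
func trySend(ch chan int, v int) bool {
	select {
	case ch <- v:
		return true
	default:
		return false
	}
}

// tryRecv attempts a receive without blocking.
func tryRecv(ch chan int) (v int, ok bool) {
	select {
	case v = <-ch:
		return v, true
	default:
		return 0, false
	}
}

func main() {
	ch := make(chan int, 1)     // buffer of one element
	fmt.Println(trySend(ch, 3)) // true: the buffer has room
	fmt.Println(trySend(ch, 4)) // false: the buffer is full
	fmt.Println(tryRecv(ch))    // 3 true
	fmt.Println(tryRecv(ch))    // 0 false
}
```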
Constant expressions
----
A constant expression is an expression whose operands are all constants
(§Constants). Additionally, the result of the predeclared functions
below (with appropriate arguments) is also constant:
len(a) if a is a fixed array
TODO: Complete this list as needed.
Constant expressions can be evaluated at compile time.
Statements
----
Statements control execution.
Statement =
Declaration | LabelDecl | EmptyStat |
SimpleStat | GoStat | ReturnStat | BreakStat | ContinueStat | GotoStat |
FallthroughStat | Block | IfStat | SwitchStat | SelectStat | ForStat |
RangeStat .
SimpleStat =
ExpressionStat | IncDecStat | Assignment | SimpleVarDecl .
Statements in a statement list are separated by semicolons, which can be
omitted in some cases as expressed by the OptSemicolon production.
StatementList = Statement { OptSemicolon Statement } .
A semicolon may be omitted immediately following:
- a closing parenthesis ")" ending a list of declarations (§Declarations and scope rules)
- a closing brace "}" ending a type declaration (§Type declarations)
- a closing brace "}" ending a block (including switch and select statements)
- a label declaration (§Label declarations)
In all other cases a semicolon is required to separate two statements. Since there
is an empty statement, a statement list can always be ``terminated'' with a semicolon.
Empty statements
----
The empty statement does nothing.
EmptyStat = .
Expression statements
----
ExpressionStat = Expression .
f(x+y)
TODO: specify restrictions. 6g only appears to allow calls here.
IncDec statements
----
The "++" and "--" statements increment or decrement their operands
by the (ideal) constant value 1.
IncDecStat = Expression ( "++" | "--" ) .
The following assignment statements (§Assignments) are semantically
equivalent:
IncDec statement Assignment
x++ x += 1
x-- x -= 1
Both operators apply to integer and floating point types only.
Note that increment and decrement are statements, not expressions.
For instance, "x++" cannot be used as an operand in an expression.
Assignments
----
Assignment = ExpressionList assign_op ExpressionList .
assign_op = [ add_op | mul_op ] "=" .
The left-hand side must be an l-value such as a variable, pointer indirection,
or an array index.
x = 1
*p = f()
a[i] = 23
k = <-ch
As in C, arithmetic binary operators can be combined with assignments:
j <<= 2
A tuple assignment assigns the individual elements of a multi-valued operation,
such as function evaluation or some channel and map operations, into individual
variables. For instance, a tuple assignment such as
v1, v2, v3 = e1, e2, e3
assigns the expressions e1, e2, e3 to temporaries and then assigns the temporaries
to the variables v1, v2, v3. Thus
a, b = b, a
exchanges the values of a and b. The tuple assignment
x, y = f()
calls the function f, which must return two values, and assigns them to x and y.
As a special case, retrieving a value from a map, when written as a two-element
tuple assignment, assigns a value and a boolean. If the value is present in the map,
the value is assigned and the second, boolean variable is set to true. Otherwise,
the value variable is unchanged, and the boolean variable is set to false.
value, present = map_var[key]
To delete a value from a map, use a tuple assignment with the map on the left
and a false boolean expression as the second expression on the right, such
as:
map_var[key] = value, false
In assignments, the type of the expression must match the type of the left-hand side.
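Tuple assignment and the map forms can be tried in a current Go toolchain, with two assumptions stated plainly: there a failed lookup sets the value variable to the zero value instead of leaving it unchanged, and deletion uses the built-in delete rather than a tuple assignment. swap and lookup are hypothetical helpers.

```go
package main

import "fmt"

// swap exchanges its arguments with a tuple assignment.
func swap(a, b int) (int, int) {
	a, b = b, a // both right-hand values are read before either is written
	return a, b
}

// lookup returns the value stored under key and whether it was present.
func lookup(m map[string]int, key string) (int, bool) {
	value, present := m[key]
	return value, present
}

func main() {
	fmt.Println(swap(1, 2))
	m := map[string]int{"k": 7}
	fmt.Println(lookup(m, "k"))
	delete(m, "k") // replaces the draft's "map_var[key] = value, false"
	fmt.Println(lookup(m, "k"))
}
```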
If statements
----
If statements specify the conditional execution of two branches: the "if"
branch and the "else" branch. If the expression evaluates to true,
the "if" branch is executed. Otherwise the "else" branch is executed if present.
If the expression is omitted, it is equivalent to true.
IfStat = "if" [ [ SimpleStat ] ";" ] [ Expression ] Block [ "else" Statement ] .
if x > 0 {
return true;
}
An "if" statement may include the declaration of a single temporary variable.
The scope of the declared variable extends to the end of the if statement, and
the variable is initialized once before the statement is entered.
if x := f(); x < y {
return x;
} else if x > z {
return z;
} else {
return y;
}
<!--
TODO: gri thinks that Statement needs to be changed as follows:
IfStat =
"if" [ [ Simplestat ] ";" ] [ Expression ] Block
[ "else" ( IfStat | Block ) ] .
To facilitate the "if else if" code pattern, if the "else" branch is
simply another "if" statement, that "if" statement may be written
without the surrounding Block:
if x > 0 {
return 0;
} else if x > 10 {
return 1;
} else {
return 2;
}
-->
Switch statements
----
Switches provide multi-way execution.
SwitchStat = "switch" [ [ SimpleStat ] ";" ] [ Expression ] "{" { CaseClause } "}" .
CaseClause = Case [ StatementList ] .
Case = ( "case" ExpressionList | "default" ) ":" .
There can be at most one default case in a switch statement. In a case clause,
only the last statement may be a fallthrough statement (§Fallthrough statements).
It indicates that control should flow from the end of this case clause to
the first statement of the next clause.
Each case clause effectively acts as a block for scoping purposes
(§Declarations and scope rules).
The expressions do not need to be constants. They will
be evaluated top to bottom until the first successful non-default case is reached.
If none matches and there is a default case, the statements of the default
case are executed.
switch tag {
default: s3()
case 0, 1: s1()
case 2: s2()
}
A switch statement may include the declaration of a single temporary variable.
The scope of the declared variable extends to the end of the switch statement, and
the variable is initialized once before the switch is entered.
switch x := f(); true {
case x < 0: return -x
default: return x
}
Cases do not fall through unless explicitly marked with a "fallthrough" statement.
switch a {
case 1:
b();
fallthrough
case 2:
c();
}
If the expression is omitted, it is equivalent to "true".
switch {
case x < y: f1();
case x < z: f2();
case x == 4: f3();
}
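The switch forms above can be exercised with a short program. This is a sketch for a current Go toolchain (an assumption); classify and trace are hypothetical helpers that return strings in place of calling f1, f2, f3, b, and c.

```go
package main

import "fmt"

// classify uses a switch without an expression: the first true case wins.
func classify(x, y, z int) string {
	switch {
	case x < y:
		return "f1"
	case x < z:
		return "f2"
	case x == 4:
		return "f3"
	}
	return "none"
}

// trace records which case bodies run for a given tag.
func trace(a int) string {
	ran := ""
	switch a {
	case 1:
		ran += "b"
		fallthrough // control continues with the body of case 2
	case 2:
		ran += "c"
	default:
		ran += "default"
	}
	return ran
}

func main() {
	fmt.Println(classify(1, 2, 0), classify(3, 2, 5), classify(4, 2, 1))
	fmt.Println(trace(1), trace(2), trace(3))
}
```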
For statements
----
For statements are a combination of the "for" and "while" loops of C.
ForStat = "for" [ Condition | ForClause ] Block .
ForClause = [ InitStat ] ";" [ Condition ] ";" [ PostStat ] .
InitStat = SimpleStat .
Condition = Expression .
PostStat = SimpleStat .
A SimpleStat is a simple statement such as an assignment, a SimpleVarDecl,
or an increment or decrement statement. Therefore one may declare a loop
variable in the init statement.
for i := 0; i < 10; i++ {
print(i, "\n")
}
A for statement with just a condition executes until the condition becomes
false. Thus it is the same as C's while statement.
for a < b {
a *= 2
}
If the condition is absent, it is equivalent to "true".
for {
f()
}
Range statements
----
Range statements are a special control structure for iterating over
the contents of arrays, maps, and strings.
RangeStat = "range" IdentifierList ":=" RangeExpression Block .
RangeExpression = Expression .
A range expression must evaluate to an array, map or string. The identifier list must contain
either one or two identifiers. If the range expression is a map, a single identifier is declared
to range over the keys of the map; two identifiers range over the keys and corresponding
values. For arrays and strings, the behavior is analogous for integer indices (the keys) and
array elements (the values).
a := []int{1, 2, 3};
m := map[string]int{"fo",2, "foo",3, "fooo",4}
range i := a {
f(a[i]);
}
range i, v := a {
f(v);
}
range k, v := m {
assert(len(k) == v);
}
TODO: is this right?
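The same iterations can be sketched in a current Go toolchain, where (as an assumption relative to this draft) range is a clause of the for statement and map literals use "key: value" pairs. sumArray and keyLengthsMatch are hypothetical helpers.

```go
package main

import "fmt"

// sumArray iterates over the elements, ignoring the index.
func sumArray(a []int) int {
	total := 0
	for _, v := range a {
		total += v
	}
	return total
}

// keyLengthsMatch checks the map example: each value equals len(key).
func keyLengthsMatch(m map[string]int) bool {
	for k, v := range m {
		if len(k) != v {
			return false
		}
	}
	return true
}

func main() {
	a := []int{1, 2, 3}
	m := map[string]int{"fo": 2, "foo": 3, "fooo": 4}
	fmt.Println(sumArray(a), keyLengthsMatch(m))
}
```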
Go statements
----
A go statement starts the execution of a function as an independent
concurrent thread of control within the same address space. The expression
must be a function call.
GoStat = "go" Expression .
Unlike with a regular function call, program execution does not wait
for the invoked function to complete.
go Server()
go func(ch chan <- bool) { for { sleep(10); ch <- true; }} (c)
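The following sketch starts several concurrent functions and collects their results over a channel. It is written for a current Go toolchain (an assumption: channels are created with make and are not pointers); sumOfSquares is a hypothetical helper.

```go
package main

import "fmt"

// sumOfSquares starts one concurrent function per input value
// and adds up the results as they arrive.
func sumOfSquares(xs []int) int {
	results := make(chan int)
	for _, x := range xs {
		// The caller does not wait for this call to complete.
		go func(x int, out chan<- int) { out <- x * x }(x, results)
	}
	total := 0
	for range xs {
		total += <-results
	}
	return total
}

func main() {
	fmt.Println(sumOfSquares([]int{1, 2, 3}))
}
```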
Select statements
----
A select statement chooses which of a set of possible communications
will proceed. It looks similar to a switch statement but with the
cases all referring to communication operations.
SelectStat = "select" "{" { CommClause } "}" .
CommClause = CommCase [ StatementList ] .
CommCase = ( "default" | ( "case" ( SendExpr | RecvExpr) ) ) ":" .
SendExpr = Expression "<-" Expression .
RecvExpr = [ Expression ( "=" | ":=" ) ] "<-" Expression .
Each communication clause acts as a block for the purpose of scoping
(§Declarations and scope rules).
For all the send and receive expressions in the select
statement, the channel expression is evaluated. Any values
that appear on the right hand side of send expressions are also
evaluated. If any of the resulting communications can proceed, one is
chosen and the corresponding communication and statements are
evaluated. Otherwise, if there is a default case, that executes;
if not, the statement blocks until one of the communications can
complete. The channels and send expressions are not re-evaluated.
A channel pointer may be nil, which is equivalent to that case not
being present in the select statement.
Since all the channels and send expressions are evaluated, any side
effects in that evaluation will occur for all the communications
in the select.
If the channel sends or receives an interface type, its
communication can proceed only if the type of the communication
clause matches that of the dynamic value to be exchanged.
If multiple cases can proceed, a uniform fair choice is made regarding
which single communication will execute.
The receive case may declare a new variable (via a ":=" assignment). The
scope of such variables begins immediately after the variable identifier
and ends at the end of the respective "select" case (that is, before the
next "case", "default", or closing brace).
var c, c1, c2 *chan int;
var i1, i2 int;
select {
case i1 = <-c1:
print("received ", i1, " from c1\n");
case c2 <- i2:
print("sent ", i2, " to c2\n");
default:
print("no communication\n");
}
for { // send random sequence of bits to c
select {
case c <- 0: // note: no statement, no fallthrough, no folding of cases
case c <- 1:
}
}
var ca *chan interface {};
var i int;
var f float;
select {
case i = <-ca:
print("received int ", i, " from ca\n");
case f = <-ca:
print("received float ", f, " from ca\n");
}
TODO: Make semantics more precise.
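The random bit generator above can be turned into a bounded, runnable sketch. It assumes a current Go toolchain (channels created with make, no channel pointers); randomBits and the done channel are hypothetical additions so that the sending loop can be stopped.

```go
package main

import "fmt"

// randomBits collects n values, each chosen by a select
// between two sends that are both able to proceed.
func randomBits(n int) []int {
	c := make(chan int)
	done := make(chan struct{})
	go func() {
		for {
			select {
			case c <- 0: // no statement, no fallthrough
			case c <- 1:
			case <-done:
				return
			}
		}
	}()
	bits := make([]int, 0, n)
	for len(bits) < n {
		bits = append(bits, <-c)
	}
	close(done)
	return bits
}

func main() {
	fmt.Println(randomBits(16))
}
```

The sequence differs from run to run, so no particular output is shown.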
Return statements
----
A return statement terminates execution of the containing function
and optionally provides a result value or values to the caller.
ReturnStat = "return" [ ExpressionList ] .
There are two ways to return values from a function. The first is to
explicitly list the return value or values in the return statement:
func simple_f() int {
return 2;
}
A function may return multiple values.
The syntax of the return clause in that case is the same as
that of a parameter list; in particular, names must be provided for
the elements of the return value.
func complex_f1() (re float, im float) {
return -7.0, -4.0;
}
A second method to return values
is to use those names within the function as variables
to be assigned explicitly; the return statement will then provide no
values:
func complex_f2() (re float, im float) {
re = 7.0;
im = 4.0;
return;
}
Break statements
----
Within a for, switch, or select statement, a break statement terminates
execution of the innermost such statement.
BreakStat = "break" [ identifier ].
If there is an identifier, it must be a label marking an enclosing
for, switch, or select statement, and that is the one whose execution
terminates.
L: for i < n {
switch i {
case 5: break L
}
}
Continue statements
----
Within a for loop a continue statement begins the next iteration of the
loop at the post statement.
ContinueStat = "continue" [ identifier ].
The optional identifier is analogous to that of a break statement.
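Labeled break and continue can be compared side by side. This is a sketch for a current Go toolchain (an assumption); countBeforeFive and sumUntilNegative are hypothetical helpers.

```go
package main

import "fmt"

// countBeforeFive counts loop iterations until i reaches 5 or n.
func countBeforeFive(n int) int {
	count := 0
L:
	for i := 0; i < n; i++ {
		switch i {
		case 5:
			break L // leaves the loop, not just the switch
		}
		count++
	}
	return count
}

// sumUntilNegative adds the values of each row up to its first negative value.
func sumUntilNegative(rows [][]int) int {
	sum := 0
Rows:
	for _, row := range rows {
		for _, v := range row {
			if v < 0 {
				continue Rows // abandon this row, move to the next
			}
			sum += v
		}
	}
	return sum
}

func main() {
	fmt.Println(countBeforeFive(10), countBeforeFive(3))
	fmt.Println(sumUntilNegative([][]int{{1, 2}, {3, -1, 100}, {4}}))
}
```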
Label declarations
----
A label declaration serves as the target of a goto, break or continue statement.
LabelDecl = identifier ":" .
Example:
Error:
Goto statements
----
A goto statement transfers control to the corresponding label declaration (§Label declarations).
GotoStat = "goto" identifier .
goto Error
Executing the goto statement must not cause any variables to come into
scope that were not already in scope at the point of the goto. For
instance, this example:
goto L; // BAD
v := 3;
L:
is erroneous because the jump to label L skips the creation of v.
Fallthrough statements
----
A fallthrough statement transfers control to the first statement of the
next case clause in a switch statement (§Switch statements). It may only
be used in a switch statement, and only as the last statement in a case
clause of the switch statement.
FallthroughStat = "fallthrough" .
Function declarations
----
A function declaration binds an identifier to a function.
Functions contain declarations and statements. They may be
recursive. Except for forward declarations (see below), the parameter
and result types of the function type must all be complete types (§Type declarations).
FunctionDecl = "func" identifier FunctionType [ Block ] .
func min(x int, y int) int {
if x < y {
return x;
}
return y;
}
A function declaration without a block serves as a forward declaration:
func MakeNode(left, right *Node) *Node
Implementation restrictions: Functions can only be declared at the global level.
A function must be declared or forward-declared before it can be invoked.
Method declarations
----
A method declaration is a function declaration with a receiver. The receiver
is the first parameter of the method, and the receiver type must be specified
as a type name, or as a pointer to a type name. The type specified by the
type name is called the ``receiver base type''. The receiver base type must be a
type declared in the current file, and it must not be a pointer type.
The method is said to be ``bound'' to the receiver base type; specifically
it is declared within the scope of that type (§Type declarations).
MethodDecl = "func" Receiver identifier FunctionType [ Block ] .
Receiver = "(" identifier [ "*" ] TypeName ")" .
All methods bound to a receiver base type must have the same receiver type:
Either all receiver types are pointers to the base type or they are the base
type. (TODO: This restriction can be relaxed at the cost of more complicated
assignment rules to interface types).
For instance, given type Point, the declarations
func (p *Point) Length() float {
return Math.sqrt(p.x * p.x + p.y * p.y);
}
func (p *Point) Scale(factor float) {
p.x = p.x * factor;
p.y = p.y * factor;
}
bind the methods "Length" and "Scale" to the receiver base type "Point".
Method declarations may appear anywhere after the declaration of the receiver
base type and may be forward-declared.
Predeclared functions
----
cap
convert
len
new
panic
print
typeof
TODO: (gri) suggests that we should consider assert() as a built-in function.
It is like panic, but takes a boolean guard as first argument. (rsc also thinks
this is a good idea).
Length and capacity
----
The predeclared function "len()" takes a value of type string,
array or map type, or of pointer to array or map type, and
returns the length of the string in bytes, or the number of array
or map elements, respectively.
The predeclared function "cap()" takes a value of array or pointer
to array type and returns the number of elements for which there
is space allocated in the array. For an array "a", at any time the
following relationship holds:
0 <= len(a) <= cap(a)
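A short sketch shows "len()" and "cap()" on each kind of operand. It assumes a current Go toolchain, where the growable array is a slice created with make; lengths is a hypothetical helper.

```go
package main

import "fmt"

// lengths returns len for a string, an array, and a map,
// followed by len and cap for a slice.
func lengths() (strLen, arrLen, mapLen, sliceLen, sliceCap int) {
	s := "héllo" // 6 bytes: "é" occupies two bytes in UTF-8
	a := [3]int{1, 2, 3}
	m := map[string]int{"a": 1, "b": 2}
	b := make([]byte, 0, 1024)
	return len(s), len(a), len(m), len(b), cap(b)
}

func main() {
	fmt.Println(lengths())
}
```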
TODO(gri) Change this and the following sections to use a table indexed
by functions and parameter types instead of lots of prose.
Conversions
----
Conversions syntactically look like function calls of the form
T(value)
where "T" is the type name of an arithmetic type or string (§Basic types),
and "value" is the value of an expression which can be converted to a value
of result type "T".
The following conversion rules apply:
1) Between integer types. If the value is a signed quantity, it is
sign extended to implicit infinite precision; otherwise it is zero
extended. It is then truncated to fit in the result type size.
For example, uint32(int8(0xFF)) is 0xFFFFFFFF. The conversion always
yields a valid value; there is no signal for overflow.
2) Between integer and floating point types, or between floating point
types. To avoid overdefining the properties of the conversion, for
now it is defined as a ``best effort'' conversion. The conversion
always succeeds but the value may be a NaN or other problematic
result. TODO: clarify?
3) Strings permit two special conversions.
3a) Converting an integer value yields a string containing the UTF-8
representation of the Unicode code point with that value.
string(0x65e5) // "\u65e5"
3b) Converting an array of uint8s yields a string whose successive
bytes are those of the array. (Recall byte is a synonym for uint8.)
string([]byte{'h', 'e', 'l', 'l', 'o'}) // "hello"
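The conversion rules can be checked with a short program. It assumes a current Go toolchain, where the constant conversion "int8(0xFF)" is rejected as an overflow, so the sketch starts from the int8 value -1, which has the same bit pattern. It also goes through rune for the integer-to-string case. widen is a hypothetical helper.

```go
package main

import "fmt"

// widen converts a signed 8-bit value to uint32; the value is
// sign extended and then truncated to 32 bits.
func widen(b int8) uint32 { return uint32(b) }

func main() {
	fmt.Printf("%#x\n", widen(-1)) // 0xffffffff

	// An integer converts to the UTF-8 encoding of that code point.
	fmt.Println(string(rune(0x65e5)) == "\u65e5")

	// A byte array converts to a string with the same bytes.
	fmt.Println(string([]byte{'h', 'e', 'l', 'l', 'o'}))
}
```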
There is no linguistic mechanism to convert between pointers
and integers. A library may be provided under restricted circumstances
to access this conversion in low-level code.
TODO: Do we allow interface/ptr conversions in this form or do they
have to be written as type guards? (§Type guards)
Allocation
----
The built-in function "new()" takes a type "T", optionally followed by a
type-specific list of expressions. It allocates memory for a variable
of type "T" and returns a pointer of type "*T" to that variable. The
memory is initialized as described in the section on initial values
(§Program initialization and execution).
new(type [, optional list of expressions])
For instance
type S struct { a int; b float }
new(S)
dynamically allocates memory for a variable of type S, initializes it
(a=0, b=0.0), and returns a value of type *S pointing to that variable.
The only defined parameters affect sizes for allocating arrays,
buffered channels, and maps.
ap := new([]int, 10); // a pointer to an open array of 10 ints
c := new(chan int, 10); // a pointer to a channel with a buffer size of 10
m := new(map[string] int, 100); // a pointer to a map with initial space for 100 elements
For arrays, a third argument may be provided to specify the array capacity:
bp := new([]byte, 0, 1024); // a pointer to an empty open array with capacity 1024
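For comparison, the same allocations can be written for a current Go toolchain. This rests on an assumption that differs from the draft: there the sized forms belong to a separate built-in, make, which returns the value itself rather than a pointer, while new takes only a type. The variable names follow the examples above.

```go
package main

import "fmt"

type S struct {
	a int
	b float64
}

func main() {
	p := new(S)                    // *S pointing at a zeroed S
	ap := make([]int, 10)          // 10 ints
	c := make(chan int, 10)        // channel with a buffer size of 10
	m := make(map[string]int, 100) // map with initial space for 100 elements
	bp := make([]byte, 0, 1024)    // empty, with capacity 1024
	fmt.Println(p.a, p.b, len(ap), cap(c), len(m), len(bp), cap(bp))
}
```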
<!--
TODO gri thinks that we should not use this notation to specify the capacity
for the following reasons: a) It precludes the future use of that argument as the length
for multi-dimensional open arrays (which we may need at some point) and b) the
effect of "new(T, l, c)" is trivially obtained via "new(T, c)[0 : l]", doesn't
require extra explanation, and leaves options open.
Finally, if there is a performance concern (the single new() may be faster
than the new() with slice), the compiler can trivially rewrite the slice version
into a faster internal call that doesn't do slicing.
-->
Packages
----
A package is a package clause, optionally followed by import declarations,
followed by a series of declarations.
Package = PackageClause { ImportDecl [ ";" ] } { Declaration [ ";" ] } .
The source text following the package clause acts like a block for scoping
purposes (§Declarations and scope rules).
Every source file identifies the package to which it belongs.
The file must begin with a package clause.
PackageClause = "package" PackageName .
package Math
A package can gain access to exported items from another package
through an import declaration:
ImportDecl = "import" Decl<ImportSpec> .
ImportSpec = [ "." | PackageName ] PackageFileName .
An import statement makes the exported contents of the named
package file accessible in this package.
In the following discussion, assume we have a package in the
file "/lib/math", called package Math, which exports functions sin
and cos.
In the general form, with an explicit package name, the import
statement declares that package name as an identifier whose
contents are the exported elements of the imported package.
For instance, after
import M "/lib/math"
the contents of the package /lib/math can be accessed by
M.cos, M.sin, etc.
In its simplest form, with no package name, the import statement
implicitly uses the imported package name itself as the local
package name. After
import "/lib/math"
the contents are accessible by Math.sin, Math.cos.
Finally, if instead of a package name the import statement uses
an explicit period, the contents of the imported package are added
to the current package. After
import . "/lib/math"
the contents are accessible by sin and cos. In this instance, it is
an error if the import introduces name conflicts.
Here is a complete example Go package that implements a concurrent prime sieve:
package main
// Send the sequence 2, 3, 4, ... to channel 'ch'.
func Generate(ch *chan <- int) {
for i := 2; ; i++ {
ch <- i // Send 'i' to channel 'ch'.
}
}
// Copy the values from channel 'in' to channel 'out',
// removing those divisible by 'prime'.
func Filter(in *<-chan int, out *chan <- int, prime int) {
for {
i := <-in; // Receive value of new variable 'i' from 'in'.
if i % prime != 0 {
out <- i // Send 'i' to channel 'out'.
}
}
}
// The prime sieve: Daisy-chain Filter processes together.
func Sieve() {
ch := new(chan int); // Create a new channel.
go Generate(ch); // Start Generate() as a subprocess.
for {
prime := <-ch;
print(prime, "\n");
ch1 := new(chan int);
go Filter(ch, ch1, prime);
ch = ch1
}
}
func main() {
Sieve()
}
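A bounded variant of the sieve can be run on a current Go toolchain. It is a sketch under stated assumptions: channels are created with make and are not pointers, and the hypothetical function firstPrimes stops after n primes instead of printing forever. The generator and filter functions stay blocked on their channels when it returns.

```go
package main

import "fmt"

// generate sends the sequence 2, 3, 4, ... to ch.
func generate(ch chan<- int) {
	for i := 2; ; i++ {
		ch <- i
	}
}

// filter copies values from in to out, removing those divisible by prime.
func filter(in <-chan int, out chan<- int, prime int) {
	for {
		i := <-in
		if i%prime != 0 {
			out <- i
		}
	}
}

// firstPrimes daisy-chains filters and returns the first n primes.
func firstPrimes(n int) []int {
	ch := make(chan int)
	go generate(ch)
	primes := make([]int, 0, n)
	for len(primes) < n {
		prime := <-ch
		primes = append(primes, prime)
		ch1 := make(chan int)
		go filter(ch, ch1, prime)
		ch = ch1
	}
	return primes
}

func main() {
	fmt.Println(firstPrimes(5)) // [2 3 5 7 11]
}
```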
Program initialization and execution
----
When memory is allocated to store a value, either through a declaration
or "new()", and no explicit initialization is provided, the memory is
given a default initialization. Each element of such a value is
set to the ``zero'' for that type: "false" for booleans, "0" for integers,
"0.0" for floats, "" for strings, and "nil" for pointers and interfaces.
This initialization is done recursively, so for instance each element of an
array of integers will be set to 0 if no other value is specified.
These two simple declarations are equivalent:
var i int;
var i int = 0;
After
type T struct { i int; f float; next *T };
t := new(T);
the following holds:
t.i == 0
t.f == 0.0
t.next == nil
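The default initialization can be verified directly. This is a sketch for a current Go toolchain (an assumption: the float type is spelled float64 there); zeroed is a hypothetical helper.

```go
package main

import "fmt"

type T struct {
	i    int
	f    float64
	next *T
}

// zeroed reports whether freshly allocated values hold only zero values.
func zeroed() bool {
	t := new(T)
	var s string
	var b bool
	var arr [4]int // initialized recursively: every element is 0
	return t.i == 0 && t.f == 0.0 && t.next == nil &&
		s == "" && !b && arr == [4]int{}
}

func main() {
	fmt.Println(zeroed())
}
```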
A package with no imports is initialized by assigning initial values to
all its global variables in declaration order and then calling any init()
functions defined in its source. Since a package may contain more
than one source file, there may be more than one init() function, but
only one per source file.
Initialization code may contain "go" statements, but the functions
they invoke do not begin execution until initialization is complete.
Therefore, all initialization code is run in a single thread of
execution.
Furthermore, an "init()" function cannot be referred to from anywhere
in a program. In particular, "init()" cannot be called explicitly, nor
can a pointer to "init" be assigned to a function variable.
If a package has imports, the imported packages are initialized
before initializing the package itself. If multiple packages import
a package P, P will be initialized only once.
The importing of packages, by construction, guarantees that there can
be no cyclic dependencies in initialization.
A complete program, possibly created by linking multiple packages,
must have one package called main, with a function
func main() { ... }
defined. The function main.main() takes no arguments and returns no
value.
Program execution begins by initializing the main package and then
invoking main.main().
When main.main() returns, the program exits.
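The order described above (global variables in declaration order, then init(), then main.main()) can be observed by recording each step. This is a sketch for a current Go toolchain (an assumption); order and record are hypothetical names.

```go
package main

import "fmt"

// order collects the names of the initialization steps as they run.
var order []string

// record appends a step name and returns a value for the initializer.
func record(step string) int {
	order = append(order, step)
	return len(order)
}

var first = record("var first")
var second = record("var second")

func init() {
	order = append(order, "init")
}

func main() {
	order = append(order, "main")
	fmt.Println(order) // [var first var second init main]
}
```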
TODO: is there a way to override the default for package main or the
default for the function name main.main?
<!--
----
----
UNUSED PARTS OF OLD DOCUMENT go_lang.txt - KEEP AROUND UNTIL NOT NEEDED ANYMORE
----
Guiding principles
----
Go is a new systems programming language intended as an alternative to C++ at
Google. Its main purpose is to provide a productive and efficient programming
environment for compiled programs such as servers and distributed systems.
The design is motivated by the following guidelines:
- very fast compilation (1MLOC/s stretch goal); instantaneous incremental compilation
- procedural
- strongly typed
- concise syntax avoiding repetition
- few, orthogonal, and general concepts
- support for threading and interprocess communication
- garbage collection
- container library written in Go
- reasonably efficient (C ballpark)
The language should be strong enough that the compiler and run time can be
written in itself.
Program structure
----
A Go program consists of a number of ``packages''.
A package is built from one or more source files, each of which consists
of a package specifier followed by import declarations followed by other
declarations. There are no statements at the top level of a file.
By convention, one package, by default called main, is the starting point for
execution. It contains a function, also called main, that is the first function
invoked by the run time system.
If a source file within the program
contains a function init(), that function will be executed
before main.main() is called.
Source files can be compiled separately (without the source
code of packages they depend on), but not independently (the compiler does
check dependencies by consulting the symbol information in compiled packages).
Modularity, identifiers and scopes
----
A package is a collection of import, constant, type, variable, and function
declarations. Each declaration associates an ``identifier'' with a program
entity (such as a type).
In particular, all identifiers in a package are either
declared explicitly within the package, arise from an import statement,
or belong to a small set of predefined identifiers (such as "int32").
A package may make explicitly declared identifiers visible to other
packages by marking them as exported; there is no ``header file''.
Imported identifiers cannot be re-exported.
Scoping is essentially the same as in C: The scope of an identifier declared
within a ``block'' extends from the declaration of the identifier (that is, the
position immediately after the identifier) to the end of the block. An identifier
shadows identifiers with the same name declared in outer scopes. Within a
block, a particular identifier must be declared at most once.
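The scoping rules above can be sketched as follows, assuming the block and short variable declaration syntax of current Go releases (this draft may differ). The function name shadow is hypothetical.

```go
package main

import "fmt"

// shadow returns the value of x as seen inside and outside an inner block.
func shadow() (inner, outer int) {
	x := 1
	{
		x := 2 // shadows the outer x until the end of this block
		inner = x
	}
	outer = x // the outer x is visible again and unchanged
	return inner, outer
}

func main() {
	fmt.Println(shadow())
}
```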
Typing, polymorphism, and object-orientation
----
Go programs are strongly typed. Certain values can also be
polymorphic. The language provides mechanisms for using such
polymorphic values in a type-safe way.
Interface types provide the mechanisms to support object-oriented
programming. Different interface types are independent of each
other and no explicit hierarchy is required (such as single or
multiple inheritance explicitly specified through respective type
declarations). Interface types only define a set of methods that a
corresponding implementation must provide. Thus interface and
implementation are strictly separated.
An interface is implemented by associating methods with types.
If a type defines all methods of an interface, it
implements that interface and thus can be used where that interface is
required. Unless used through a variable of interface type, methods
can always be statically bound (they are not ``virtual''), and incur no
run-time overhead compared to an ordinary function.
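As a rough illustration of a type implementing an interface simply by defining its methods, consider the sketch below. It uses the syntax of current Go releases, which may not match this draft, and the names Shape, Rect, and total are hypothetical.

```go
package main

import "fmt"

// Shape is a hypothetical interface: it only lists the required methods.
type Shape interface {
	Area() float64
}

// Rect never mentions Shape; it implements it by defining Area.
type Rect struct {
	W, H float64
}

// Area is a method associated with Rect.
func (r Rect) Area() float64 { return r.W * r.H }

// total uses its arguments only through the interface type.
func total(shapes []Shape) float64 {
	sum := 0.0
	for _, s := range shapes {
		sum += s.Area()
	}
	return sum
}

func main() {
	r := Rect{W: 2, H: 3}
	// Called on a concrete type, the method can be bound statically.
	fmt.Println(r.Area())
	// Called through an interface variable, the method is selected at run time.
	fmt.Println(total([]Shape{r}))
}
```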
[OLD
Interface types, building on structures with methods, provide
the mechanisms to support object-oriented programming.
Different interface types are independent of each
other and no explicit hierarchy is required (such as single or
multiple inheritance explicitly specified through respective type
declarations). Interface types only define a set of methods that a
corresponding implementation must provide. Thus interface and
implementation are strictly separated.
An interface is implemented by associating methods with
structures. If a structure implements all methods of an interface, it
implements that interface and thus can be used where that interface is
required. Unless used through a variable of interface type, methods
can always be statically bound (they are not ``virtual''), and incur no
run-time overhead compared to an ordinary function.
END]
Go has no explicit notion of classes, sub-classes, or inheritance.
These concepts are trivially modeled in Go through the use of
functions, structures, associated methods, and interfaces.
Go has no explicit notion of type parameters or templates. Instead,
containers (such as stacks, lists, etc.) are implemented through the
use of abstract operations on interface types or polymorphic values.
Pointers and garbage collection
----
Variables may be allocated automatically (when entering the scope of
the variable) or explicitly on the heap. Pointers are used to refer
to heap-allocated variables. Pointers may also be used to point to
any other variable; such a pointer is obtained by "taking the
address" of that variable. Variables are automatically reclaimed when
they are no longer accessible. There is no pointer arithmetic in Go.
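The two ways of obtaining a pointer mentioned above might look like the following sketch, written against current Go releases (the draft syntax may differ). The function name increment is hypothetical.

```go
package main

import "fmt"

// increment modifies the variable that p points to.
func increment(p *int) {
	*p = *p + 1
}

func main() {
	n := 41
	increment(&n) // &n takes the address of an ordinary variable

	h := new(int) // new allocates a zeroed int and returns a pointer to it
	increment(h)

	// No explicit deallocation: both variables are reclaimed automatically
	// once they are no longer accessible.
	fmt.Println(n, *h)
}
```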
Multithreading and channels
----
Go supports multithreaded programming directly. A function may
be invoked as a parallel thread of execution. Communication and
synchronization are provided through channels and their associated
language support.
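A minimal sketch of a function invoked as a parallel thread of execution and communicating over a channel is shown below. It assumes the go statement and channel syntax of current Go releases, which may differ from this draft; the function name sum is hypothetical.

```go
package main

import "fmt"

// sum adds up nums and sends the total on the channel result.
func sum(nums []int, result chan int) {
	total := 0
	for _, n := range nums {
		total += n
	}
	result <- total
}

func main() {
	result := make(chan int)
	// The go statement starts sum as a concurrent thread of execution.
	go sum([]int{1, 2, 3}, result)
	// Receiving blocks until the value arrives, which also synchronizes
	// main with the completion of sum.
	fmt.Println(<-result)
}
```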
Values and references
----
All objects have value semantics, but their contents may be accessed
through different pointers referring to the same object.
For example, when calling a function with an array, the array is
passed by value, possibly by making a copy. To pass a reference,
one must explicitly pass a pointer to the array. For arrays in
particular, this is different from C.
There is also a built-in string type, which represents immutable
strings of bytes.
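The difference between passing an array by value and passing a pointer to it can be sketched as follows, in the syntax of current Go releases (an assumption; the draft may differ). The function names setFirstCopy and setFirstPtr are hypothetical.

```go
package main

import "fmt"

// setFirstCopy receives a copy of the array; the caller's array is unchanged.
func setFirstCopy(a [3]int) {
	a[0] = 99
}

// setFirstPtr receives a pointer and modifies the caller's array.
func setFirstPtr(a *[3]int) {
	a[0] = 99
}

func main() {
	arr := [3]int{1, 2, 3}

	setFirstCopy(arr)
	fmt.Println(arr[0]) // still 1: only the copy was modified

	setFirstPtr(&arr)
	fmt.Println(arr[0]) // now 99: the original was modified through the pointer
}
```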
Interface of a type
----
The interface of a type is defined to be the unordered set of methods
associated with that type. Methods are defined in a later section;
they are functions bound to a type.
[OLD
It is legal to assign a pointer to a struct to a variable of
compatible interface type. It is legal to assign an interface
variable to any struct pointer variable but if the struct type is
incompatible the result will be nil.
END]
[OLD
The polymorphic "any" type
----
Given a variable of type "any", one can store any value into it by
plain assignment or implicitly, such as through a function parameter
or channel operation. Given an "any" variable v storing an underlying
value of type T, one may:
- copy v's value to another variable of type "any"
- extract the stored value by an explicit conversion operation T(v)
- copy v's value to a variable of type T
Attempts to convert/extract to an incompatible type will yield nil.
No other operations are defined (yet).
Note that type
interface {}
is a special case that can match any struct type, while type
any
can match any type at all, including basic types, arrays, etc.
TODO: details about reflection
END]
[OLD
The nil value
----
The predeclared constant
nil
represents the ``zero'' value for a pointer type or interface type.
The only operations allowed for nil are to assign it to a pointer or
interface variable and to compare it for equality or inequality with a
pointer or interface value.
	var p *int;
	if p != nil {
		print(p)
	} else {
		print("p points nowhere")
	}
By default, pointers are initialized to nil.
TODO: This needs to be revisited.
-->