| <!--{ |
| "Title": "The Go Memory Model", |
| "Subtitle": "Version of June 6, 2022", |
| "Path": "/ref/mem" |
| }--> |
| |
| <style> |
| p.rule { |
| font-style: italic; |
| } |
| </style> |
| |
| <h2 id="introduction">Introduction</h2> |
| |
| <p> |
| The Go memory model specifies the conditions under which |
| reads of a variable in one goroutine can be guaranteed to |
| observe values produced by writes to the same variable in a different goroutine. |
| </p> |
| |
| |
| <h3 id="advice">Advice</h3> |
| |
| <p> |
| Programs that modify data being simultaneously accessed by multiple goroutines |
| must serialize such access. |
| </p> |
| |
| <p> |
| To serialize access, protect the data with channel operations or other synchronization primitives |
| such as those in the <a href="/pkg/sync/"><code>sync</code></a> |
| and <a href="/pkg/sync/atomic/"><code>sync/atomic</code></a> packages. |
| </p> |
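
<p>
For example, here is a minimal sketch (not part of the memory model itself) of serializing access
to a shared counter with a <code>sync.Mutex</code>; the <code>Counter</code> type is illustrative only:
</p>

<pre>
type Counter struct {
	mu sync.Mutex
	n  int
}

// Inc serializes updates to n by holding mu for the duration of the write.
func (c *Counter) Inc() {
	c.mu.Lock()
	c.n++
	c.mu.Unlock()
}

// Value reads n under the same lock, so it observes only values
// written by completed calls to Inc.
func (c *Counter) Value() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.n
}
</pre>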
| |
| <p> |
| If you must read the rest of this document to understand the behavior of your program, |
| you are being too clever. |
| </p> |
| |
| <p> |
| Don't be clever. |
| </p> |
| |
| <h3 id="overview">Informal Overview</h3> |
| |
| <p> |
| Go approaches its memory model in much the same way as the rest of the language, |
| aiming to keep the semantics simple, understandable, and useful. |
| This section gives a general overview of the approach and should suffice for most programmers. |
| The memory model is specified more formally in the next section. |
| </p> |
| |
| <p> |
| A <em>data race</em> is defined as |
| a write to a memory location happening concurrently with another read or write to that same location, |
| unless all the accesses involved are atomic data accesses as provided by the <code>sync/atomic</code> package. |
| As noted already, programmers are strongly encouraged to use appropriate synchronization |
| to avoid data races. |
| In the absence of data races, Go programs behave as if all the goroutines |
| were multiplexed onto a single processor. |
| This property is sometimes referred to as DRF-SC: data-race-free programs |
| execute in a sequentially consistent manner. |
| </p> |
| |
| <p> |
| While programmers should write Go programs without data races, |
| there are limitations to what a Go implementation can do in response to a data race. |
| An implementation may always react to a data race by reporting the race and terminating the program. |
| Otherwise, each read of a single-word-sized or sub-word-sized memory location |
must observe a value actually written to that location (perhaps by a concurrently executing goroutine)
| and not yet overwritten. |
| These implementation constraints make Go more like Java or JavaScript, |
| in that most races have a limited number of outcomes, |
| and less like C and C++, where the meaning of any program with a race |
| is entirely undefined, and the compiler may do anything at all. |
| Go's approach aims to make errant programs more reliable and easier to debug, |
| while still insisting that races are errors and that tools can diagnose and report them. |
| </p> |
| |
| <h2 id="model">Memory Model</h2> |
| |
| <p> |
| The following formal definition of Go's memory model closely follows |
| the approach presented by Hans-J. Boehm and Sarita V. Adve in |
| “<a href="https://www.hpl.hp.com/techreports/2008/HPL-2008-56.pdf">Foundations of the C++ Concurrency Memory Model</a>”, |
| published in PLDI 2008. |
| The definition of data-race-free programs and the guarantee of sequential consistency |
| for race-free programs are equivalent to the ones in that work. |
| </p> |
| |
| <p> |
| The memory model describes the requirements on program executions, |
| which are made up of goroutine executions, |
| which in turn are made up of memory operations. |
| </p> |
| |
| <p> |
| A <i>memory operation</i> is modeled by four details: |
| </p> |
| <ul> |
| <li>its kind, indicating whether it is an ordinary data read, an ordinary data write, |
| or a <i>synchronizing operation</i> such as an atomic data access, |
| a mutex operation, or a channel operation, |
| <li>its location in the program, |
| <li>the memory location or variable being accessed, and |
| <li>the values read or written by the operation. |
| </ul> |
| <p> |
| Some memory operations are <i>read-like</i>, including read, atomic read, mutex lock, and channel receive. |
| Other memory operations are <i>write-like</i>, including write, atomic write, mutex unlock, channel send, and channel close. |
| Some, such as atomic compare-and-swap, are both read-like and write-like. |
| </p> |
| |
| <p> |
| A <i>goroutine execution</i> is modeled as a set of memory operations executed by a single goroutine. |
| </p> |
| |
| <p> |
| <b>Requirement 1</b>: |
| The memory operations in each goroutine must correspond to a correct sequential execution of that goroutine, |
| given the values read from and written to memory. |
| That execution must be consistent with the <i>sequenced before</i> relation, |
| defined as the partial order requirements set out by the <a href="/ref/spec">Go language specification</a> |
| for Go's control flow constructs as well as the <a href="/ref/spec#Order_of_evaluation">order of evaluation for expressions</a>. |
| </p> |
| |
| <p> |
| A Go <i>program execution</i> is modeled as a set of goroutine executions, |
| together with a mapping <i>W</i> that specifies the write-like operation that each read-like operation reads from. |
| (Multiple executions of the same program can have different program executions.) |
| </p> |
| |
| <p> |
| <b>Requirement 2</b>: |
| For a given program execution, the mapping <i>W</i>, when limited to synchronizing operations, |
| must be explainable by some implicit total order of the synchronizing operations |
| that is consistent with sequencing and the values read and written by those operations. |
| </p> |
| |
| <p> |
| The <i>synchronized before</i> relation is a partial order on synchronizing memory operations, |
| derived from <i>W</i>. |
| If a synchronizing read-like memory operation <i>r</i> |
| observes a synchronizing write-like memory operation <i>w</i> |
| (that is, if <i>W</i>(<i>r</i>) = <i>w</i>), |
| then <i>w</i> is synchronized before <i>r</i>. |
| Informally, the synchronized before relation is a subset of the implied total order |
| mentioned in the previous paragraph, |
| limited to the information that <i>W</i> directly observes. |
| </p> |
| |
| <p> |
| The <i>happens before</i> relation is defined as the transitive closure of the |
| union of the sequenced before and synchronized before relations. |
| </p> |
| |
| <p> |
| <b>Requirement 3</b>: |
| For an ordinary (non-synchronizing) data read <i>r</i> on a memory location <i>x</i>, |
| <i>W</i>(<i>r</i>) must be a write <i>w</i> that is <i>visible</i> to <i>r</i>, |
| where visible means that both of the following hold: |
| |
| <ol> |
| <li><i>w</i> happens before <i>r</i>. |
| <li><i>w</i> does not happen before any other write <i>w'</i> (to <i>x</i>) that happens before <i>r</i>. |
| </ol> |
| |
| <p> |
| A <i>read-write data race</i> on memory location <i>x</i> |
| consists of a read-like memory operation <i>r</i> on <i>x</i> |
| and a write-like memory operation <i>w</i> on <i>x</i>, |
| at least one of which is non-synchronizing, |
| which are unordered by happens before |
| (that is, neither <i>r</i> happens before <i>w</i> |
| nor <i>w</i> happens before <i>r</i>). |
| </p> |
| |
| <p> |
| A <i>write-write data race</i> on memory location <i>x</i> |
| consists of two write-like memory operations <i>w</i> and <i>w'</i> on <i>x</i>, |
| at least one of which is non-synchronizing, |
| which are unordered by happens before. |
| </p> |
| |
| <p> |
| Note that if there are no read-write or write-write data races on memory location <i>x</i>, |
| then any read <i>r</i> on <i>x</i> has only one possible <i>W</i>(<i>r</i>): |
| the single <i>w</i> that immediately precedes it in the happens before order. |
| </p> |
| |
| <p> |
| More generally, it can be shown that any Go program that is data-race-free, |
| meaning it has no program executions with read-write or write-write data races, |
| can only have outcomes explained by some sequentially consistent interleaving |
| of the goroutine executions. |
| (The proof is the same as Section 7 of Boehm and Adve's paper cited above.) |
| This property is called DRF-SC. |
| </p> |
| |
| <p> |
| The intent of the formal definition is to match |
| the DRF-SC guarantee provided to race-free programs |
| by other languages, including C, C++, Java, JavaScript, Rust, and Swift. |
| </p> |
| |
| <p> |
| Certain Go language operations such as goroutine creation and memory allocation |
| act as synchronization operations. |
| The effect of these operations on the synchronized-before partial order |
| is documented in the “Synchronization” section below. |
| Individual packages are responsible for providing similar documentation |
| for their own operations. |
| </p> |
| |
| <h2 id="restrictions">Implementation Restrictions for Programs Containing Data Races</h2> |
| |
| <p> |
| The preceding section gave a formal definition of data-race-free program execution. |
| This section informally describes the semantics that implementations must provide |
| for programs that do contain races. |
| </p> |
| |
| <p> |
| First, any implementation can, upon detecting a data race, |
| report the race and halt execution of the program. |
| Implementations using ThreadSanitizer |
| (accessed with “<code>go</code> <code>build</code> <code>-race</code>”) |
| do exactly this. |
| </p> |
| |
| <p> |
| Otherwise, a read <i>r</i> of a memory location <i>x</i> |
| that is not larger than a machine word must observe |
| some write <i>w</i> such that <i>r</i> does not happen before <i>w</i> |
| and there is no write <i>w'</i> such that <i>w</i> happens before <i>w'</i> |
| and <i>w'</i> happens before <i>r</i>. |
| That is, each read must observe a value written by a preceding or concurrent write. |
| </p> |
| |
| <p> |
| Additionally, observation of acausal and “out of thin air” writes is disallowed. |
| </p> |
| |
| <p> |
| Reads of memory locations larger than a single machine word |
| are encouraged but not required to meet the same semantics |
| as word-sized memory locations, |
| observing a single allowed write <i>w</i>. |
| For performance reasons, |
| implementations may instead treat larger operations |
| as a set of individual machine-word-sized operations |
| in an unspecified order. |
| This means that races on multiword data structures |
| can lead to inconsistent values not corresponding to a single write. |
| When the values depend on the consistency |
| of internal (pointer, length) or (pointer, type) pairs, |
| as can be the case for interface values, maps, |
| slices, and strings in most Go implementations, |
| such races can in turn lead to arbitrary memory corruption. |
| </p> |
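
<p>
As an illustration only (it is not part of the model), the following sketch races on an interface value;
because an interface value is stored as a (type, data) word pair in most implementations,
the racing read may observe a mismatched pair:
</p>

<pre>
var v interface{} = 0 // a (type, data) word pair in most implementations

func writer() {
	for {
		v = 0        // racy write with dynamic type int
		v = "gopher" // racy write with dynamic type string
	}
}

func reader() {
	for {
		if s, ok := v.(string); ok { // racy read of the two-word value
			_ = len(s) // with a torn pair, s may refer to memory that never held a string
		}
	}
}

func main() {
	go writer()
	reader()
}
</pre>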
| |
| <p> |
| Examples of incorrect synchronization are given in the |
| “Incorrect synchronization” section below. |
| </p> |
| |
| <p> |
| Examples of the limitations on implementations are given in the |
| “Incorrect compilation” section below. |
| </p> |
| |
| <h2 id="synchronization">Synchronization</h2> |
| |
| <h3 id="init">Initialization</h3> |
| |
| <p> |
| Program initialization runs in a single goroutine, |
| but that goroutine may create other goroutines, |
| which run concurrently. |
| </p> |
| |
| <p class="rule"> |
| If a package <code>p</code> imports package <code>q</code>, the completion of |
| <code>q</code>'s <code>init</code> functions happens before the start of any of <code>p</code>'s. |
| </p> |
| |
| <p class="rule"> |
| The completion of all <code>init</code> functions is synchronized before |
| the start of the function <code>main.main</code>. |
| </p> |
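
<p>
For example, in this sketch, package <code>p</code> imports a hypothetical package <code>q</code>
(the import path is illustrative) whose <code>init</code> function sets an exported variable <code>q.Msg</code>:
</p>

<pre>
package p

import "example/q" // q's init functions run to completion first

func init() {
	// Because the completion of q's init functions happens before the
	// start of any of p's, this read observes the value q's init wrote.
	println(q.Msg)
}
</pre>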
| |
| <h3 id="go">Goroutine creation</h3> |
| |
| <p class="rule"> |
| The <code>go</code> statement that starts a new goroutine |
| is synchronized before the start of the goroutine's execution. |
| </p> |
| |
| <p> |
| For example, in this program: |
| </p> |
| |
| <pre> |
| var a string |
| |
| func f() { |
| print(a) |
| } |
| |
| func hello() { |
| a = "hello, world" |
| go f() |
| } |
| </pre> |
| |
| <p> |
| calling <code>hello</code> will print <code>"hello, world"</code> |
| at some point in the future (perhaps after <code>hello</code> has returned). |
| </p> |
| |
| <h3 id="goexit">Goroutine destruction</h3> |
| |
| <p> |
| The exit of a goroutine is not guaranteed to be synchronized before |
| any event in the program. |
| For example, in this program: |
| </p> |
| |
| <pre> |
| var a string |
| |
| func hello() { |
| go func() { a = "hello" }() |
| print(a) |
| } |
| </pre> |
| |
| <p> |
| the assignment to <code>a</code> is not followed by |
| any synchronization event, so it is not guaranteed to be |
| observed by any other goroutine. |
| In fact, an aggressive compiler might delete the entire <code>go</code> statement. |
| </p> |
| |
| <p> |
| If the effects of a goroutine must be observed by another goroutine, |
| use a synchronization mechanism such as a lock or channel |
| communication to establish a relative ordering. |
| </p> |
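
<p>
For example, a minimal sketch of one such fix for the program above, using a channel added for illustration:
</p>

<pre>
var a string
var done = make(chan struct{})

func hello() {
	go func() {
		a = "hello"
		close(done) // synchronized before the receive below
	}()
	<-done
	print(a) // guaranteed to print "hello"
}
</pre>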
| |
| <h3 id="chan">Channel communication</h3> |
| |
| <p> |
| Channel communication is the main method of synchronization |
| between goroutines. Each send on a particular channel |
| is matched to a corresponding receive from that channel, |
| usually in a different goroutine. |
| </p> |
| |
| <p class="rule"> |
| A send on a channel is synchronized before the completion of the |
| corresponding receive from that channel. |
| </p> |
| |
| <p> |
| This program: |
| </p> |
| |
| <pre> |
| var c = make(chan int, 10) |
| var a string |
| |
| func f() { |
| a = "hello, world" |
| c <- 0 |
| } |
| |
| func main() { |
| go f() |
| <-c |
| print(a) |
| } |
| </pre> |
| |
| <p> |
| is guaranteed to print <code>"hello, world"</code>. The write to <code>a</code> |
| is sequenced before the send on <code>c</code>, which is synchronized before |
| the corresponding receive on <code>c</code> completes, which is sequenced before |
| the <code>print</code>. |
| </p> |
| |
| <p class="rule"> |
| The closing of a channel is synchronized before a receive that returns a zero value |
| because the channel is closed. |
| </p> |
| |
| <p> |
| In the previous example, replacing |
| <code>c <- 0</code> with <code>close(c)</code> |
| yields a program with the same guaranteed behavior. |
| </p> |
| |
| <p class="rule"> |
| A receive from an unbuffered channel is synchronized before the completion of |
| the corresponding send on that channel. |
| </p> |
| |
| <p> |
| This program (as above, but with the send and receive statements swapped and |
| using an unbuffered channel): |
| </p> |
| |
| <pre> |
| var c = make(chan int) |
| var a string |
| |
| func f() { |
| a = "hello, world" |
| <-c |
| } |
| |
| func main() { |
| go f() |
| c <- 0 |
| print(a) |
| } |
| </pre> |
| |
| <p> |
| is also guaranteed to print <code>"hello, world"</code>. The write to <code>a</code> |
| is sequenced before the receive on <code>c</code>, which is synchronized before |
| the corresponding send on <code>c</code> completes, which is sequenced |
| before the <code>print</code>. |
| </p> |
| |
| <p> |
| If the channel were buffered (e.g., <code>c = make(chan int, 1)</code>) |
| then the program would not be guaranteed to print |
| <code>"hello, world"</code>. (It might print the empty string, |
| crash, or do something else.) |
| </p> |
| |
| <p class="rule"> |
The <i>k</i>th receive on a channel with capacity <i>C</i> is synchronized before the completion of the <i>k</i>+<i>C</i>th send on that channel.
| </p> |
| |
| <p> |
| This rule generalizes the previous rule to buffered channels. |
| It allows a counting semaphore to be modeled by a buffered channel: |
| the number of items in the channel corresponds to the number of active uses, |
| the capacity of the channel corresponds to the maximum number of simultaneous uses, |
| sending an item acquires the semaphore, and receiving an item releases |
| the semaphore. |
| This is a common idiom for limiting concurrency. |
| </p> |
| |
| <p> |
| This program starts a goroutine for every entry in the work list, but the |
| goroutines coordinate using the <code>limit</code> channel to ensure |
| that at most three are running work functions at a time. |
| </p> |
| |
| <pre> |
| var limit = make(chan int, 3) |
| |
| func main() { |
| for _, w := range work { |
| go func(w func()) { |
| limit <- 1 |
| w() |
| <-limit |
| }(w) |
| } |
| select{} |
| } |
| </pre> |
| |
| <h3 id="locks">Locks</h3> |
| |
| <p> |
| The <code>sync</code> package implements two lock data types, |
| <code>sync.Mutex</code> and <code>sync.RWMutex</code>. |
| </p> |
| |
| <p class="rule"> |
| For any <code>sync.Mutex</code> or <code>sync.RWMutex</code> variable <code>l</code> and <i>n</i> < <i>m</i>, |
| call <i>n</i> of <code>l.Unlock()</code> is synchronized before call <i>m</i> of <code>l.Lock()</code> returns. |
| </p> |
| |
| <p> |
| This program: |
| </p> |
| |
| <pre> |
| var l sync.Mutex |
| var a string |
| |
| func f() { |
| a = "hello, world" |
| l.Unlock() |
| } |
| |
| func main() { |
| l.Lock() |
| go f() |
| l.Lock() |
| print(a) |
| } |
| </pre> |
| |
| <p> |
| is guaranteed to print <code>"hello, world"</code>. |
| The first call to <code>l.Unlock()</code> (in <code>f</code>) is synchronized |
| before the second call to <code>l.Lock()</code> (in <code>main</code>) returns, |
| which is sequenced before the <code>print</code>. |
| </p> |
| |
| <p class="rule"> |
| For any call to <code>l.RLock</code> on a <code>sync.RWMutex</code> variable <code>l</code>, |
| there is an <i>n</i> such that the <i>n</i>th call to <code>l.Unlock</code> |
| is synchronized before the return from <code>l.RLock</code>, |
| and the matching call to <code>l.RUnlock</code> is synchronized before the return from call <i>n</i>+1 to <code>l.Lock</code>. |
| </p> |
| |
| <p class="rule"> |
| A successful call to <code>l.TryLock</code> (or <code>l.TryRLock</code>) |
| is equivalent to a call to <code>l.Lock</code> (or <code>l.RLock</code>). |
| An unsuccessful call has no synchronizing effect at all. |
| As far as the memory model is concerned, |
| <code>l.TryLock</code> (or <code>l.TryRLock</code>) |
| may be considered to be able to return false |
| even when the mutex <i>l</i> is unlocked. |
| </p> |
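
<p>
For example, in this sketch (the <code>doIfIdle</code> helper is illustrative, not part of the <code>sync</code> API),
a caller must treat a false return only as “the work was not done,”
not as evidence that another goroutine holds the mutex:
</p>

<pre>
var mu sync.Mutex

// doIfIdle runs work while holding mu if the lock is immediately available.
// A false return proves nothing about mu: the memory model allows TryLock
// to return false even when mu is unlocked.
func doIfIdle(work func()) bool {
	if !mu.TryLock() {
		return false
	}
	defer mu.Unlock()
	work()
	return true
}
</pre>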
| |
| <h3 id="once">Once</h3> |
| |
| <p> |
| The <code>sync</code> package provides a safe mechanism for |
| initialization in the presence of multiple goroutines |
| through the use of the <code>Once</code> type. |
| Multiple threads can execute <code>once.Do(f)</code> for a particular <code>f</code>, |
| but only one will run <code>f()</code>, and the other calls block |
| until <code>f()</code> has returned. |
| </p> |
| |
| <p class="rule"> |
| The completion of a single call of <code>f()</code> from <code>once.Do(f)</code> |
| is synchronized before the return of any call of <code>once.Do(f)</code>. |
| </p> |
| |
| <p> |
| In this program: |
| </p> |
| |
| <pre> |
| var a string |
| var once sync.Once |
| |
| func setup() { |
| a = "hello, world" |
| } |
| |
| func doprint() { |
| once.Do(setup) |
| print(a) |
| } |
| |
| func twoprint() { |
| go doprint() |
| go doprint() |
| } |
| </pre> |
| |
| <p> |
| calling <code>twoprint</code> will call <code>setup</code> exactly |
| once. |
| The <code>setup</code> function will complete before either call |
| of <code>print</code>. |
| The result will be that <code>"hello, world"</code> will be printed |
| twice. |
| </p> |
| |
| <h3 id="atomic">Atomic Values</h3> |
| |
| <p> |
| The APIs in the <a href="/pkg/sync/atomic/"><code>sync/atomic</code></a> |
| package are collectively “atomic operations” |
| that can be used to synchronize the execution of different goroutines. |
| If the effect of an atomic operation <i>A</i> is observed by atomic operation <i>B</i>, |
| then <i>A</i> is synchronized before <i>B</i>. |
| All the atomic operations executed in a program behave as though executed |
| in some sequentially consistent order. |
| </p> |
| |
| <p> |
| The preceding definition has the same semantics as C++’s sequentially consistent atomics |
| and Java’s <code>volatile</code> variables. |
| </p> |
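
<p>
For example, this sketch (not from the specification) uses <code>sync/atomic</code> to publish a flag;
because the atomic store is synchronized before the atomic load that observes it,
the write to <code>a</code> is visible once the loop exits.
Compare the incorrect, non-atomic version in the “Incorrect synchronization” section below.
</p>

<pre>
var a string
var done uint32

func setup() {
	a = "hello, world"
	atomic.StoreUint32(&done, 1) // synchronizing write
}

func main() {
	go setup()
	for atomic.LoadUint32(&done) == 0 { // synchronizing reads
	}
	print(a) // guaranteed to print "hello, world"
}
</pre>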
| |
| <h3 id="finalizer">Finalizers</h3> |
| |
| <p> |
| The <a href="/pkg/runtime/"><code>runtime</code></a> package provides |
| a <code>SetFinalizer</code> function that adds a finalizer to be called when |
| a particular object is no longer reachable by the program. |
| A call to <code>SetFinalizer(x, f)</code> is synchronized before the finalization call <code>f(x)</code>. |
| </p> |
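
<p>
For example, in this sketch (the <code>file</code> type is illustrative),
the finalizer is guaranteed to observe the write made before <code>SetFinalizer</code> was called:
</p>

<pre>
type file struct {
	name string
}

func newFile(name string) *file {
	f := &file{name: name}
	runtime.SetFinalizer(f, func(f *file) {
		// The call to SetFinalizer above is synchronized before this
		// finalization call, so the write to f.name is visible here.
		println("finalizing", f.name)
	})
	return f
}
</pre>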
| |
| <h3 id="more">Additional Mechanisms</h3> |
| |
| <p> |
| The <code>sync</code> package provides additional synchronization abstractions, |
| including <a href="/pkg/sync/#Cond">condition variables</a>, |
| <a href="/pkg/sync/#Map">lock-free maps</a>, |
| <a href="/pkg/sync/#Pool">allocation pools</a>, |
| and |
| <a href="/pkg/sync/#WaitGroup">wait groups</a>. |
| The documentation for each of these specifies the guarantees it |
| makes concerning synchronization. |
| </p> |
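
<p>
For example, the <a href="/pkg/sync/#WaitGroup"><code>sync.WaitGroup</code></a> documentation specifies that
a call to <code>Done</code> is synchronized before the return of any <code>Wait</code> call that it unblocks;
given that guarantee, this sketch is guaranteed to print <code>"hello, world"</code>:
</p>

<pre>
var a string

func hello() {
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		a = "hello, world"
	}()
	wg.Wait() // Done is synchronized before Wait returns
	print(a)  // guaranteed to print "hello, world"
}
</pre>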
| |
| <p> |
| Other packages that provide synchronization abstractions |
| should document the guarantees they make too. |
| </p> |
| |
| |
| <h2 id="badsync">Incorrect synchronization</h2> |
| |
| <p> |
| Programs with races are incorrect and |
| can exhibit non-sequentially consistent executions. |
| In particular, note that a read <i>r</i> may observe the value written by any write <i>w</i> |
| that executes concurrently with <i>r</i>. |
| Even if this occurs, it does not imply that reads happening after <i>r</i> |
| will observe writes that happened before <i>w</i>. |
| </p> |
| |
| <p> |
| In this program: |
| </p> |
| |
| <pre> |
| var a, b int |
| |
| func f() { |
| a = 1 |
| b = 2 |
| } |
| |
| func g() { |
| print(b) |
| print(a) |
| } |
| |
| func main() { |
| go f() |
| g() |
| } |
| </pre> |
| |
| <p> |
| it can happen that <code>g</code> prints <code>2</code> and then <code>0</code>. |
| </p> |
| |
| <p> |
| This fact invalidates a few common idioms. |
| </p> |
| |
| <p> |
| Double-checked locking is an attempt to avoid the overhead of synchronization. |
| For example, the <code>twoprint</code> program might be |
| incorrectly written as: |
| </p> |
| |
| <pre> |
var a string
var done bool
var once sync.Once
| |
| func setup() { |
| a = "hello, world" |
| done = true |
| } |
| |
| func doprint() { |
| if !done { |
| once.Do(setup) |
| } |
| print(a) |
| } |
| |
| func twoprint() { |
| go doprint() |
| go doprint() |
| } |
| </pre> |
| |
| <p> |
| but there is no guarantee that, in <code>doprint</code>, observing the write to <code>done</code> |
| implies observing the write to <code>a</code>. This |
| version can (incorrectly) print an empty string |
| instead of <code>"hello, world"</code>. |
| </p> |
| |
| <p> |
| Another incorrect idiom is busy waiting for a value, as in: |
| </p> |
| |
| <pre> |
| var a string |
| var done bool |
| |
| func setup() { |
| a = "hello, world" |
| done = true |
| } |
| |
| func main() { |
| go setup() |
| for !done { |
| } |
| print(a) |
| } |
| </pre> |
| |
| <p> |
| As before, there is no guarantee that, in <code>main</code>, |
| observing the write to <code>done</code> |
| implies observing the write to <code>a</code>, so this program could |
| print an empty string too. |
| Worse, there is no guarantee that the write to <code>done</code> will ever |
| be observed by <code>main</code>, since there are no synchronization |
| events between the two threads. The loop in <code>main</code> is not |
| guaranteed to finish. |
| </p> |
| |
| <p> |
| There are subtler variants on this theme, such as this program. |
| </p> |
| |
| <pre> |
| type T struct { |
| msg string |
| } |
| |
| var g *T |
| |
| func setup() { |
| t := new(T) |
| t.msg = "hello, world" |
| g = t |
| } |
| |
| func main() { |
| go setup() |
| for g == nil { |
| } |
| print(g.msg) |
| } |
| </pre> |
| |
| <p> |
| Even if <code>main</code> observes <code>g != nil</code> and exits its loop, |
| there is no guarantee that it will observe the initialized |
| value for <code>g.msg</code>. |
| </p> |
| |
| <p> |
| In all these examples, the solution is the same: |
| use explicit synchronization. |
| </p> |
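
<p>
For example, a sketch of one such fix for the last program, again using a channel added for illustration:
</p>

<pre>
type T struct {
	msg string
}

var g *T
var ready = make(chan struct{})

func setup() {
	t := new(T)
	t.msg = "hello, world"
	g = t
	close(ready) // synchronized before the receive in main
}

func main() {
	go setup()
	<-ready      // establishes that setup's writes happen before the print
	print(g.msg) // guaranteed to print "hello, world"
}
</pre>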
| |
| <h2 id="badcompiler">Incorrect compilation</h2> |
| |
| <p> |
| The Go memory model restricts compiler optimizations as much as it does Go programs. |
| Some compiler optimizations that would be valid in single-threaded programs are not valid in all Go programs. |
| In particular, a compiler must not introduce writes that do not exist in the original program, |
| it must not allow a single read to observe multiple values, |
| and it must not allow a single write to write multiple values. |
| </p> |
| |
| <p> |
All the following examples assume that <code>*p</code> and <code>*q</code> refer to
| memory locations accessible to multiple goroutines. |
| </p> |
| |
| <p> |
| Not introducing data races into race-free programs means not moving |
| writes out of conditional statements in which they appear. |
| For example, a compiler must not invert the conditional in this program: |
| </p> |
| |
| <pre> |
| *p = 1 |
| if cond { |
| *p = 2 |
| } |
| </pre> |
| |
| <p> |
| That is, the compiler must not rewrite the program into this one: |
| </p> |
| |
| <pre> |
| *p = 2 |
| if !cond { |
| *p = 1 |
| } |
| </pre> |
| |
| <p> |
| If <code>cond</code> is false and another goroutine is reading <code>*p</code>, |
| then in the original program, the other goroutine can only observe any prior value of <code>*p</code> and <code>1</code>. |
| In the rewritten program, the other goroutine can observe <code>2</code>, which was previously impossible. |
| </p> |
| |
| <p> |
| Not introducing data races also means not assuming that loops terminate. |
| For example, a compiler must in general not move the accesses to <code>*p</code> or <code>*q</code> |
| ahead of the loop in this program: |
| </p> |
| |
| <pre> |
| n := 0 |
| for e := list; e != nil; e = e.next { |
| n++ |
| } |
| i := *p |
| *q = 1 |
| </pre> |
| |
| <p> |
| If <code>list</code> pointed to a cyclic list, |
| then the original program would never access <code>*p</code> or <code>*q</code>, |
| but the rewritten program would. |
(Moving <code>*p</code> ahead would be safe if the compiler can prove <code>*p</code> will not panic;
moving <code>*q</code> ahead would also require the compiler to prove that no other
goroutine can access <code>*q</code>.)
| </p> |
| |
| <p> |
| Not introducing data races also means not assuming that called functions |
| always return or are free of synchronization operations. |
| For example, a compiler must not move the accesses to <code>*p</code> or <code>*q</code> |
| ahead of the function call in this program |
| (at least not without direct knowledge of the precise behavior of <code>f</code>): |
| </p> |
| |
| <pre> |
| f() |
| i := *p |
| *q = 1 |
| </pre> |
| |
| <p> |
| If the call never returned, then once again the original program |
| would never access <code>*p</code> or <code>*q</code>, but the rewritten program would. |
| And if the call contained synchronizing operations, then the original program |
| could establish happens before edges preceding the accesses |
| to <code>*p</code> and <code>*q</code>, but the rewritten program would not. |
| </p> |
| |
| <p> |
| Not allowing a single read to observe multiple values means |
| not reloading local variables from shared memory. |
| For example, a compiler must not discard <code>i</code> and reload it |
| a second time from <code>*p</code> in this program: |
| </p> |
| |
| <pre> |
| i := *p |
| if i < 0 || i >= len(funcs) { |
| panic("invalid function index") |
| } |
| ... complex code ... |
| // compiler must NOT reload i = *p here |
| funcs[i]() |
| </pre> |
| |
| <p> |
| If the complex code needs many registers, a compiler for single-threaded programs |
| could discard <code>i</code> without saving a copy and then reload |
| <code>i = *p</code> just before |
| <code>funcs[i]()</code>. |
| A Go compiler must not, because the value of <code>*p</code> may have changed. |
| (Instead, the compiler could spill <code>i</code> to the stack.) |
| </p> |
| |
| <p> |
| Not allowing a single write to write multiple values also means not using |
| the memory where a local variable will be written as temporary storage before the write. |
| For example, a compiler must not use <code>*p</code> as temporary storage in this program: |
| </p> |
| |
| <pre> |
| *p = i + *p/2 |
| </pre> |
| |
| <p> |
| That is, it must not rewrite the program into this one: |
| </p> |
| |
| <pre> |
| *p /= 2 |
| *p += i |
| </pre> |
| |
| <p> |
| If <code>i</code> and <code>*p</code> start equal to 2, |
| the original code does <code>*p = 3</code>, |
| so a racing thread can read only 2 or 3 from <code>*p</code>. |
| The rewritten code does <code>*p = 1</code> and then <code>*p = 3</code>, |
| allowing a racing thread to read 1 as well. |
| </p> |
| |
| <p> |
| Note that all these optimizations are permitted in C/C++ compilers: |
| a Go compiler sharing a back end with a C/C++ compiler must take care |
| to disable optimizations that are invalid for Go. |
| </p> |
| |
| <p> |
| Note that the prohibition on introducing data races |
| does not apply if the compiler can prove that the races |
| do not affect correct execution on the target platform. |
| For example, on essentially all CPUs, it is valid to rewrite |
| </p> |
| |
| <pre> |
| n := 0 |
| for i := 0; i < m; i++ { |
| n += *shared |
| } |
| </pre> |
| |
<p>
into:
</p>
| |
| <pre> |
| n := 0 |
| local := *shared |
| for i := 0; i < m; i++ { |
| n += local |
| } |
| </pre> |
| |
| <p> |
| provided it can be proved that <code>*shared</code> will not fault on access, |
| because the potential added read will not affect any existing concurrent reads or writes. |
| On the other hand, the rewrite would not be valid in a source-to-source translator. |
| </p> |
| |
| <h2 id="conclusion">Conclusion</h2> |
| |
| <p> |
| Go programmers writing data-race-free programs can rely on |
| sequentially consistent execution of those programs, |
| just as in essentially all other modern programming languages. |
| </p> |
| |
| <p> |
| When it comes to programs with races, |
| both programmers and compilers should remember the advice: |
| don't be clever. |
| </p> |