This page is an addition to the Go Code Review Comments list. The goal of this list is to help to find concurrency-related bugs when reviewing Go code.
You may also read through this list just once to refresh your memory and to make sure you are aware of all these concurrency gotchas.
⚠️ This page is community authored and maintained. It includes information that is disputed and may be misleading or incorrect.
Insufficient synchronisation and race conditions
Load()
or Delete()
calls on a sync.Map
after Load()
is not a race condition? Testing
Scalability
sync.RWMutex
to protect very short operations?Time
time.Ticker
is stopped using defer tick.Stop()
?time.Time
using Equal()
, not ==
?time.Time
argument of time.Since()
?t.Before(u)
, the monotonic component is stripped from the argument?# RC.1. The HTTP handler function is safe to call concurrently from multiple goroutines? It's easy to overlook HTTP handlers should be thread-safe because they are usually not called explicitly from anywhere in the project code, but only from the internals of the HTTP server.
# RC.2. Is there some field or variable access not protected by a mutex where the field or the variable is a primitive or of a type that is not explicitly thread-safe (such as atomic.Value
), while this field can be updated from a concurrent goroutine? It's not safe to skip synchronising reads even to primitive variables because of non-atomic hardware writes and potential memory visibility problems.
See also a Typical Data Race: Primitive unprotected variable.
# RC.3. A method on a thread-safe type doesn't return a pointer to a protected structure? This is a subtle bug which leads to the unprotected access problem described in the previous item. Example:
type Counters struct { mu sync.Mutex vals map[Key]*Counter } func (c *Counters) Add(k Key, amount int) { c.mu.Lock() defer c.mu.Unlock() count, ok := c.vals[k] if !ok { count = &Counter{sum: 0, num: 0} c.vals[k] = count } count.sum += amount count.n += 1 } func (c *Counters) GetCounter(k Key) *Counter { c.mu.Lock() defer c.mu.Unlock() return c.vals[k] // BUG! Returns a pointer to the structure which must be protected }
One possible solution is to return a copy, not a pointer to the structure in GetCounter()
:
type Counters struct { mu sync.Mutex vals map[Key]Counter // Note that now we are storing the Counters directly, not pointers. } ... func (c *Counters) GetCounter(k Key) (count Counter, exists bool) { c.mu.Lock() defer c.mu.Unlock() return c.vals[k] }
# RC.4. If there is more than one goroutine that can update a sync.Map
, you don't call m.Store()
or m.Delete()
depending on the success of a previous m.Load()
call? In other words, the following code is racy:
var m sync.Map // Can be called concurrently from multiple goroutines func DoSomething(k Key, v Value) { existing, ok := m.Load(k) if !ok { m.Store(k, v) // RACE CONDITION - two goroutines can execute this in parallel ... some other logic, assuming the value in `k` is now `v` in the map } ... }
Such a race condition could be benign in some cases: for example, the logic between the Load()
and Store()
calls computes the value to be cached in the map, and this computation always returns the same result and doesn't have side effects.
⚠️ Potentially misleading information. “Race condition” can refer to logic errors, like this example, which can be benign. But the phrase is also commonly used to refer to violations of the memory model, which are never benign.
If the race condition is not benign, use methods sync.Map.LoadOrStore()
and LoadAndDelete()
to fix it.
# Sc.1. Is this intentional that a channel is created with zero capacity, like make(chan *Foo)
? A goroutine which sends a message to a zero-capacity channel is blocked until another goroutine receives this message. Omitting the capacity in the make()
call might be just a mistake which will limit the scalability of the code, and it‘s likely unit test won’t find such a bug.
⚠️ Misleading information. Buffered channels do not inherently increase “scalability” versus unbuffered channels. However, buffered channels can easily obscure deadlocks and other fundamental design errors that would be immediately apparent with unbuffered channels.
# Sc.2. Locking with RWMutex
incurs extra overhead compared to plain sync.Mutex
, and, furthermore, there might be some scalability issue with the current implementation of RWMutex
in Go. Unless the case is very clear (such as an RWMutex
used to synchronize many read-only operations each lasting hundreds of milliseconds or more, and the writes which require an exclusive lock seldom happen), there should be some benchmarks proving that RWMutex
indeed helps to improve the performance. A typical example where RWMutex
certainly does more harm than good is a simple protection of a variable in a struct:
type Box struct { mu sync.RWMutex // DON'T DO THIS -- use a simple Mutex instead. x int } func (b *Box) Get() int { b.mu.RLock() defer b.mu.RUnlock() return b.x } func (b *Box) Set(x int) { b.mu.Lock() defer b.mu.Unlock() b.x = x }
# Tm.1. Is time.Ticker
stopped using defer tick.Stop()
? It's a memory leak not to stop the ticker when the function which uses the ticker in a loop returns.
# Tm.2. Are time.Time
structs compared using Equal()
method, not just ==
? Quoting the documentation for time.Time
:
Note that the Go
==
operator compares not just the time instant but also the Location and the monotonic clock reading. Therefore,Time
values should not be used as map or database keys without first guaranteeing that the identical Location has been set for all values, which can be achieved through use of theUTC()
orLocal()
method, and that the monotonic clock reading has been stripped by settingt = t.Round(0)
. In general, prefert.Equal(u)
tot == u
, sincet.Equal()
uses the most accurate comparison available and correctly handles the case when only one of its arguments has a monotonic clock reading.
# Tm.3. Before calling time.Since(t)
, the monotonic component is not stripped from t
? This is a consequence of the previous item. If the monotonic component is stripped from the time.Time
structure before passing it into time.Since()
function (via calling either UTC()
, Local()
, In()
, Round()
, Truncate()
, or AddDate()
) the result of time.Since()
might be negative on very rare occasions, such as if the system time has been synced via NTP between the moment when the start time was originally obtained and the moment when time.Since()
is called. If the monotonic component is not stripped, time.Since()
will always return a positive duration.
# Tm.4. If you want to compare system times via t.Before(u)
, do you strip the monotonic component from the argument, e.g. via u.Round(0)
? This is another point related to Tm.2. Sometimes, you need to compare two time.Time
structs only by the system time stored in them, specifically. You may need this before storing one of these Time
structs on disk or sending them over the network. Imagine, for example, some sort of telemetry agent which pushes a telemetry metric together with time periodically to some remote system:
var latestSentTime time.Time func pushMetricPeriodically(ctx context.Context) { t := time.NewTicker(time.Second) defer t.Stop() for { select { case <-ctx.Done: return case <-t.C: newTime := time.Now().Round(0) // Strip monotonic component to compare system time only // Check that the new time is later to avoid messing up the telemetry if the system time // is set backwards on an NTP sync. if latestSentTime.Before(newTime) { sendOverNetwork(NewDataPoint(newTime, metric())) latestSentTime = newTime } } } }
This code would be wrong without calling Round(0)
, i. e. stripping the monotonic component.
Go Code Review Comments: a checklist for reviewing Go code, not concurrency-specific.
Go concurrency:
Concurrency, but not specific to Go: