<!--{
"Title": "A Guide to the Go Garbage Collector",
"Path": "/doc/gc-guide",
"Breadcrumb": true
}-->
<!--
NOTE: In this document and others in this directory, the convention is to
set fixed-width phrases with non-fixed-width spaces, as in
<code>hello</code> <code>world</code>.
Do not send CLs removing the interior tags from such phrases.
-->
<style>
.gc-guide-graph {
display: inline-block;
position: relative;
width: 100%;
vertical-align: top;
overflow: hidden;
}
.gc-guide-graph-controls {
display: flex;
flex-direction: row;
flex-wrap: wrap;
justify-content: space-around;
align-items: center;
width: 100%;
}
.gc-guide-graph-controls div {
display: flex;
flex-direction: row;
flex-wrap: nowrap;
align-items: center;
padding-left: 5px;
padding-right: 5px;
height: 24px;
}
.gc-guide-counter {
display: block;
overflow-x: hidden; /* Prevent automatic resizing, which makes the input jittery. */
width: 10em; /* Never contains more than 10 characters. */
}
.gc-guide-equation {
display: block;
text-align: center;
}
.gc-guide-note {
margin-left: 3em;
}
</style>
<h2 id="Introduction">Introduction</h2>
<p>
This guide is intended to aid advanced Go users in better understanding their
application costs by providing insights into the Go garbage collector.
It also provides guidance on how Go users may use these insights to improve
their applications' resource utilization.
It does not assume any knowledge of garbage collection, but does assume
familiarity with the Go programming language.
</p>
<p>
The Go language takes responsibility for arranging the storage of Go values;
in most cases, a Go developer need not care about where these values are stored,
or why, if at all.
In practice, however, these values often need to be stored in computer
<b>physical memory</b> and physical memory is a finite resource.
Because it is finite, memory must be managed carefully and recycled in order to
avoid running out of it while executing a Go program.
It's the job of a Go implementation to allocate and recycle memory as needed.
</p>
<p>
Another term for automatically recycling memory is <b>garbage collection</b>.
At a high level, a garbage <i>collector</i> (or GC, for short) is a system that
recycles memory on behalf of the application by identifying which parts of memory
are no longer needed.
The Go standard toolchain provides a runtime library that ships with every
application, and this runtime library includes a garbage collector.
</p>
<p>
Note that the existence of a garbage collector as described by this guide
is not guaranteed by the <a href="/ref/spec">Go specification</a>, only that
the underlying storage for Go values is managed by the language itself.
This omission is intentional and enables the use of radically different
memory management techniques.
</p>
<p>
Therefore, this guide is about a specific implementation of the Go programming
language and <i>may not apply to other implementations</i>.
Specifically, this guide applies to the standard toolchain (the
<code>gc</code> Go compiler and tools).
Gccgo and Gollvm both use a very similar GC implementation so many of the
same concepts apply, but details may vary.
</p>
<p>
Furthermore, this is a living document and will change over time to best
reflect the latest release of Go.
This document currently describes the garbage collector as of Go 1.19.
</p>
<h3 id="Where_Go_Values_Live">Where Go Values Live</h3>
<p>
Before we dive into the GC, let's first discuss the memory that doesn't need to
be managed by the GC.
</p>
<p>
For instance, non-pointer Go values stored in local variables will likely not be
managed by the Go GC at all, and Go will instead arrange for memory to be
allocated that's tied to the
<a href="/ref/spec#Declarations_and_scope">lexical scope</a> in
which it's created.
In general, this is more efficient than relying on the GC, because the Go
compiler is able to predetermine when that memory may be freed and emit
machine instructions that clean it up.
Typically, we refer to allocating memory for Go values this way as
"stack allocation," because the space is stored on the goroutine stack.
</p>
<p>
Go values whose memory cannot be allocated this way, because the Go compiler
cannot determine their lifetimes, are said to <i>escape to the heap</i>.
"The heap" can be thought of as a catch-all for memory allocation, for when Go
values need to be placed <i>somewhere</i>.
The act of allocating memory on the heap is typically referred to as "dynamic
memory allocation" because both the compiler and the runtime can make very few
assumptions as to how this memory is used and when it can be cleaned up.
That's where a GC comes in: it's a system that specifically identifies and
cleans up dynamic memory allocations.
</p>
<p>
There are many reasons why a Go value might need to escape to the heap.
One reason could be that its size is dynamically determined.
Consider for instance the backing array of a slice whose initial size is
determined by a variable, rather than a constant.
Note that escaping to the heap must also be transitive: if a reference to a
Go value is written into another Go value that has already been determined to
escape, that value must also escape.
</p>
<p>
Whether a Go value escapes or not is a function of the context in which it is
used and the Go compiler's escape analysis algorithm.
It would be fragile and difficult to try to enumerate precisely when values
escape: the algorithm itself is fairly sophisticated and changes between Go
releases.
For more details on how to identify which values escape and which do not, see
the section on
<a href="#Eliminating_heap_allocations">eliminating heap allocations</a>.
</p>
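<p>
To make this concrete, below is a minimal sketch (the function names are
illustrative) of values that typically stay on the stack versus values that
typically escape.
The compiler's actual decisions can be inspected with
<code>go build -gcflags=-m</code>.
</p>
<pre>
// stackAllocated's local variable can live on the goroutine stack:
// its lifetime ends when the function returns.
func stackAllocated() int {
    x := 42
    return x
}

// escapes returns a pointer to a local variable, so the value must
// outlive the call and escapes to the heap.
func escapes() *int {
    x := 42
    return &amp;x
}

// dynamicSize allocates a backing array whose size is only known at
// run time, another common reason for a heap allocation.
func dynamicSize(n int) []byte {
    return make([]byte, n)
}
</pre>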
<h3 id="Tracing_Garbage_Collection">Tracing Garbage Collection</h3>
<p>
Garbage collection may refer to many different methods of automatically
recycling memory; for example, reference counting.
In the context of this document, garbage collection refers to <b>tracing</b>
garbage collection, which identifies in-use, so-called <b>live</b>, objects by
following pointers transitively.
</p>
<p>
Let's define these terms more rigorously.
</p>
<ul>
<li>
<p>
<b>Object</b>&mdash;An object is a dynamically allocated piece of memory
that contains one or more Go values.
</p>
</li>
<li>
<p>
<b>Pointer</b>&mdash;A memory address that references any value within an
object.
This naturally includes Go values of the form <code>*T</code>, but also includes
parts of built-in Go values.
Strings, slices, channels, maps, and interface values all contain memory
addresses that the GC must trace.
</p>
</li>
</ul>
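<p>
As a sketch of that definition, every field in the illustrative type below,
except the plain integer, contains at least one memory address that the GC
must trace:
</p>
<pre>
type T struct {
    n  int         // no pointer; the GC does not need to trace it
    p  *int        // an explicit pointer
    s  string      // contains a pointer to the string's bytes
    b  []byte      // contains a pointer to the slice's backing array
    m  map[int]int // contains a pointer to the map's internals
    ch chan int    // contains a pointer to the channel's internals
    i  any         // an interface value may point to boxed data
}
</pre>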
<p>
Together, objects and pointers to other objects form the <b>object graph</b>.
To identify live memory, the GC walks the object graph starting at the
program's <b>roots</b>, pointers that identify objects that are definitely
in-use by the program.
Two examples of roots are local variables and global variables.
The process of walking the object graph is referred to as <b>scanning</b>.
You might also see the Go documentation describe an object as
<b>reachable</b>, which just means that the object can be discovered by the
scanning process.
Note also that, <a href="#Finalizers_cleanups_and_weak_pointers">with one
exception</a>, once memory becomes unreachable, it stays unreachable.
</p>
<p>
This basic algorithm is common to all tracing GCs.
Where tracing GCs differ is what they do once they discover memory is live.
Go's GC uses the mark-sweep technique, which means that in order to keep track
of its progress, the GC also <b>marks</b> the values it encounters as live.
Once tracing is complete, the GC then walks over all memory in the heap and
makes all memory that is <i>not</i> marked available for allocation.
This process is called <b>sweeping</b>.
</p>
<p>
One alternative technique you may be familiar with is to actually <i>move</i>
the objects to a new part of memory and leave behind a forwarding pointer that
is later used to update all the application's pointers.
We call a GC that moves objects in this way a <b>moving</b> GC; Go has a
<b>non-moving</b> GC.
</p>
<h2 id="The_GC_cycle">The GC cycle</h2>
<p>
Because the Go GC is a mark-sweep GC, it broadly operates in two phases: the
mark phase, and the sweep phase.
While this statement might seem tautological, it contains an important insight:
it's not possible to release memory back to be allocated until <i>all</i> memory
has been traced, because there may still be an un-scanned pointer keeping
an object alive.
As a result, the act of sweeping must be entirely separated from the act of
marking.
Furthermore, the GC may also not be active at all, when there's no GC-related
work to do.
The GC continuously rotates through these three phases of sweeping, off, and
marking in what's known as the <b>GC cycle</b>.
For the purposes of this document, consider the GC cycle starting with sweeping,
turning off, then marking.
</p>
<p>
The next few sections will focus on building intuition for the costs of the
GC to aid users in tweaking GC parameters for their own benefit.
</p>
<h3 id="Understanding_costs">Understanding costs</h3>
<p>
The GC is inherently a complex piece of software built on even more complex
systems.
It's easy to become mired in detail when trying to understand the GC and
tweak its behavior.
This section is intended to provide a framework for reasoning about the cost
of the Go GC and its tuning parameters.
</p>
<p>
To begin with, consider this model of GC cost based on three simple axioms.
</p>
<ol>
<li>
<p>
The GC involves only two resources: physical memory, and CPU time.
</p>
</li>
<li>
<p>
The GC's memory costs consist of live heap memory, new heap memory
allocated before the mark phase, and space for metadata that, even
if proportional to the previous costs, are small in comparison.
</p>
<p class="gc-guide-equation">
<i>
GC memory cost for cycle N = live heap from cycle N-1 + new heap
</i>
</p>
<p>
Live heap memory is memory that was determined to be live by the
previous GC cycle, while new heap memory is any memory allocated in the
current cycle, which may or may not be live by the end.
How much memory is live at any given point in time is a property of the
program, and not something the GC can directly control.
</p>
</li>
<li>
<p>
The GC's CPU costs are modeled as a fixed cost per cycle, and a
marginal cost that scales proportionally with the size of the live
heap.
</p>
<p class="gc-guide-equation">
<i>
GC CPU time for cycle N = Fixed CPU time cost per cycle + average CPU time cost per byte * live heap memory found in cycle N
</i>
</p>
<p>
The fixed CPU time cost per cycle includes things that happen a constant number
of times each cycle, like initializing data structures for the next GC cycle.
This cost is typically small, and is included just for completeness.
</p>
<p>
Most of the CPU cost of the GC is marking and scanning, which is captured by
the marginal cost.
The average cost of marking and scanning depends on the GC implementation, but
also on the behavior of the program.
For example, more pointers means more GC work, because at minimum the GC needs
to visit all the pointers in the program.
Structures like linked lists and trees are also more difficult for the GC to
walk in parallel, increasing the average cost per byte.
</p>
<p>
This model ignores sweeping costs, which are proportional to total heap memory,
including memory that is dead (it must be made available for allocation).
For Go's current GC implementation, sweeping is so much faster than marking and
scanning that the cost is negligible in comparison.
</p>
</li>
</ol>
<p>
This model is simple but effective: it accurately categorizes the dominant
costs of the GC.
It also tells us that the <i>total CPU cost</i> of the garbage collector depends on
the total number of GC cycles in a given time frame.
Finally, embedded in this model is a fundamental time/space trade-off for the GC.
</p>
<p>
To see why, let's explore a constrained but useful scenario: the
<b>steady state</b>.
The steady state of an application, from the GC's perspective, is defined by the
following properties:
</p>
<ul>
<li>
<p>
The rate at which the application allocates new memory (in bytes per
second) is constant.
</p>
<p>
This means that, from the GC's perspective, the application's workload
looks approximately the same over time.
For example, for a web service, this would be a constant request rate
with, on average, the same kinds of requests being made, with the average
lifetime of each request staying roughly constant.
</p>
</li>
<li>
<p>
The marginal costs of the GC are constant.
</p>
<p>
This means that statistics of the object graph, such as the distribution
of object sizes, the number of pointers, and the average depth of data
structures, remain the same from cycle to cycle.
</p>
</li>
</ul>
<p>
Let's work through an example.
Assume some application is operating in a steady-state, allocating 10 MiB/s,
while the GC can scan memory at a rate of 100 MiB/cpu-second (this is made up).
The steady state makes no assumptions about the size of the live heap,
but for simplicity, let's say this application's live heap is always 10 MiB.
Let's also assume, again, for simplicity, that the fixed GC costs are zero.
Let's play around with the GC cycle period.
</p>
<p>
Suppose each GC cycle happens after exactly 1 cpu-second.
Then, by the end of each GC cycle our example application will have allocated
10 MiB of additional memory, resulting in a 20 MiB total heap size.
And with every GC cycle, the GC will spend 0.1 cpu-seconds scanning the
10 MiB live heap, resulting in a 10% CPU overhead.
Recall that the GC only needs to walk the live heap, not the whole heap.
(Note: a constant live heap does not mean that all newly allocated memory is
dead.
It means that, after the GC runs, <i>some mix</i> of old and new heap memory
dies; the only assumption is that 10 MiB is found live each cycle.)
</p>
<p>
Now suppose each GC cycle happens less often, once every 2 cpu-seconds.
Then, our example application, in the steady state, will have a 30 MiB total
heap size on each GC cycle, since it'll allocate 20 MiB in that time.
But with every GC cycle, the GC will <i>still only need 0.1 cpu-seconds</i>
to scan the 10 MiB of live memory.
Again, we're assuming the live heap size stays the same, regardless of how
much memory is allocated.
So this means that our GC overhead just went down, from 10% to 5%, at the
cost of 50% more memory being used.
</p>
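<p>
The arithmetic above is simple enough to encode directly.
The toy program below (a sketch of the cost model only, not any runtime API)
reproduces both scenarios:
</p>
<pre>
package main

import "fmt"

func main() {
    const (
        allocRate = 10.0  // MiB of new heap allocated per cpu-second
        scanRate  = 100.0 // MiB of live heap scanned per cpu-second
        liveHeap  = 10.0  // MiB found live each cycle (steady state)
    )
    for _, period := range []float64{1, 2} { // GC cycle period in cpu-seconds
        newHeap := allocRate * period    // allocated since the last cycle
        peakHeap := liveHeap + newHeap   // total heap size at the end of the cycle
        gcCPU := liveHeap / scanRate     // cpu-seconds spent scanning per cycle
        overhead := gcCPU / period * 100 // GC CPU overhead
        fmt.Printf("period=%v cpu-sec: peak heap=%v MiB, GC overhead=%v%%\n",
            period, peakHeap, overhead)
    }
}
// Output:
// period=1 cpu-sec: peak heap=20 MiB, GC overhead=10%
// period=2 cpu-sec: peak heap=30 MiB, GC overhead=5%
</pre>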
<p>
This change in overheads is the fundamental time/space trade-off mentioned
earlier.
And <b>GC frequency</b> is at the center of this trade-off:
if we execute the GC more frequently, then we use less memory, and vice versa.
But how often does the GC actually execute?
In Go, the decision of when the GC should start is the main parameter over
which the user has control.
</p>
<h3 id="GOGC">GOGC</h3>
<p>
At a high level, GOGC determines the trade-off between GC CPU and memory.
</p>
<p>
It works by determining the target heap size after each GC cycle, a target value
for the total heap size in the next cycle.
The GC's goal is to finish a collection cycle before the total heap size
exceeds the target heap size.
Total heap size is defined as the live heap size at the end of the previous
cycle, plus any new heap memory allocated by the application since the previous
cycle.
Meanwhile, target heap memory is defined as:
</p>
<p class="gc-guide-equation">
<i>
Target heap memory = Live heap + (Live heap + GC roots) * GOGC / 100
</i>
</p>
<p>
As an example, consider a Go program with a live heap size of 8 MiB, 1 MiB
of goroutine stacks, and 1 MiB of pointers in global variables.
Then, with a GOGC value of 100, the amount of new memory that will be allocated
before the next GC runs will be 10 MiB, or 100% of the 10 MiB of work, for a
total heap footprint of 18 MiB.
With a GOGC value of 50, then it'll be 50%, or 5 MiB.
With a GOGC value of 200, it'll be 200%, or 20 MiB.
</p>
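<p>
The example above can be written out directly (a sketch of the formula only,
not a runtime API; all quantities are in MiB):
</p>
<pre>
// heapTarget applies the GOGC formula above, where the GC roots are
// approximated by goroutine stack and global variable memory.
func heapTarget(liveHeap, stacks, globals, gogc float64) float64 {
    return liveHeap + (liveHeap+stacks+globals)*gogc/100
}

// heapTarget(8, 1, 1, 100) == 18 // 10 MiB of new memory (100%)
// heapTarget(8, 1, 1, 50)  == 13 //  5 MiB of new memory  (50%)
// heapTarget(8, 1, 1, 200) == 28 // 20 MiB of new memory (200%)
</pre>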
<p class="gc-guide-note">
Note: GOGC includes the root set only as of Go 1.18.
Previously, it would only count the live heap.
Often, the amount of memory in goroutine stacks is quite small and the live
heap size dominates all other sources of GC work, but in cases where programs
had hundreds of thousands of goroutines, the GC was making poor judgements.
</p>
<p>
The heap target controls GC frequency: the bigger the target, the longer the GC
can wait to start another mark phase and vice versa.
While the precise formula is useful for making estimates, it's best to think of
GOGC in terms of its fundamental purpose: a parameter that picks a point in the
GC CPU and memory trade-off.
The key takeaway is that <b>doubling GOGC will double heap memory overheads and
roughly halve GC CPU cost</b>, and vice versa.
(To see a full explanation as to why, see the
<a href="#Additional_notes_on_GOGC">appendix</a>.)
</p>
<p class="gc-guide-note">
Note: the target heap size is just a target, and there are several reasons why
the GC cycle might not finish right at that target.
For one, a large enough heap allocation can simply exceed the target.
However, other reasons appear in GC implementations that go beyond the
<a href="#Understanding_costs">GC model</a> this guide has been using thus far.
For some more detail, see the <a href="#Latency">latency section</a>, but the
complete details may be found in the <a href="#Additional_resources">additional
resources</a>.
</p>
<p>
GOGC may be configured through either the <code>GOGC</code> environment
variable (which all Go programs recognize), or through the
<a href="https://pkg.go.dev/runtime/debug#SetGCPercent"><code>SetGCPercent</code></a>
API in the <code>runtime/debug</code> package.
</p>
<p>
Note that GOGC may also be used to turn off the GC entirely (provided the
<a href="#Memory_limit">memory limit</a> does not apply) by setting
<code>GOGC=off</code> or calling <code>SetGCPercent(-1)</code>.
Conceptually, this setting is equivalent to setting GOGC to a value of
infinity, as the amount of new memory before a GC is triggered is unbounded.
</p>
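<p>
For instance, a program can set GOGC at run time; the sketch below is
equivalent to launching the program with <code>GOGC=50</code>
(<code>configureGC</code> is an illustrative name):
</p>
<pre>
import "runtime/debug"

func configureGC() {
    // Equivalent to running the program with GOGC=50.
    // SetGCPercent returns the previous setting.
    old := debug.SetGCPercent(50)
    _ = old

    // Equivalent to GOGC=off: disables the GC entirely, unless a
    // memory limit (see the next section) is set.
    // debug.SetGCPercent(-1)
}
</pre>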
<p>
To better understand everything we've discussed so far, try out the interactive
visualization below that is built on the
<a href="#Understanding_costs">GC cost model</a> discussed earlier.
This visualization depicts the execution of some program whose non-GC work takes
10 seconds of CPU time to complete.
In the first second it performs some initialization step (growing its live heap)
before settling into a steady state.
The application allocates 200 MiB in total, with 20 MiB live at a time.
It assumes that the only relevant GC work to complete comes from the live heap,
and that (unrealistically) the application uses no additional memory.
</p>
<p>
Use the slider to adjust the value of GOGC to see how the application responds
in terms of total duration and GC overhead.
Each GC cycle ends as the new heap drops to zero.
The time taken while the new heap drops to zero is the combined time for the
mark phase of cycle N and the sweep phase of cycle N+1.
Note that this visualization (and all the visualizations in this guide) assume
the application is paused while the GC executes, so GC CPU costs are fully
represented by the time it takes for new heap memory to drop to zero.
This is only to make visualization simpler; the same intuition still applies.
The X axis shifts to always show the full CPU-time duration of the program.
Notice that additional CPU time used by the GC increases the overall duration.
</p>
<div class="gc-guide-graph" data-workload='[
{"duration": 1.0, "allocRate": 20, "scanRate": 1024, "newSurvivalRate": 1.00, "oldDeathRate": 0.00},
{"duration": 9.0, "allocRate": 20, "scanRate": 1024, "newSurvivalRate": 0.00, "oldDeathRate": 0.00}
]' data-config='{
"fixedCost": 0.04,
"otherMem": 0,
"GOGC": "graph1-gogc",
"memoryLimit": 100000
}'></div>
<div class="gc-guide-graph-controls">
<div>
GOGC
<input type="range" min="0" max="10" step="0.005" value="6.64" id="graph1-gogc">
<div class="gc-guide-counter" id="graph1-gogc-display"></div>
</div>
</div>
<p>
Notice that the GC always incurs some CPU and peak memory overhead.
As GOGC increases, CPU overhead decreases, but peak memory increases
proportionally to the live heap size.
As GOGC decreases, the peak memory requirement decreases at the expense of
additional CPU overhead.
</p>
<p class="gc-guide-note">
Note: the graph displays CPU time, not wall-clock time to complete the program.
If the program runs on 1 CPU and fully utilizes its resources, then these are
equivalent.
A real-world program likely runs on a multi-core system and does not 100%
utilize the CPUs at all times.
In these cases the wall-time impact of the GC will be lower.
</p>
<p class="gc-guide-note">
Note: the Go GC has a minimum total heap size of 4 MiB, so if the GOGC-set
target is ever below that, it gets rounded up.
The visualization reflects this detail.
</p>
<p>
Here's another example that's a little bit more dynamic and realistic.
Once again, the application takes 10 CPU-seconds to complete without the GC, but
the steady state allocation rate increases dramatically half-way through, and
the live heap size shifts around a bit in the first phase.
This example demonstrates how the steady state might look when the live heap
size is actually changing, and how a higher allocation rate leads to more
frequent GC cycles.
</p>
<div class="gc-guide-graph" data-workload='[
{"duration": 1.0, "allocRate": 20, "scanRate": 1024, "newSurvivalRate": 1.00, "oldDeathRate": 0.00},
{"duration": 1.0, "allocRate": 20, "scanRate": 1024, "newSurvivalRate": 0.00, "oldDeathRate": 0.50},
{"duration": 1.0, "allocRate": 20, "scanRate": 1024, "newSurvivalRate": 0.50, "oldDeathRate": 0.00},
{"duration": 1.0, "allocRate": 20, "scanRate": 1024, "newSurvivalRate": 0.00, "oldDeathRate": 0.50},
{"duration": 1.0, "allocRate": 20, "scanRate": 1024, "newSurvivalRate": 0.50, "oldDeathRate": 0.00},
{"duration": 5.0, "allocRate": 200, "scanRate": 1024, "newSurvivalRate": 0.02, "oldDeathRate": 1.00}
]' data-config='{
"fixedCost": 0.04,
"otherMem": 0,
"GOGC": "graph2-gogc",
"memoryLimit": 100000
}'></div>
<div class="gc-guide-graph-controls">
<div>
GOGC
<input type="range" min="0" max="10" step="0.005" value="6.64" id="graph2-gogc">
<div class="gc-guide-counter" id="graph2-gogc-display"></div>
</div>
</div>
<h3 id="Memory_limit">Memory limit</h3>
<p>
Until Go 1.19, GOGC was the sole parameter that could be used to modify the GC's
behavior.
While it works great as a way to set a trade-off, it doesn't take into account
that available memory is finite.
Consider what happens when there's a transient spike in the live heap size:
because the GC will pick a total heap size proportional to that live heap size,
GOGC must be configured for the <i>peak</i> live heap size, even if in the
usual case a higher GOGC value provides a better trade-off.
</p>
<p>
The visualization below demonstrates this transient heap spike situation.
</p>
<div class="gc-guide-graph" data-workload='[
{"duration": 1.0, "allocRate": 20, "scanRate": 1024, "newSurvivalRate": 1.00, "oldDeathRate": 0.00},
{"duration": 4.0, "allocRate": 20, "scanRate": 1024, "newSurvivalRate": 0.00, "oldDeathRate": 0.00},
{"duration": 0.5, "allocRate": 20, "scanRate": 1024, "newSurvivalRate": 1.00, "oldDeathRate": 0.00},
{"duration": 0.5, "allocRate": 20, "scanRate": 1024, "newSurvivalRate": 0.00, "oldDeathRate": 0.00},
{"duration": 0.5, "allocRate": 20, "scanRate": 1024, "newSurvivalRate": 0.00, "oldDeathRate": 0.50},
{"duration": 3.5, "allocRate": 20, "scanRate": 1024, "newSurvivalRate": 0.00, "oldDeathRate": 0.00}
]' data-config='{
"fixedCost": 0.04,
"otherMem": 0,
"GOGC": "graph3-gogc",
"memoryLimit": 100000
}'></div>
<div class="gc-guide-graph-controls">
<div>
GOGC
<input type="range" min="0" max="10" step="0.005" value="6.64" id="graph3-gogc">
<div class="gc-guide-counter" id="graph3-gogc-display"></div>
</div>
</div>
<p>
If the example workload is running in a container with a bit over 60 MiB of
memory available, then GOGC can't be increased beyond 100, even though during
the rest of the GC cycles there is spare memory that a higher GOGC value could
make use of.
Furthermore, in some applications, these transient peaks can be rare and hard to
predict, leading to occasional, unavoidable, and potentially costly
out-of-memory conditions.
</p>
<p>
That's why in the 1.19 release, Go added support for setting a runtime memory
limit.
The memory limit may be configured either via the <code>GOMEMLIMIT</code>
environment variable which all Go programs recognize, or through the
<code>SetMemoryLimit</code> function available in the <code>runtime/debug</code>
package.
</p>
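<p>
For instance, calling the function below at startup is equivalent to launching
the program with <code>GOMEMLIMIT=512MiB</code> (a sketch; the limit value and
function name are illustrative):
</p>
<pre>
import "runtime/debug"

func configureMemoryLimit() {
    // The limit is specified in bytes: 512 MiB here.
    // Like SetGCPercent, SetMemoryLimit returns the previous setting.
    debug.SetMemoryLimit(512 &lt;&lt; 20)
}
</pre>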
<p>
This memory limit sets a maximum on the <i>total amount of memory that the Go
runtime can use</i>.
The specific set of memory included is defined in terms of
<a href="https://pkg.go.dev/runtime#MemStats"><code>runtime.MemStats</code></a>
as the expression
</p>
<p>
<code>Sys</code> <code>-</code> <code>HeapReleased</code>
</p>
<p>
or equivalently in terms of the
<a href="https://pkg.go.dev/runtime/metrics"><code>runtime/metrics</code></a>
package,
</p>
<p>
<code>/memory/classes/total:bytes</code> <code>-</code> <code>/memory/classes/heap/released:bytes</code>
</p>
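<p>
For instance, a program can observe the quantity the limit applies to via the
<code>runtime/metrics</code> package (a sketch using the two metrics named
above):
</p>
<pre>
import "runtime/metrics"

// runtimeMemory reports the memory covered by the memory limit: all
// memory mapped by the Go runtime, minus released heap memory.
func runtimeMemory() uint64 {
    samples := []metrics.Sample{
        {Name: "/memory/classes/total:bytes"},
        {Name: "/memory/classes/heap/released:bytes"},
    }
    metrics.Read(samples)
    return samples[0].Value.Uint64() - samples[1].Value.Uint64()
}
</pre>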
<p>
Because the Go GC has explicit control over how much heap memory it uses, it
sets the total heap size based on this memory limit and how much other memory
the Go runtime uses.
</p>
<p>
The visualization below depicts the same single-phase steady state workload from
the GOGC section, but this time with an extra 10 MiB of overhead from the Go
runtime and with an adjustable memory limit.
Try shifting around both GOGC and the memory limit and see what happens.
</p>
<div class="gc-guide-graph" data-workload='[
{"duration": 1.0, "allocRate": 20, "scanRate": 1024, "newSurvivalRate": 1.00, "oldDeathRate": 0},
{"duration": 9.0, "allocRate": 20, "scanRate": 1024, "newSurvivalRate": 0.00, "oldDeathRate": 0}
]' data-config='{
"fixedCost": 0.04,
"otherMem": 10,
"GOGC": "graph4-gogc",
"memoryLimit": "graph4-memlimit"
}'></div>
<div class="gc-guide-graph-controls">
<div>
GOGC
<input type="range" min="0" max="10" step="0.005" value="6.64" id="graph4-gogc">
<div class="gc-guide-counter" id="graph4-gogc-display"></div>
</div>
<div>
Memory Limit
<input type="range" min="1" max="100" step="0.5" value="100" id="graph4-memlimit">
<div class="gc-guide-counter" id="graph4-memlimit-display"></div>
</div>
</div>
<p>
Notice that when the memory limit is lowered below the peak memory that's
determined by GOGC (42 MiB for a GOGC of 100), the GC runs more frequently to
keep the peak memory within the limit.
</p>
<p>
Returning to our previous example of the transient heap spike, by setting a
memory limit and turning up GOGC, we can get the best of both worlds: no memory
limit breach, and better resource economy.
Try out the interactive visualization below.
</p>
<div class="gc-guide-graph" data-workload='[
{"duration": 1.0, "allocRate": 20, "scanRate": 1024, "newSurvivalRate": 1.00, "oldDeathRate": 0.00},
{"duration": 4.0, "allocRate": 20, "scanRate": 1024, "newSurvivalRate": 0.00, "oldDeathRate": 0.00},
{"duration": 0.5, "allocRate": 20, "scanRate": 1024, "newSurvivalRate": 1.00, "oldDeathRate": 0.00},
{"duration": 0.5, "allocRate": 20, "scanRate": 1024, "newSurvivalRate": 0.00, "oldDeathRate": 0.00},
{"duration": 0.5, "allocRate": 20, "scanRate": 1024, "newSurvivalRate": 0.00, "oldDeathRate": 0.50},
{"duration": 3.5, "allocRate": 20, "scanRate": 1024, "newSurvivalRate": 0.00, "oldDeathRate": 0.00}
]' data-config='{
"fixedCost": 0.04,
"otherMem": 0,
"GOGC": "graph5-gogc",
"memoryLimit": "graph5-memlimit"
}'></div>
<div class="gc-guide-graph-controls">
<div>
GOGC
<input type="range" min="0" max="10" step="0.005" value="6.64" id="graph5-gogc">
<div class="gc-guide-counter" id="graph5-gogc-display"></div>
</div>
<div>
Memory Limit
<input type="range" min="1" max="100" step="0.5" value="100" id="graph5-memlimit">
<div class="gc-guide-counter" id="graph5-memlimit-display"></div>
</div>
</div>
<p>
Notice that with some values of GOGC and the memory limit, peak memory use
stops at whatever the memory limit is, but that the rest of the program's
execution still obeys the total heap size rule set by GOGC.
</p>
<p>
This observation leads to another interesting detail: even when GOGC is set to
off, the memory limit is still respected!
In fact, this particular configuration represents a <i>maximization of resource
economy</i> because it sets the minimum GC frequency required to maintain some
memory limit.
In this case, throughout <i>all</i> of the program's execution, the heap size
rises to meet the memory limit.
</p>
<p>
Now, while the memory limit is clearly a powerful tool, <b>the use of
a memory limit does not come without a cost</b>, and certainly doesn't
invalidate the utility of GOGC.
</p>
<p>
Consider what happens when the live heap grows large enough to bring total
memory use close to the memory limit.
In the steady state visualization above, try turning GOGC off and then slowly
lowering the memory limit further and further to see what happens.
Notice that the total time the application takes will start to grow in an
unbounded manner as the GC is constantly executing to maintain an impossible
memory limit.
</p>
<p>
This situation, where the program fails to make reasonable progress due to
constant GC cycles, is called <b>thrashing</b>.
It's particularly dangerous because it effectively stalls the program.
Even worse, it can happen for exactly the same situation we were trying to
avoid with GOGC: a large enough transient heap spike can cause a program to
stall indefinitely!
Try reducing the memory limit (around 30 MiB or lower) in the transient heap
spike visualization and notice how the worst behavior specifically starts with
the heap spike.
</p>
<p>
In many cases, an indefinite stall is worse than an out-of-memory condition,
which tends to result in a much faster failure.
</p>
<p>
For this reason, the memory limit is defined to be <b>soft</b>.
The Go runtime makes no guarantees that it will maintain this memory limit
under all circumstances; it only promises some reasonable amount of effort.
This relaxation of the memory limit is critical to avoiding thrashing behavior,
because it gives the GC a way out: let memory use surpass the limit to avoid
spending too much time in the GC.
</p>
<p>
Internally, the GC sets an upper limit on the amount of CPU time it can use
over some time window (with some hysteresis for very short transient spikes
in CPU use).
This limit is currently set at roughly 50%, with a <code>2 * GOMAXPROCS</code>
CPU-second window.
The consequence of limiting GC CPU time is that the GC's work is delayed;
meanwhile, the Go program may continue allocating new heap memory, even beyond
the memory limit.
</p>
<p>
The intuition behind the 50% GC CPU limit is based on the worst-case impact
on a program with ample available memory.
In the case of a misconfiguration of the memory limit, where it is set too
low mistakenly, the program will slow down at most by 2x, because the GC
can't take more than 50% of its CPU time away.
</p>
<p class="gc-guide-note">
Note: the visualizations on this page do not simulate the GC CPU limit.
</p>
<h4 id="Suggested_uses">Suggested uses</h4>
<p>
While the memory limit is a powerful tool, and the Go runtime takes steps to
mitigate the worst behaviors from misuse, it's still important to use it
thoughtfully.
Below is a collection of tidbits of advice about where the memory limit is most
useful and applicable, and where it might cause more harm than good.
</p>
<ul>
<li>
<p>
<b>Do</b> take advantage of the memory limit when the execution
environment of your Go program is entirely within your control, and
the Go program is the only program with access to some set of resources
(i.e. some kind of memory reservation, like a container memory limit).
</p>
<p>
A good example is the deployment of a web service into containers with
a fixed amount of available memory.
</p>
<p>
<b>In this case, a good rule of thumb is to leave an additional 5-10%
of headroom to account for memory sources the Go runtime is unaware of.
</b>
</p>
</li>
<li>
<p>
<b>Do</b> feel free to adjust the memory limit in real time to adapt to
changing conditions.
</p>
<p>
A good example is a cgo program where C libraries temporarily need to
use substantially more memory.
</p>
</li>
<li>
<p>
<b>Don't</b> set GOGC to off with a memory limit if the Go program
might share some of its limited memory with other programs, and those
programs are generally decoupled from the Go program.
Instead, keep the memory limit since it may help to curb undesirable
transient behavior, but set GOGC to some smaller, reasonable value for
the average case.
</p>
<p>
While it may be tempting to try and "reserve" memory for co-tenant
programs, unless the programs are fully synchronized (e.g. the Go
program calls some subprocess and blocks while its callee executes),
the result will be less reliable as inevitably both programs will
need more memory.
Letting the Go program use less memory when it doesn't need it will
generate a more reliable result overall.
This advice also applies to overcommit situations, where the sum of
memory limits of containers running on one machine may exceed the
actual physical memory available to the machine.
</p>
</li>
<li>
<p>
<b>Don't</b> use the memory limit when deploying to an execution
environment you don't control, especially when your program's memory
use is proportional to its inputs.
</p>
<p>
A good example is a CLI tool or a desktop application.
Baking a memory limit into the program when it's unclear what kind of
inputs it might be fed, or how much memory might be available on the
system can lead to confusing crashes and poor performance.
Plus, an advanced end-user can always set a memory limit if they wish.
</p>
</li>
<li>
<p>
<b>Don't</b> set a memory limit to avoid out-of-memory conditions when
a program is already close to its environment's memory limits.
</p>
<p>
This effectively replaces an out-of-memory risk with a risk of
severe application slowdown, which is often not a favorable trade,
even with the efforts Go makes to mitigate thrashing.
In such a case, it would be much more effective to either increase the
environment's memory limits (and <i>then</i> potentially set a memory
limit) or decrease GOGC (which provides a much cleaner trade-off than
thrashing-mitigation does).
</p>
</li>
</ul>
<h3 id="Latency">Latency</h3>
<p>
The visualizations in this document have modeled the application as paused while
the GC is executing.
GC implementations do exist that behave this way, and they're referred to as
"stop-the-world" GCs.
</p>
<p>
The Go GC, however, is not fully stop-the-world and does most of its work
concurrently with the application.
This is primarily to reduce application <i>latencies</i>, specifically the
end-to-end duration of a single unit of computation (e.g. a web request).
Thus far, this document mainly considered application <i>throughput</i> (e.g.
web requests handled per second).
Note that each example in the <a href="#The_GC_cycle">GC cycle</a> section
focused on the total CPU duration of an executing program.
However, such a duration is far less meaningful for, say, a web service.
While throughput is still important for a web service (i.e. queries per second),
often the latency of each individual request matters even more.
</p>
<p>
In terms of latency, a stop-the-world GC may require a considerable length of
time to execute both its mark and sweep phases, during which the application,
and in the context of a web service, any in-flight request, is unable to make
further progress.
Instead, the Go GC avoids making the length of any global application pauses
proportional to the size of the heap: the core tracing algorithm is performed
while the application is actively executing.
(The pauses are more strongly proportional to GOMAXPROCS algorithmically, but
most commonly are dominated by the time it takes to stop running goroutines.)
Collecting concurrently is not without cost: in practice it often leads to a
design with lower throughput than an equivalent stop-the-world garbage
collector.
However, it's important to note that <i>lower latency does not inherently mean
lower throughput</i>, and the performance of the Go garbage collector has
steadily improved over time, in both latency and throughput.
</p>
<p>
The concurrent nature of Go's current GC does not invalidate anything discussed
in this document so far: none of the statements relied on this design choice.
GC frequency is still the primary way the GC trades off between CPU
time and memory for throughput, and in fact, it also takes on this role for
latency.
This is because most of the costs for the GC are incurred while the mark phase
is active.
</p>
<p>
The key takeaway then, is that <b>reducing GC frequency may also lead to latency
improvements</b>.
This applies not only to reductions in GC frequency from modifying tuning
parameters, like increasing GOGC and/or the memory limit, but also applies to
the optimizations described in the
<a href="#Optimization_guide">optimization guide</a>.
</p>
<p>
However, latency is often more complex to understand than throughput, because it
is a product of the moment-to-moment execution of the program and not just an
aggregation of costs.
As a result, the connection between latency and GC frequency is less direct.
Below is a list of possible sources of latency for those inclined to dig
deeper.
</p>
<ol>
<li>
Brief stop-the-world pauses when the GC transitions between the mark
and sweep phases,
</li>
<li>
Scheduling delays because the GC takes 25% of CPU resources when in the
mark phase,
</li>
<li>
User goroutines assisting the GC in response to a high allocation rate,
</li>
<li>
Pointer writes requiring additional work while the GC is in the mark
phase, and
</li>
<li>
Running goroutines must be suspended for their roots to be scanned.
</li>
</ol>
<p>
These latency sources are visible in
<a href="/doc/diagnostics#execution-tracer">execution traces</a>, except for
pointer writes requiring additional work.
</p>
<h3 id="Finalizers_cleanups_and_weak_pointers">Finalizers, cleanups, and weak pointers</h2>
<p>
Garbage collection provides the illusion of infinite memory using only finite
memory.
Memory is allocated but never explicitly freed, which enables simpler APIs and
concurrent algorithms compared to bare-bones manual memory management.
(Some languages with manually managed memory use alternative approaches such
as "smart pointers" and compile-time ownership tracking to ensure that objects
are freed, but these features are deeply embedded into the API design
conventions in these languages.)
</p>
<p>
Only the live objects&mdash;those reachable from a global variable or a
computation in some goroutine&mdash;can affect the behavior of the program.
Any time after an object becomes unreachable ("dead"), it may be safely
recycled by the GC.
This allows for a wide variety of GC designs, such as the tracing design used
by Go today.
The death of an object is not an observable event at the language level.
</p>
<p>
However, Go's runtime library provides three features that break that illusion:
<a href="/pkg/runtime#AddCleanup">cleanups</a>,
<a href="/pkg/weak#Pointer">weak pointers</a>, and
<a href="/pkg/runtime#SetFinalizer">finalizers</a>.
Each of these features provides some way to observe and react to object death,
and in the case of finalizers, even reverse it.
This of course complicates Go programs and adds an additional burden to the GC
implementation.
Nonetheless, these features exist because they are useful in a variety of
circumstances, and Go programs use them and benefit from them all the time.
</p>
<p>
For the details of each feature, refer to its package documentation
(<a href="/pkg/runtime#AddCleanup">runtime.AddCleanup</a>,
<a href="/pkg/weak#Pointer">weak.Pointer</a>,
<a href="/pkg/runtime#SetFinalizer">runtime.SetFinalizer</a>).
Below is some general advice for using these features, outlines of common
issues you can run into with each feature, and advice for testing uses of these
features.
</p>
<h4 id="General_advice">General advice</h4>
<ul>
<li>
<p>
<b>
Write unit tests.
</b>
</p>
<p>
The exact timing of cleanups, weak pointers, and finalizers can be difficult
to predict, and it's easy to convince yourself that everything works, even
after many consecutive executions.
But it's also easy to make subtle mistakes.
<a href="#Testing_object_death">Writing tests</a> for them can be tricky, but
given that they're so subtle to use, testing is even more important than usual.
</p>
</li>
<li>
<p>
<b>
Avoid using these features directly in typical Go code.
</b>
</p>
<p>
These are low-level features with subtle restrictions and behaviors.
For instance, there's no guarantee cleanups or finalizers will be run
at program exit, or at all for that matter.
The long comments in their API documentation should be seen as a warning.
The vast majority of Go code does not benefit from using these features
directly, only indirectly.
</p>
</li>
<li>
<p>
<b>
Encapsulate the use of these mechanisms within a package.
</b>
</p>
<p>
Where possible, do not allow the use of these mechanisms to leak into
the public API of your package; provide interfaces that make it hard or
impossible for users to misuse them.
For example, instead of asking the user to set up a cleanup on some
C-allocated memory to free it, write a wrapper package and hide that
detail inside.
</p>
</li>
<li>
<p>
<b>
Restrict access to objects that have finalizers, cleanups, and weak
pointers to the package that created and applied them.
</b>
</p>
<p>
This is related to the previous point, but is worth calling out
explicitly, since it's a very powerful pattern for using these
features in a less error-prone way.
For example, the <a href="/pkg/unique">unique package</a> uses
weak pointers under the hood, but completely encapsulates the objects
that are weakly pointed-to.
Those values can never be mutated by the rest of the application;
they can only be copied out through the
<a href="/pkg/unique#Handle.Value">Value method</a>, preserving
the illusion of infinite memory for package users.
</p>
</li>
<li>
<p>
<b>
Prefer cleaning up non-memory resources deterministically when possible,
with finalizers and cleanups as a fallback.
</b>
</p>
<p>
Cleanups and finalizers are a good fit for memory resources such as
memory allocated externally, like from C, or references to an
<code>mmap</code> mapping.
Memory allocated by C's malloc must eventually be freed by C's free.
A finalizer that calls <code>free</code>, attached to a wrapper object
for the C memory, is a reasonable way to ensure that C memory is
eventually reclaimed as a consequence of garbage collection.
</p>
<p>
However, non-memory resources, like file descriptors, tend to be subject to
system limits that the Go runtime is generally unaware of.
In addition, the timing of the garbage collector in a given Go program
is usually something a package author has little control over (for instance,
how often the GC runs is controlled by <a href="#GOGC">GOGC</a>, which can
be set by operators to a variety of different values in practice).
These two facts conspire to make cleanups and finalizers a bad fit to use
as the only mechanism for releasing non-memory resources.
</p>
<p>
If you're a package author exposing an API that wraps some non-memory
resource, consider providing an explicit API for releasing the resource
deterministically (through a <code>Close</code> method, or something similar),
rather than relying on the garbage collector through cleanups or finalizers.
Instead, prefer to use cleanups and finalizers as a best-effort handler for
programmer mistakes, either by cleaning up the resource anyway like
<a href="/pkg/os#File">os.File</a> does, or by reporting the failure to
deterministically clean up back to the user (a sketch of this pattern appears
after this list).
</p>
</li>
<li>
<p>
<b>
Prefer cleanups to finalizers.
</b>
</p>
<p>
Historically, finalizers were added to simplify the interface between Go code
and C code and to clean up non-memory resources.
The intended use was to apply them to wrapper objects that owned C memory or
some other non-memory resource, so that the resource could be released once
Go code was done using it.
These reasons at least partially explain why finalizers are narrowly scoped,
why any given object can only have one finalizer, and why that finalizer must
be attached to the first byte of the object only.
This limitation already stifles some use-cases.
For example, any package that wishes to internally cache some information about
an object passed to it cannot clean up that information once the object is gone.
</p>
<p>
But worse than that, finalizers are inefficient and error-prone due to the fact
that they <a href="https://en.wikipedia.org/wiki/Object_resurrection">resurrect
the object</a> they're attached to, so that it can be passed to the finalizer
function (and even continue to live beyond that, too).
This simple fact means that if the object is part of a reference cycle it can
never be freed, and the memory backing the object cannot be reused until at
least the following garbage collection cycle.
</p>
<p>
Because finalizers resurrect objects, though, they do have a better-defined
execution order than cleanups.
For this reason, finalizers are still potentially (but rarely) useful for
cleaning up structures that have complex destruction ordering requirements.
</p>
<p>
But for all other uses in Go 1.24 and beyond, we recommend you use cleanups
because they are more flexible, less error-prone, and more efficient than
finalizers.
</p>
</li>
</ul>
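<p>
To tie this advice together, below is a minimal sketch of the
<code>Close</code>-plus-backstop pattern described above (the package, type,
and function names are illustrative):
</p>
<pre>
package fdwrap

import (
    "runtime"
    "syscall"
)

// File wraps a raw file descriptor.
// A cleanup closes the descriptor as a best-effort backstop in case
// the caller forgets to call Close.
type File struct {
    fd      int
    cleanup runtime.Cleanup
}

func Open(path string) (*File, error) {
    fd, err := syscall.Open(path, syscall.O_RDONLY, 0)
    if err != nil {
        return nil, err
    }
    f := &amp;File{fd: fd}
    // Pass fd, not f: the cleanup function and its argument must not
    // keep f reachable, or the cleanup will never run.
    f.cleanup = runtime.AddCleanup(f, func(fd int) {
        syscall.Close(fd)
    }, fd)
    return f, nil
}

// Close releases the descriptor deterministically and cancels the
// backstop so the cleanup cannot close the descriptor twice.
func (f *File) Close() error {
    f.cleanup.Stop()
    return syscall.Close(f.fd)
}
</pre>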
<h4 id="Common_cleanup_issues">Common cleanup issues</h4>
<ul>
<li>
<p>
Objects with attached cleanups must not be reachable from the cleanup
function (for example, through a captured local variable).
This will prevent the object from being reclaimed and the cleanup from
ever running.
</p>
</li>
<pre>
f := new(myFile)
f.fd = syscall.Open(...)
runtime.AddCleanup(f, func(fd int) {
    syscall.Close(f.fd) // Mistake: We reference f, so this cleanup won't run!
}, f.fd)
</pre>
<li>
<p>
Objects with attached cleanups must not be reachable from the argument
to the cleanup function.
This will prevent the object from being reclaimed and the cleanup from
ever running.
</p>
</li>
<pre>
f := new(myFile)
f.fd = syscall.Open(...)
runtime.AddCleanup(f, func(f *myFile) {
    syscall.Close(f.fd)
}, f) // Mistake: We reference f, so this cleanup wouldn't ever run. This specific case also panics.
</pre>
<li>
<p>
Finalizers have a well-defined execution order, but cleanups do not.
Cleanups can also run concurrently with one another.
</p>
</li>
<li>
<p>
Long running cleanups should create a goroutine to avoid
blocking the execution of other cleanups.
</p>
</li>
<li>
<p>
<code>runtime.GC</code> will not wait until cleanups for unreachable
objects are executed, only until they are all queued.
</p>
</li>
</ul>
<h4 id="Common_weak_pointer_issues">Common weak pointer issues</h4>
<ul>
<li>
<p>
Weak pointers can begin returning <code>nil</code> from their
<code>Value</code> method at unexpected times.
Always guard the call to <code>Value</code> with a <code>nil</code>
check and have a backup plan.
</p>
</li>
<li>
<p>
When weak pointers are used as map keys, they do not affect the
reachability of map values.
Therefore, if a weak pointer map key points to an object that is also
reachable from the map value, that object will still be considered
reachable.
</p>
</li>
</ul>
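<p>
Below is a minimal sketch of guarding <code>Value</code> with a backup plan,
per the first issue above (the <code>config</code> type and <code>load</code>
function are hypothetical, and a real cache would also need synchronization):
</p>
<pre>
import "weak"

type config struct{ data []byte } // hypothetical cached value

func load() *config { ... } // hypothetical: recompute the value

var cached weak.Pointer[config]

// get returns the cached config, reloading it if the GC has already
// reclaimed the old one. Value may return nil at any time.
func get() *config {
    if c := cached.Value(); c != nil {
        return c
    }
    c := load()
    cached = weak.Make(c)
    return c
}
</pre>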
<h4 id="Common_finalizer_issues">Common finalizer issues</h4>
<ul>
<li>
<p>
Objects with attached finalizers must not be reachable from themselves
by any path (in other words, they cannot be in a reference cycle).
This will prevent the object from being reclaimed and the finalizer from
ever running.
</p>
</li>
<pre>
f := new(myCycle)
f.self = f // Mistake: f is reachable from f, so this finalizer would never run.
runtime.SetFinalizer(f, func(f *myCycle) {
    ...
})
</pre>
<li>
<p>
Objects with attached finalizers must not be reachable from the finalizer
function (for example, through a captured local variable).
This will prevent the object from being reclaimed and the finalizer from
ever running.
</p>
</li>
<pre>
f := new(myFile)
f.fd = syscall.Open(...)
runtime.SetFinalizer(f, func(_ *myFile) {
    syscall.Close(f.fd) // Mistake: We reference the outer f, so this finalizer won't run!
})
</pre>
<li>
<p>
Reference chains of objects with attached finalizers (say, in a linked list)
take, at minimum, as many GC cycles as there are objects in the chain
to clean them all up.
Keep finalizers shallow!
</p>
</li>
<pre>
// Mistake: reclaiming this linked list will take at least 10 GC cycles.
node := new(linkedListNode)
for range 10 {
    tmp := new(linkedListNode)
    tmp.next = node
    node = tmp
    runtime.SetFinalizer(node, func(node *linkedListNode) {
        ...
    })
}
</pre>
<li>
<p>
Avoid placing finalizers on objects returned at package boundaries.
This makes it possible for users of your package to call
<code>runtime.SetFinalizer</code> to mutate the finalizer on the object
you return, which can be an unexpected behavior that users of your package
may end up relying on.
</p>
</li>
<li>
<p>
Long running finalizers should create a new goroutine
to avoid blocking the execution of other finalizers.
</p>
</li>
<li>
<p>
<code>runtime.GC</code> will not wait until finalizers for unreachable
objects are executed, only until they are all queued.
</p>
</li>
</ul>
<h4 id="Testing_object_death">Testing object death</h4>
<p>
When using these features, it can sometimes be tricky to write tests for code that
uses them.
Here are some tips for writing robust tests for code that uses these features.
</p>
<ul>
<li>
Avoid running such tests in parallel with other tests.
It helps a lot to increase determinism as much as possible and to have
a good handle on the state of the world at any given time.
</li>
<li>
Use <code>runtime.GC</code> to establish a baseline upon entering the
test.
Use <code>runtime.GC</code> to force weak pointers to <code>nil</code>,
and to queue up cleanups and finalizers to run.
</li>
<li>
<p>
<code>runtime.GC</code> does not wait for cleanups and finalizers to run,
it only queues them.
</p>
<p>
To write the most robust tests possible, inject a way to block on a cleanup
or finalizer from your test (for example, pass an optional channel to the
cleanup and/or finalizer from the test, and write to the channel once it
has finished executing).
If this is too hard or impossible, an alternative is to spin on a particular
post-cleanup state to be true.
For example, the <code>os</code> tests call <code>runtime.Gosched</code> in
a loop that checks whether a file has been closed, once it becomes
unreachable.
</p>
</li>
<li>
<p>
If you are writing tests for code that uses finalizers, and that code can
create a chain of objects with finalizers, you will need at minimum as many
<code>runtime.GC</code> calls as the length of the deepest chain the test can
create to ensure all the finalizers run.
</p>
</li>
<li>
<p>
Test in race mode to discover races between concurrent cleanups, and
between cleanup and finalizer code and the rest of the codebase.
</p>
</li>
</ul>
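<p>
Putting these tips together, here is a sketch of a test that blocks until a
cleanup runs by passing it a channel (the <code>resource</code> type is
hypothetical):
</p>
<pre>
import (
    "runtime"
    "testing"
    "time"
)

type resource struct{} // hypothetical type under test

func TestCleanupRuns(t *testing.T) {
    done := make(chan struct{})
    obj := new(resource)
    // The argument must not reference obj, or the cleanup never runs.
    runtime.AddCleanup(obj, func(ch chan struct{}) {
        close(ch) // signal the test once the cleanup executes
    }, done)
    obj = nil    // drop the last reference
    runtime.GC() // queue the cleanup for the now-unreachable object
    select {
    case &lt;-done:
        // Success: the cleanup ran.
    case &lt;-time.After(5 * time.Second):
        t.Fatal("cleanup did not run")
    }
}
</pre>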
<!-- TODO: Add a short section about non-steady-state behavior. -->
<h3 id="Additional_resources">Additional resources</h3>
<p>
While the information presented above is accurate, it lacks the detail to
fully understand costs and trade-offs in the Go GC's design.
For more information, see the following additional resources.
</p>
<ul>
<li>
<a href="https://gchandbook.org/">The GC Handbook</a>&mdash;An
excellent general resource and reference on garbage collector design.
</li>
<li>
<a href="https://google.github.io/tcmalloc/design.html">TCMalloc</a>&mdash;Design
document for the C/C++ memory allocator TCMalloc, which the Go memory allocator is based on.
</li>
<li>
<a href="/blog/go15gc">Go 1.5 GC announcement</a>&mdash;The
blog post announcing the Go 1.5 concurrent GC, which describes the algorithm in more detail.
</li>
<li>
<a href="/blog/ismmkeynote">Getting to Go</a>&mdash;An
in-depth presentation about the evolution of Go's GC design up to 2018.
</li>
<li>
<a href="https://docs.google.com/document/d/1wmjrocXIWTr1JxU-3EQBI6BK6KgtiFArkG47XK73xIQ/edit">Go 1.5 concurrent GC pacing</a>&mdash;Design
document for determining when to start a concurrent mark phase.
</li>
<li>
<a href="/issue/30333">Smarter scavenging</a>&mdash;Design
document for revising the way the Go runtime returns memory to the operating system.
</li>
<li>
<a href="/issue/35112">Scalable page allocator</a>&mdash;Design
document for revising the way the Go runtime manages memory it gets from the operating system.
</li>
<li>
<a href="/issue/44167">GC pacer redesign (Go 1.18)</a>&mdash;Design
document for revising the algorithm to determine when to start a concurrent mark phase.
</li>
<li>
<a href="/issue/48409">Soft memory limit (Go 1.19)</a>&mdash;Design
document for the soft memory limit.
</li>
</ul>
<h2 id="A_note_about_virtual_memory">A note about virtual memory</h2>
<p>
This guide has largely focused on the physical memory use of the GC, but a
question that comes up regularly is what exactly that means and how it compares
to virtual memory (typically presented in programs like <code>top</code> as
"VSS").
</p>
<p>
Physical memory is memory housed in the actual physical RAM chip in most
computers.
<a href="https://en.wikipedia.org/wiki/Virtual_memory">Virtual memory</a> is an
abstraction over physical memory provided by the operating system to isolate
programs from one another.
It's also typically acceptable for programs to reserve virtual address space
that doesn't map to any physical addresses at all.
</p>
<p>
<b>
Because virtual memory is just a mapping maintained by the operating system,
it is typically very cheap to make large virtual memory reservations that don't
map to physical memory.
</b>
</p>
<p>
The Go runtime generally relies upon this view of the cost of virtual memory in
a few ways:
</p>
<ul>
<li>
<p>
The Go runtime never deletes virtual memory that it maps.
Instead, it uses special operations that most operating systems
provide to explicitly release any physical memory resources
associated with some virtual memory range.
</p>
<p>
This technique is used explicitly to manage the
<a href="#Memory_limit">memory limit</a> and return memory to the
operating system that the Go runtime no longer needs.
The Go runtime also releases memory it no longer needs continuously
in the background.
See <a href="#Additional_resources">the additional resources</a> for
more information.
</p>
</li>
<li>
<p>
On 32-bit platforms, the Go runtime reserves between 128 MiB and 512 MiB
of address space up-front for the heap to limit fragmentation issues.
</p>
</li>
<li>
<p>
The Go runtime uses large virtual memory address space reservations
in the implementation of several internal data structures.
On 64-bit platforms, these typically have a minimum virtual memory
footprint of about 700 MiB.
On 32-bit platforms, their footprint is negligible.
</p>
</li>
</ul>
<p>
As a result, virtual memory metrics such as "VSS" in <code>top</code> are
typically not very useful in understanding a Go program's memory footprint.
Instead, focus on "RSS" and similar measurements, which more directly reflect
physical memory usage.
</p>
<h2 id="Optimization_guide">Optimization guide</h2>
<h3 id="Identifying_costs">Identifying costs</h3>
<p>
Before trying to optimize how your Go application interacts with the GC, it's
important to first identify that the GC is a major cost in the first place.
</p>
<p>
The Go ecosystem provides a number of tools for identifying costs and optimizing
Go applications.
For a brief overview of these tools, see the
<a href="/doc/diagnostics">guide on diagnostics</a>.
Here, we'll focus on a subset of these tools and a reasonable order in which
to apply them to understand GC impact and behavior.
</p>
<ol>
<li>
<p>
<b>CPU profiles</b>
</p>
<p>
A good place to start is with
<a href="https://pkg.go.dev/runtime/pprof#hdr-Profiling_a_Go_program">CPU profiling</a>.
CPU profiling provides an overview of where CPU time is spent, though to the
untrained eye it may be difficult to identify the magnitude of the role the
GC plays in a particular application.
Luckily, understanding how the GC fits in mostly boils down to knowing what
different functions in the <code>runtime</code> package mean.
Below is a useful subset of these functions for interpreting CPU profiles.
</p>
<p>
Note that the functions listed below are not leaf functions, so they may not
show up in the default view the <code>pprof</code> tool provides with the
<code>top</code> command.
Instead, use the <code>top -cum</code> command or use the <code>list</code>
command on these functions directly and focus on the cumulative percent column.
</p>
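<p>
For example, the following is a sketch of an interactive <code>pprof</code>
session, assuming a CPU profile has already been saved to a hypothetical
file <code>cpu.out</code>:
</p>
<pre>
$ go tool pprof cpu.out
(pprof) top -cum
(pprof) list runtime.gcBgMarkWorker
</pre>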
<ul>
<li>
<p>
<b><code>runtime.gcBgMarkWorker</code></b>: Entrypoint to the
background mark worker goroutines.
Time spent here scales with GC frequency and the complexity and
size of the object graph.
It represents a baseline for how much time the application spends
marking and scanning.
</p>
<p>
Note that within these goroutines, you will find calls to
<code>runtime.gcDrainMarkWorkerDedicated</code>,
<code>runtime.gcDrainMarkWorkerFractional</code>, and
<code>runtime.gcDrainMarkWorkerIdle</code>,
which indicate worker type.
In a largely idle Go application, the Go GC is going to use up
additional (idle) CPU resources to get its job done faster, which
is indicated with the <code>runtime.gcDrainMarkWorkerIdle</code>
symbol.
As a result, time here may represent a large fraction of CPU
samples, reflecting CPU time that the Go GC believes would
otherwise go unused.
If the application becomes more active, CPU time in idle workers
will drop.
One common reason this can happen is if an application runs entirely
in one goroutine but <code>GOMAXPROCS</code> is &gt;1.
</p>
</li>
<li>
<p>
<b><code>runtime.mallocgc</code></b>: Entrypoint to the memory
allocator for heap memory.
A large amount of cumulative time spent here (&gt;15%)
typically indicates a lot of memory being allocated.
</p>
</li>
<li>
<p>
<b><code>runtime.gcAssistAlloc</code></b>: Function goroutines
enter to yield some of their time to assist the GC with scanning
and marking.
A large amount of cumulative time spent here (&gt;5%) indicates
that the application is likely out-pacing the GC with respect to how
fast it's allocating.
It indicates a particularly high degree of impact from the GC,
and also represents time the application spends marking and scanning.
Note that this is included in the <code>runtime.mallocgc</code>
call tree, so it will inflate that as well.
</p>
</li>
</ul>
</li>
<li>
<p>
<b>Execution traces</b>
</p>
<p>
While CPU profiles are great for identifying where time is spent in
aggregate, they're less useful for indicating performance costs that
are more subtle, rare, or related to latency specifically.
Execution traces, on the other hand, provide a rich and deep view into a
short window of a Go program's execution.
They contain a variety of events related to the Go GC, allowing specific
execution paths, and how the application interacts with the GC, to be
observed directly.
All the GC events tracked are conveniently labeled as such in the trace
viewer.
</p>
<p>
See the <a href="https://pkg.go.dev/runtime/trace">documentation for the
<code>runtime/trace</code></a> package for how to get started with
execution traces.
</p>
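<p>
As a minimal sketch, an execution trace may be collected by bracketing the
region of interest with <code>trace.Start</code> and <code>trace.Stop</code>;
the resulting file (the name <code>trace.out</code> here is arbitrary) can
then be inspected with <code>go</code> <code>tool</code> <code>trace</code>:
</p>
<pre>
package main

import (
	"log"
	"os"
	"runtime/trace"
)

func main() {
	f, err := os.Create("trace.out")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	// Record everything between Start and Stop.
	if err := trace.Start(f); err != nil {
		log.Fatal(err)
	}
	defer trace.Stop()

	// ... application code ...
}
</pre>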
</li>
<li>
<p>
<b>GC traces</b>
</p>
<p>
When all else fails, the Go GC offers a few specific traces that give
much deeper insight into GC behavior.
These traces are always printed directly to STDERR, one line per GC cycle,
and are configured through the <code>GODEBUG</code> environment variable
that all Go programs recognize.
They're mostly useful for debugging the Go GC itself since they require
some familiarity with the specifics of the GC's implementation, but
nonetheless can occasionally be useful to gain a better understanding of
GC behavior.
</p>
<p>
The core GC trace is enabled by setting <code>GODEBUG=gctrace=1</code>.
The output produced by this trace is documented in the
<a href="https://pkg.go.dev/runtime#hdr-Environment_Variables">environment
variables section in the documentation for the <code>runtime</code>
package</a>.
</p>
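<p>
For instance, to run a hypothetical binary <code>myprogram</code> with the
GC trace enabled, printing one line to STDERR per GC cycle:
</p>
<pre>
$ GODEBUG=gctrace=1 ./myprogram
</pre>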
<p>
A supplementary GC trace called the "pacer trace" provides even deeper
insights and is enabled by setting <code>GODEBUG=gcpacertrace=1</code>.
Interpreting this output requires an understanding of the GC's "pacer"
(see <a href="#Additional_resources">additional resources</a>), which is
outside the scope of this guide.
</p>
</li>
</ol>
<h3 id="Eliminating_heap_allocations">Eliminating heap allocations</h3>
<p>
One way to reduce costs from the GC is to have the GC manage fewer values to begin
with.
The techniques described below can produce some of the largest improvements in
performance, because as the <a href="#GOGC">GOGC section</a> demonstrated, the
allocation rate of a Go program is a major factor in GC frequency, the key
cost metric used by this guide.
</p>
<h4 id="Heap_profiling">Heap profiling</h4>
<p>
After <a href="#Identifying_costs">identifying that the GC is a source of
significant costs</a>, the next step in eliminating heap allocations is to
find out where most of them are coming from.
For this purpose, memory profiles (really, heap memory profiles) are very
useful.
Check out the <a href="https://pkg.go.dev/runtime/pprof#hdr-Profiling_a_Go_program">
documentation</a> for how to get started with them.
</p>
<p>
Memory profiles describe where in the program heap allocations come from,
identifying them by the stack trace at the point they were allocated.
Each memory profile can break down memory in four ways.
</p>
<ul>
<li><code>inuse_objects</code>&mdash;Breaks down the number of objects that
are live.</li>
<li><code>inuse_space</code>&mdash;Breaks down live objects by how much
memory they use in bytes.</li>
<li><code>alloc_objects</code>&mdash;Breaks down the number of objects
that have been allocated since the Go program began executing.</li>
<li><code>alloc_space</code>&mdash;Breaks down the total amount of memory
allocated since the Go program began executing.</li>
</ul>
<p>
Switching between these different views of heap memory may be done with either
the <code>-sample_index</code> flag to the <code>pprof</code> tool, or via the
<code>sample_index</code> option when the tool is used interactively.
</p>
<p class="gc-guide-note">
Note: memory profiles by default only sample a subset of heap objects so they
will not contain information about every single heap allocation.
However, this is sufficient to find hot-spots.
To change the sampling rate, see
<a href="https://pkg.go.dev/runtime#pkg-variables"><code>runtime.MemProfileRate</code></a>.
</p>
<p>
For the purposes of reducing GC costs, <code>alloc_space</code> is typically the
most useful view as it directly corresponds to the allocation rate.
This view will indicate allocation hot spots that would provide the most benefit.
</p>
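<p>
For example, the following sketch opens a heap profile (saved to a
hypothetical file <code>heap.out</code>) in the <code>alloc_space</code> view
and lists the heaviest allocation sites:
</p>
<pre>
$ go tool pprof -sample_index=alloc_space heap.out
(pprof) top
</pre>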
<h4 id="Escape_analysis">Escape analysis</h4>
<p>
Once candidate heap allocation sites have been identified with the help of
<a href="#Heap_profiling">heap profiles</a>, how can they be eliminated?
The key is to leverage the Go compiler's escape analysis to have the compiler
find alternative, more efficient storage for this memory, for example
on the goroutine stack.
Luckily, the Go compiler has the ability to describe why it decides to escape
a Go value to the heap.
With that knowledge, it becomes a matter of reorganizing your source code to
change the outcome of the analysis (which is often the hardest part, but outside
the scope of this guide).
</p>
<p>
As for how to access the information from the Go compiler's escape analysis, the
simplest way is through a debug flag supported by the Go compiler that describes
all optimizations it applied or did not apply to some package in a text format.
This includes whether or not values escape.
Try the following command, where <code>[package]</code> is some Go package path.
</p>
<pre>
$ go build -gcflags=-m=3 [package]
</pre>
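<p>
As an illustration, consider the following sketch (the package name
<code>demo</code> is hypothetical), in which one value escapes to the heap
because its address outlives the function, and another does not:
</p>
<pre>
package demo

// escapes returns the address of a local variable, so the variable must
// outlive the function and is heap-allocated.
func escapes() *int {
	x := 42
	return &amp;x
}

// stays takes the address of a local variable but never lets the pointer
// outlive the function, so x can remain on the stack.
func stays() int {
	x := 42
	y := &amp;x
	return *y
}
</pre>
<p>
The exact diagnostics vary between compiler versions, but with
<code>-gcflags=-m</code> the first function is reported as moving
<code>x</code> to the heap, while the second is not.
</p>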
<p>
This information can also be visualized as an overlay in
an LSP-capable editor; it is exposed as a code action.
For example, in VS Code, invoke the "Source Action... > Show compiler
optimization details" command to enable diagnostics for the current package.
(You can also run the "Go: Toggle compiler optimization details"
command.)
To control which annotations are displayed, enable the overlay for escape
analysis by
<a href="https://github.com/golang/vscode-go/wiki/settings#uidiagnosticannotations">
setting <code>ui.diagnostic.annotations</code> to include <code>escape</code></a>.
</p>
<p>
Finally, the Go compiler provides this information in a machine-readable (JSON)
format that may be used to build additional custom tooling.
For more information on that, see the
<a href="https://cs.opensource.google/go/go/+/master:src/cmd/compile/internal/logopt/log_opts.go;l=25;drc=351e0f4083779d8ac91c05afebded42a302a6893">
documentation in the source Go code</a>.
</p>
<h3 id="Implementation-specific_optimizations">Implementation-specific optimizations</h3>
<p>
The Go GC is sensitive to the demographics of live memory, because a complex
graph of objects and pointers both limits parallelism and generates more work
for the GC.
As a result, the GC contains a few optimizations for specific common structures.
The most directly useful ones for performance optimization are listed below.
</p>
<p class="gc-guide-note">
Note: Applying the optimizations below may reduce the readability of your code
by obscuring intent, and may fail to hold up across Go releases.
Prefer to apply these optimizations only in the places they matter most.
Such places may be identified by using the tools listed in the
<a href="#Identifying_costs">section on identifying costs</a>.
</p>
<ul>
<li>
<p>
Pointer-free values are segregated from other values.
</p>
<p>
As a result, it may be advantageous to eliminate pointers from data
structures that do not strictly need them, as this reduces the cache
pressure the GC exerts on the program.
Consequently, data structures that rely on indices instead of pointer values,
while less well-typed, may perform better.
This is only worth doing if it's clear that the object graph is complex
and the GC is spending a lot of time marking and scanning.
</p>
</li>
<li>
<p>
The GC will stop scanning values at the last pointer in the value.
</p>
<p>
As a result, it may be advantageous to group pointer fields in
struct-typed values at the beginning of the value.
This is only worth doing if it's clear the application spends a lot of its
time marking and scanning.
(In theory the compiler can do this automatically, but it is not yet
implemented, and struct fields are arranged as written in the source
code.)
</p>
</li>
</ul>
<p>
Furthermore, the GC must interact with nearly every pointer it sees, so using
indices into a slice, for example, instead of pointers, can aid in reducing GC
costs.
</p>
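<p>
As a minimal sketch of both techniques (all type names here are hypothetical),
compare a pointer-based tree node to an index-based equivalent, and note the
pointer fields grouped at the front:
</p>
<pre>
package demo

// PtrNode is a conventional pointer-based tree node. Its pointer fields come
// first, so the GC stops scanning each node after Right.
type PtrNode struct {
	Left, Right *PtrNode // pointers: must be scanned by the GC
	Value       int64    // pointer-free: never scanned
}

// IdxNode refers to its children by index into a backing slice instead of by
// pointer. A []IdxNode contains no pointers at all, so the GC can skip its
// contents entirely.
type IdxNode struct {
	Left, Right int32 // indices into Tree.nodes; -1 means "no child"
	Value       int64
}

// Tree stores all of its nodes in one pointer-free slice.
type Tree struct {
	nodes []IdxNode
}
</pre>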
<h3 id="Linux_transparent_huge_pages">Linux transparent huge pages (THP)</h3>
<p>
When a program accesses memory, the CPU needs to translate the
<a href="#A_note_about_virtual_memory">virtual memory</a> addresses it uses
into physical memory addresses that refer to the data it was trying to access.
To do this, the CPU consults the "page table," a data structure that represents
the mapping from virtual to physical memory, managed by the operating system.
Each entry in the page table represents an indivisible block of physical memory
called a page, hence the name.
</p>
<p>
Transparent huge pages (THP) is a Linux feature that transparently replaces pages of
physical memory backing contiguous virtual memory regions with bigger blocks of memory
called huge pages.
By using bigger blocks, fewer page table entries are needed to represent the same memory
region, improving page table lookup times.
However, bigger blocks mean more waste if only a small part of the huge page is used
by the system.
</p>
<p>
When running Go programs in production, enabling transparent huge pages on Linux
can improve throughput and latency at the cost of additional memory use.
Applications with small heaps tend not to benefit from THP and may end up using a
substantial amount of additional memory (as high as 50%).
However, applications with big heaps (1 GiB or more) tend to benefit quite a bit
(up to 10% throughput) without very much additional memory overhead (1-2% or less).
Being aware of your THP settings in either case can be helpful, and experimentation
is always recommended.
</p>
<p>
One can enable or disable transparent huge pages in a Linux environment by modifying
<code>/sys/kernel/mm/transparent_hugepage/enabled</code>.
See the
<a href="https://www.kernel.org/doc/html/next/admin-guide/mm/transhuge.html">official
Linux admin guide</a> for more details.
If you choose to have your Linux production environment enable transparent huge pages,
we recommend the following additional settings for Go programs.
</p>
<ul>
<li>
<p>
Set <code>/sys/kernel/mm/transparent_hugepage/defrag</code>
to <code>defer</code> or <code>defer+madvise</code>.
<br />
<br />
This setting controls how aggressively a Linux kernel coalesces regular
pages into huge pages.
<code>defer</code> tells the kernel to coalesce huge pages lazily
and in the background.
A more aggressive setting can induce stalls in memory-constrained systems
and can often hurt application latencies.
<code>defer+madvise</code> is like <code>defer</code>, but is friendlier
to other applications on the system that request huge pages explicitly and
require them for performance.
</p>
</li>
<li>
<p id="Linux_THP_max_ptes_none_workaround">
Set <code>/sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_none</code>
to <code>0</code>.
<br />
<br />
This setting controls how many additional pages the Linux kernel daemon
can allocate when trying to allocate a huge page.
The default setting is maximally aggressive, and can often
<a href="https://bugzilla.kernel.org/show_bug.cgi?id=93111">undo work the Go
runtime does to return memory to the OS</a>.
Before Go 1.21, the Go runtime tried to mitigate the negative effects of the
default setting, but it came with a CPU cost.
With Go 1.21+ and Linux 6.2+, the Go runtime no longer mutates huge page
state.
<br />
<br />
If you experience an increase in memory usage when upgrading to Go 1.21.1 or
later, try applying this setting; it will likely resolve your issue.
As an additional workaround, you can call
<a href="/pkg/golang.org/x/sys/unix#Prctl">the <code>Prctl</code>
function</a> with <code>PR_SET_THP_DISABLE</code> to disable huge pages at
the process level, or you can set <code>GODEBUG=disablethp=1</code> (to be
added in Go 1.21.6 and Go 1.22) to disable huge pages for heap memory.
Note that the <code>GODEBUG</code> setting may be removed in a future release.
</p>
</li>
</ul>
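<p>
One way to inspect and apply these settings on a Linux machine is sketched
below (note that writes to <code>sysfs</code> do not persist across reboots;
use your distribution's configuration mechanism to make them permanent):
</p>
<pre>
$ cat /sys/kernel/mm/transparent_hugepage/enabled
always [madvise] never
$ echo defer+madvise | sudo tee /sys/kernel/mm/transparent_hugepage/defrag
$ echo 0 | sudo tee /sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_none
</pre>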
<h2 id="Appendix">Appendix</h2>
<h3 id="Additional_notes_on_GOGC">Additional notes on GOGC</h3>
<p>
The <a href="#GOGC">GOGC section</a> claimed that doubling GOGC doubles heap
memory overheads and halves GC CPU costs.
To see why, let's break it down mathematically.
</p>
<p>
Firstly, the heap target sets a target for the total heap size.
This target, however, mainly influences the new heap memory, because the live
heap is fundamental to the application.
</p>
<p class="gc-guide-equation">
<i>
Target heap memory = Live heap + (Live heap + GC roots) * GOGC / 100
</i>
</p>
<p class="gc-guide-equation">
<i>
Total heap memory = Live heap + New heap memory
</i>
</p>
<p class="gc-guide-equation">
<i>
&rArr;
</i>
</p>
<p class="gc-guide-equation">
<i>
New heap memory = (Live heap + GC roots) * GOGC / 100
</i>
</p>
<p>
From this we can see that doubling GOGC would also double the amount of new heap
memory that the application will allocate each cycle, which captures heap memory
overheads.
Note that <i>Live heap + GC roots</i> is an approximation of the amount of
memory the GC needs to scan.
</p>
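<p>
As a concrete example, consider an application with a live heap of 100 MiB and
negligible GC roots.
At GOGC=100, new heap memory per cycle is 100 MiB, for a target heap of
200 MiB; at GOGC=200, it is 200 MiB, for a target heap of 300 MiB.
The new heap memory, and hence the memory overhead beyond the live heap,
doubles.
</p>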
<p>
Next, let's look at GC CPU cost.
Total cost can be broken down as the cost per cycle, times GC frequency over
some time period T.
</p>
<p class="gc-guide-equation">
<i>
Total GC CPU cost = (GC CPU cost per cycle) * (GC frequency) * T
</i>
</p>
<p>
GC CPU cost per cycle can be derived from the
<a href="#Understanding_costs">GC model</a>:
</p>
<p class="gc-guide-equation">
<i>
GC CPU cost per cycle = (Live heap + GC roots) * (Cost per byte) + Fixed cost
</i>
</p>
<p>
Note that sweep phase costs are ignored here as mark and scan costs dominate.
</p>
<p>
The steady state is defined by a constant allocation rate and a constant cost
per byte, so in the steady state we can derive a GC frequency from this new heap
memory:
</p>
<p class="gc-guide-equation">
<i>
GC frequency = (Allocation rate) / (New heap memory) = (Allocation rate) / ((Live heap + GC roots) * GOGC / 100)
</i>
</p>
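<p>
Continuing the example above, if the application allocates at a steady
50 MiB/s, then at GOGC=100 the GC runs 0.5 times per second (once every
2 seconds), while at GOGC=200 it runs 0.25 times per second: doubling GOGC
halves GC frequency, and with it the marginal CPU cost.
</p>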
<p>
Putting this together, we get the full equation for the total cost:
</p>
<p class="gc-guide-equation">
<i>
Total GC CPU cost = (Allocation rate) / ((Live heap + GC roots) * GOGC / 100) * ((Live heap + GC roots) * (Cost per byte) + Fixed cost) * T
</i>
</p>
<p>
For a sufficiently large heap (which represents most cases), the marginal costs
of a GC cycle dominate the fixed costs.
This allows for a significant simplification of the total GC CPU cost formula.
</p>
<p class="gc-guide-equation">
<i>
Total GC CPU cost = (Allocation rate) / (GOGC / 100) * (Cost per byte) * T
</i>
</p>
<p>
From this simplified formula, we can see that if we double GOGC, we halve total
GC CPU cost.
(Note that the visualizations in this guide do simulate fixed costs, so the GC
CPU overheads reported by them will not exactly halve when GOGC doubles.)
Furthermore, GC CPU costs are largely determined by allocation rate and the
cost per byte to scan memory.
For more information on how to reduce these costs specifically, see the
<a href="#Optimization_guide">optimization guide</a>.
</p>
<p class="gc-guide-note">
Note: there exists a discrepancy between the size of the live heap, and the
amount of that memory the GC actually needs to scan: the same size live heap but
with a different structure will result in a different CPU cost, but the same
memory cost, resulting in a different trade-off.
This is why the structure of the heap is part of the definition of the
steady state.
The heap target should arguably only include the scannable live heap as a closer
approximation of memory the GC needs to scan, but this leads to degenerate
behavior when there's a very small amount of scannable live heap but the live
heap is otherwise large.
</p>
<script src="/js/d3.js"></script>
<script async src="/doc/gc-guide.js"></script>