| # User-configurable memory target |
| |
| Author: Michael Knyszek |
| |
| Updated: 16 February 2021 |
| |
| ## Background |
| |
| Issue [#23044](https://golang.org/issue/23044) proposed the addition of some |
| kind of API to provide a "minimum heap" size; that is, the minimum heap goal |
| that the GC would ever set. |
| The purpose of a minimum heap size, as explored in that proposal, is as a |
| performance optimization: by preventing the heap from shrinking, each GC cycle |
| will get longer as the live heap shrinks further beyond the minimum. |
| |
| While `GOGC` already provides a way for Go users to trade off GC CPU time and |
| heap memory use, the argument against setting `GOGC` higher is that a live heap |
| spike is potentially dangerous, since the Go GC will use proportionally more |
| memory with a high proportional constant. |
| Instead, users (including a [high-profile account by |
| Twitch](https://blog.twitch.tv/en/2019/04/10/go-memory-ballast-how-i-learnt-to-stop-worrying-and-love-the-heap-26c2462549a2/)) |
| have resorted to using a heap ballast: a large memory allocation that the Go GC |
| includes in its live heap size, but does not actually take up any resident |
| pages, according to the OS. |
| This technique thus effectively sets a minimum heap size in the runtime. |
| The main disadvantage of this technique is portability. |
| It relies on implementation-specific behavior, namely that the runtime will not |
| touch that new allocation, thereby preventing the OS from backing that space |
| with RAM on Unix-like systems. |
| It also relies on the Go GC never scanning that allocation. |
| This technique is also platform-specific, because on Windows such an allocation |
| would always count as committed. |
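
For readers unfamiliar with the technique, the following is a minimal sketch of a heap ballast. The function name `nextGCWithBallast` and the 64 MiB size are illustrative assumptions, not taken from the Twitch account; only `runtime.GC`, `runtime.ReadMemStats`, and `runtime.KeepAlive` are real APIs.

```go
package main

import (
	"fmt"
	"runtime"
)

// nextGCWithBallast allocates a pointer-free ballast of the given size, forces
// a GC cycle so the ballast is counted as live heap, and returns the resulting
// heap goal (MemStats.NextGC).
func nextGCWithBallast(size int) uint64 {
	// The ballast is never written to, so on most Unix-like systems the OS
	// does not need to back it with physical pages.
	ballast := make([]byte, size)

	runtime.GC()

	var m runtime.MemStats
	runtime.ReadMemStats(&m)

	// Keep the ballast reachable until after the GC cycle above.
	runtime.KeepAlive(ballast)
	return m.NextGC
}

func main() {
	const ballastSize = 64 << 20 // 64 MiB, an arbitrary size for illustration
	goal := nextGCWithBallast(ballastSize)
	fmt.Printf("heap goal with ballast: %d MiB\n", goal>>20)
}
```

Because the ballast counts toward the live heap, the heap goal should not fall below the ballast size while `GOGC` is non-negative, which is the "minimum heap" effect described above.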
| |
| Today, the Go GC already has a fixed minimum heap size of 4 MiB. |
| The reasons around this minimum heap size stem largely from a failure to account |
| for alternative GC work sources. |
| See [the GC pacer problems meta-issue](https://golang.org/issue/42430) for more |
| details. |
| The problems are resolved by a [proposed GC pacer |
| redesign](https://golang.org/issue/44167). |
| |
| ## Design |
| |
| I propose the addition of the following API to the `runtime/debug` package: |
| |
| ```go |
| // SetMemoryTarget provides a hint to the runtime that it can use at least |
| // amount bytes of memory. amount is the sum total of in-use Go-related memory |
| // that the Go runtime can measure. |
| // |
| // That explicitly includes: |
| // - Space and fragmentation for goroutine stacks. |
| // - Space for runtime structures. |
| // - The size of the heap, with fragmentation. |
| // - Space for global variables (including dynamically-loaded plugins). |
| // |
| // And it explicitly excludes: |
| // - Read-only parts of the Go program, such as machine instructions. |
| // - Any non-Go memory present in the process, such as from C or another |
| // language runtime. |
| // - Memory required to maintain OS kernel resources that this process has a |
| // handle to. |
| // - Memory allocated via low-level functions in the syscall package, like Mmap. |
| // |
| // The intuition and goal with this definition is the ability to treat the Go |
| // part of any system as a black box: runtime overheads and fragmentation that |
| // are otherwise difficult to account for are explicitly included. |
| // Anything that is difficult or impossible for the runtime to measure portably |
| // is excluded. For these cases, the user is encouraged to monitor these |
| // sources for their particular system and update the memory target as |
| // necessary. |
| // |
| // The runtime is free to ignore the hint at any time. |
| // |
| // In practice, the runtime will use this hint to run the garbage collector |
| // less frequently by using up any additional memory up-front. Any memory used |
| // beyond that will obey the GOGC trade-off. |
| // |
| // If the GOGC mechanism is turned off, the hint is always ignored. |
| // |
| // Note that if the memory target is set higher than the amount of memory |
| // available on the system, the Go runtime may attempt to use all that memory, |
| // and trigger an out-of-memory condition. |
| // |
| // An amount of 0 will retract the hint. A negative amount will always be |
| // ignored. |
| // |
| // Returns the old memory target, or -1 if none was previously set. |
| func SetMemoryTarget(amount int) int |
| ``` |
| |
| The design of this feature builds off of the [proposed GC pacer |
| redesign](https://golang.org/issue/44167). |
| |
| I propose we move forward with almost exactly what issue |
| [#23044](https://golang.org/issue/23044) proposed, namely exposing the heap |
| minimum and making it configurable via a runtime API. |
| The behavior of `SetMemoryTarget` is thus analogous to the common (but |
| non-standard) Java runtime flag `-Xms` (with Adaptive Size Policy disabled). |
| With the GC pacer redesign, smooth behavior here should be straightforward to |
| ensure, as the troubles here basically boil down to the "high `GOGC`" issue |
| mentioned in that design. |
| |
| There's one missing piece and that's how to turn the hint (which is memory use) |
| into a heap goal. |
| Because the heap goal includes both stacks and globals, I propose that we |
| compute the heap goal as follows: |
| |
| ``` |
| Heap goal = amount |
| // These are runtime overheads. |
| - MemStats.GCSys |
| - MemStats.MSpanSys |
| - MemStats.MCacheSys |
| - MemStats.BuckHashSys |
| - MemStats.OtherSys |
| - MemStats.StackSys |
| // Fragmentation. |
| - (MemStats.HeapSys-MemStats.HeapInuse) |
| - (MemStats.StackInuse-(unused portions of stack spans)) |
| ``` |
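
As a rough illustration, the formula above can be transcribed literally into Go against `runtime.MemStats`. This is a sketch under assumptions: `heapGoalFromTarget`, `satSub`, and `exampleStats` are hypothetical names, the values in `exampleStats` are made up, and `unusedStack` stands in for "unused portions of stack spans", which `MemStats` does not expose.

```go
package main

import (
	"fmt"
	"runtime"
)

// satSub returns a-b, clamped at zero to avoid unsigned underflow.
func satSub(a, b uint64) uint64 {
	if b > a {
		return 0
	}
	return a - b
}

// heapGoalFromTarget transcribes the formula term by term. unusedStack is a
// hypothetical input: MemStats has no field for unused portions of stack spans.
func heapGoalFromTarget(amount uint64, m *runtime.MemStats, unusedStack uint64) uint64 {
	goal := amount
	// Runtime overheads.
	overheads := []uint64{
		m.GCSys, m.MSpanSys, m.MCacheSys, m.BuckHashSys, m.OtherSys, m.StackSys,
	}
	for _, o := range overheads {
		goal = satSub(goal, o)
	}
	// Fragmentation.
	goal = satSub(goal, satSub(m.HeapSys, m.HeapInuse))
	goal = satSub(goal, satSub(m.StackInuse, unusedStack))
	return goal
}

// exampleStats returns synthetic statistics (made-up values, in bytes) so the
// arithmetic is easy to follow.
func exampleStats() *runtime.MemStats {
	return &runtime.MemStats{
		GCSys: 10, MSpanSys: 20, MCacheSys: 5, BuckHashSys: 5, OtherSys: 10,
		StackSys: 50, HeapSys: 400, HeapInuse: 300, StackInuse: 40,
	}
}

func main() {
	// Overheads sum to 100, heap fragmentation is 100, the stack term is 30.
	fmt.Println(heapGoalFromTarget(1000, exampleStats(), 10))
}
```

In a real program the statistics would come from `runtime.ReadMemStats`; the clamped subtraction is a choice made for this sketch so that a target smaller than the overheads yields zero rather than wrapping around.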
| |
| What this formula leaves us with is a value that should include: |
| |
| 1. Stack space that is actually allocated for goroutine stacks, |
| 1. Global variables (so the part of the binary mapped read-write), and |
| 1. Heap space allocated for objects. |
| |
| These are the three factors that go into determining the `GOGC`-based heap goal |
| according to the GC pacer redesign. |
| |
| Note that while at first it might appear that this definition of the heap goal |
| will cause significant jitter in what the heap goal is actually set to, runtime |
| overheads and fragmentation tend to be remarkably stable over the lifetime of a |
| Go process. |
| |
| In an ideal world, that would be it, but as the API documentation points out, |
| there are a number of sources of memory that are unaccounted for that deserve |
| more explanation. |
| |
| Firstly, there are the read-only parts of the binary, like the instructions |
| themselves, but the impact of these parts on memory use is murkier since the |
| operating system tends to de-duplicate this memory between processes. |
| Furthermore, on platforms like Linux, this memory is always evictable, down to |
| the last available page. |
| As a result, I intentionally ignore that factor here. |
| If the size of the binary is a factor, unfortunately it will be up to the user |
| to subtract out that size from the amount they pass to `SetMemoryTarget`. |
| |
| The next source of memory is anything non-Go, such as C code (or, say, a Python |
| VM) running in the same process. |
| These sources also need to be accounted for by the user because this could be |
| absolutely anything, and portably interacting with the large number of different |
| possibilities is infeasible. |
| Luckily, `SetMemoryTarget` is a run-time API that can be made to respond to |
| changes in external memory sources that Go could not possibly be aware of, so |
| the API documentation recommends updating the target on-line if need be. |
| |
| Another source of memory use is kernel memory. |
| If the Go process holds onto kernel resources that use memory within the kernel |
| itself, those are unaccounted for. |
| Unfortunately, while this API tries to avoid situations where the user needs to |
| make conservative estimates, this is one such case. |
| As far as I know, most systems do not associate kernel memory with a process, so |
| querying and reacting to this information is just impossible. |
| |
| The final source of memory is memory that's created by the Go program, but that |
| the runtime isn't necessarily aware of, like explicitly `Mmap`'d memory. |
| Theoretically the Go runtime could be aware of this specific case, but this is |
| tricky to account for in general given the wide range of options that can be |
| passed to `mmap`-like functionality on various platforms. |
| Sometimes it's worth accounting for it, sometimes not. |
| I believe it's best to leave that up to the user. |
| |
| To validate the design, I ran several [simulations](#simulations) of this |
| implementation. |
| In general, the runtime is resilient to a changing heap target (even one that |
| changes wildly) but shrinking the heap target significantly has the potential to |
| cause GC CPU utilization spikes. |
| This is by design: the runtime suddenly has much less runway than it thought |
| before the change, so it needs to make that up to reach its goal. |
| |
| The only issue I found with this formulation is the potential for consistent |
| undershoot in the case where the heap size is very small, mostly because we |
| place a limit on how late a GC cycle can start. |
| I think this is OK, and I don't think we should alter our current setting. |
| This choice means that in extreme cases, there may be some missed performance. |
| But I don't think it's enough to justify the additional complexity. |
| |
| ### Simulations |
| |
| These simulations were produced by the same tool as those for the [GC pacer |
| redesign](https://github.com/golang/go/issues/44167). |
| That is, |
| [github.com/mknyszek/pacer-model](https://github.com/mknyszek/pacer-model). |
| See the GC pacer design document for a list of caveats and assumptions, as well |
| as a description of each subplot, though the plots are mostly straightforward. |
| |
| **Small heap target.** |
| |
| In this scenario, we set a fairly small target (around 64 MiB) as a baseline. |
| This target is fairly close to what `GOGC` would have picked. |
| Mid-way through the scenario, the live heap grows a bit. |
| |
| ![](44309/low-heap-target.png) |
| |
| Notes: |
| - There's a small amount of overshoot when the live heap size changes, which is |
| expected. |
| - The pacer is otherwise resilient to changes in the live heap size. |
| |
| **Very small heap target.** |
| |
| In this scenario, we set a fairly small target (around 64 MiB) as a baseline. |
| This target is much smaller than what `GOGC` would have picked, since the live |
| heap grows to around 5 GiB. |
| |
| ![](44309/very-low-heap-target.png) |
| |
| Notes: |
| - `GOGC` takes over very quickly. |
| |
| **Large heap target.** |
| |
| In this scenario, we set a fairly large target (around 2 GiB). |
| This target is fairly far from what `GOGC` would have picked. |
| Mid-way through the scenario, the live heap grows a lot. |
| |
| ![](44309/high-heap-target.png) |
| |
| Notes: |
| - There's a medium amount of overshoot when the live heap size changes, which is |
| expected. |
| - The pacer is otherwise resilient to changes in the live heap size. |
| |
| **Exceed heap target.** |
| |
| In this scenario, we set a fairly small target (around 64 MiB) as a baseline. |
| This target is fairly close to what `GOGC` would have picked. |
| Mid-way through the scenario, the live heap grows enough such that we exit the |
| memory target regime and enter the `GOGC` regime. |
| |
| ![](44309/exceed-heap-target.png) |
| |
| Notes: |
| - There's a small amount of overshoot when the live heap size changes, which is |
| expected. |
| - The pacer is otherwise resilient to changes in the live heap size. |
| - The pacer smoothly transitions between regimes. |
| |
| **Exceed heap target with a high GOGC.** |
| |
| In this scenario, we set a fairly small target (around 64 MiB) as a baseline. |
| This target is fairly close to what `GOGC` would have picked. |
| Mid-way through the scenario, the live heap grows enough such that we exit the |
| memory target regime and enter the `GOGC` regime. |
| The `GOGC` value is set very high. |
| |
| ![](44309/exceed-heap-target-high-GOGC.png) |
| |
| Notes: |
| - There's a small amount of overshoot when the live heap size changes, which is |
| expected. |
| - The pacer is otherwise resilient to changes in the live heap size. |
| - The pacer smoothly transitions between regimes. |
| |
| **Change in heap target.** |
| |
| In this scenario, the heap target is set mid-way through execution, to around |
| 256 MiB. |
| This target is fairly far from what `GOGC` would have picked. |
| The live heap stays constant, meanwhile. |
| |
| ![](44309/step-heap-target.png) |
| |
| Notes: |
| - The pacer is otherwise resilient to changes in the heap target. |
| - There's no overshoot. |
| |
| **Noisy heap target.** |
| |
| In this scenario, the heap target is set once per GC and is somewhat noisy. |
| It swings at most 3% around 2 GiB. |
| This target is fairly far from what `GOGC` would have picked. |
| Mid-way through, the live heap increases. |
| |
| ![](44309/low-noise-heap-target.png) |
| |
| Notes: |
| - The pacer is otherwise resilient to a noisy heap target. |
| - There's expected overshoot when the live heap size changes. |
| - GC CPU utilization bounces around slightly. |
| |
| **Very noisy heap target.** |
| |
| In this scenario, the heap target is set once per GC and is very noisy. |
| It swings at most 50% around 2 GiB. |
| This target is fairly far from what `GOGC` would have picked. |
| Mid-way through, the live heap increases. |
| |
| ![](44309/high-noise-heap-target.png) |
| |
| Notes: |
| - The pacer is otherwise resilient to a noisy heap target. |
| - There's expected overshoot when the live heap size changes. |
| - GC CPU utilization bounces around, but not much. |
| |
| **Large heap target with a change in allocation rate.** |
| |
| In this scenario, we set a fairly large target (around 2 GiB). |
| This target is fairly far from what `GOGC` would have picked. |
| Mid-way through the simulation, the application begins to suddenly allocate much |
| more aggressively. |
| |
| ![](44309/heavy-step-alloc-high-heap-target.png) |
| |
| Notes: |
| - The pacer is otherwise resilient to changes in the live heap size. |
| - There's no overshoot. |
| - There's a spike in utilization that's consistent with other simulations of the |
| GC pacer. |
| - The live heap grows due to floating garbage from the high allocation rate |
| causing each GC cycle to start earlier. |
| |
| ### Interactions with other GC mechanisms |
| |
| Although listed already in the API documentation, there are a few additional |
| details I want to consider. |
| |
| #### GOGC |
| |
| The design of the new pacer means that switching between the "memory target" |
| regime and the `GOGC` regime (the regimes being defined as the mechanism that |
| determines the heap goal) is very smooth. |
| While the live heap times `1+GOGC/100` is less than the heap goal set by the |
| memory target, we are in the memory target regime. |
| Otherwise, we are in the `GOGC` regime. |
| Notice that as `GOGC` rises to higher and higher values, the range of the memory |
| target regime shrinks. |
| At infinity, meaning `GOGC=off`, the memory target regime no longer exists. |
| |
| Therefore, it's very clear to me that the memory target should be completely |
| ignored if `GOGC` is set to "off" or a negative value. |
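
The regime rule described here can be sketched as a small pure function. This is a simplified model, not the pacer's implementation: `heapGoal` is a hypothetical name, the live heap stands in for everything the redesigned pacer counts (stacks and globals are omitted), and returning the maximum `uint64` for `GOGC=off` is an assumption used to model "no heap goal".

```go
package main

import (
	"fmt"
	"math"
)

// heapGoal picks between the goal derived from the memory target and the
// GOGC-based goal. A negative gogc models GOGC=off, in which case the
// memory target is ignored entirely.
func heapGoal(liveHeap, targetGoal uint64, gogc int) (goal uint64, regime string) {
	if gogc < 0 {
		return math.MaxUint64, "off"
	}
	// Live heap times (1 + GOGC/100), in integer arithmetic.
	gogcGoal := liveHeap + liveHeap*uint64(gogc)/100
	if gogcGoal < targetGoal {
		return targetGoal, "memory target"
	}
	return gogcGoal, "GOGC"
}

func main() {
	const MiB = 1 << 20
	for _, live := range []uint64{10 * MiB, 100 * MiB} {
		goal, regime := heapGoal(live, 64*MiB, 100)
		fmt.Printf("live=%d MiB goal=%d MiB regime=%s\n", live/MiB, goal/MiB, regime)
	}
}
```

With a 64 MiB target and `GOGC=100`, a 10 MiB live heap stays in the memory target regime, while a 100 MiB live heap pushes the `GOGC`-based goal to 200 MiB and the `GOGC` regime takes over.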
| |
| #### Memory limit |
| |
| If we choose to also adopt an API for setting a memory limit in the runtime, it |
| would necessarily always need to override a memory target, though both could |
| plausibly be active simultaneously. |
| If that memory limit interacts with `GOGC` being set to "off," then the rule of |
| the memory target being ignored holds; the memory limit effectively acts like a |
| target in that circumstance. |
| If the two are set to an equal value, that behavior is virtually identical to |
| `GOGC` being set to "off" and *only* a memory limit being set. |
| Therefore, we need only check that these two cases behave identically. |
| Note, however, that the memory target and the memory limit otherwise define |
| different regimes, so they're orthogonal. |
| While there's a fairly large gap between the two (relative to `GOGC`), the two |
| are easy to separate. |
| Where it gets tricky is when they're relatively close, and this case would need |
| to be tested extensively. |
| |
| ## Risks |
| |
| The primary risk with this proposal is adding another "knob" to the garbage |
| collector, with `GOGC` famously being the only one. |
| Lots of language runtimes provide flags and options that alter the behavior of |
| the garbage collector, but when the number of flags gets large, maintaining |
| every possible configuration becomes a daunting, if not impossible, task because |
| the space of possible configurations explodes with each knob. |
| |
| This risk is a strong reason to be judicious. |
| The bar for new knobs is high. |
| |
| But there are a few good reasons why this might still be important. |
| The truth is, this API already exists, but is considered unsupported and is |
| otherwise unmaintained. |
| The API exists in the form of heap ballasts, a concept we can thank Hyrum's Law |
| for. |
| It's already possible for an application to "fool" the garbage collector into |
| thinking there's more live memory than there actually is. |
| The downside is that resizing the ballast is never going to be nearly as |
| reactive as the garbage collector itself, because it is at the mercy of the |
| runtime managing the user application. |
| The simple fact is performance-sensitive Go users are going to write this code |
| anyway. |
| It is worth noting that unlike a memory maximum, for instance, a memory target |
| is purely an optimization. |
| On the whole, I suspect it's better for the Go ecosystem for there to be a |
| single solution to this problem in the standard library, rather than solutions |
| that *by construction* will never be as good. |
| |
| And I believe we can mitigate some of the risks with "knob explosion." |
| The memory target, as defined above, has very carefully specified and limited |
| interactions with other (potential) GC knobs. |
| Going forward I believe a good criterion for the addition of new knobs should be |
| that a knob should only be added if it is *only* fully orthogonal with `GOGC`, |
| and nothing else. |
| |
| ## Monitoring |
| |
| I propose adding a new metric to the `runtime/metrics` package to enable |
| monitoring of the memory target, since that is a new value that could change at |
| runtime. |
| I propose the metric name `/memory/config/target:bytes` for this purpose. |
| Otherwise, it could be useful for an operator to understand which regime the Go |
| application is operating in at any given time. |
| We currently expose the `/gc/heap/goal:bytes` metric which could theoretically |
| be used to determine this, but because of the dynamic nature of the heap goal in |
| this regime, it won't be clear which regime the application is in at-a-glance. |
| |
| Therefore, I propose adding another metric `/memory/goal:bytes`. |
| This metric is analogous to `/gc/heap/goal:bytes` but is directly comparable |
| with `/memory/config/target:bytes` (that is, it includes additional overheads |
| beyond just what goes into the heap goal, it "converts back"). |
| When this metric "bottoms out" at a flat line, that should serve as a clear |
| indicator that the pacer is in the "target" regime. |
| This same metric could be reused for a memory limit in the future, where it will |
| "top out" at the limit. |
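
A monitoring client could probe for these metrics with the existing `runtime/metrics` API, roughly as sketched below. `readBytesMetric` is a hypothetical helper; `/gc/heap/goal:bytes` exists today, while the two `/memory/...` names are only proposed in this document and are expected to be reported as unsupported by current Go releases.

```go
package main

import (
	"fmt"
	"runtime/metrics"
)

// readBytesMetric samples a single uint64-valued metric. ok is false if the
// running Go version does not support the metric or it is not a uint64.
func readBytesMetric(name string) (value uint64, ok bool) {
	sample := []metrics.Sample{{Name: name}}
	metrics.Read(sample)
	if sample[0].Value.Kind() != metrics.KindUint64 {
		return 0, false
	}
	return sample[0].Value.Uint64(), true
}

func main() {
	names := []string{
		"/gc/heap/goal:bytes",         // exists today
		"/memory/config/target:bytes", // proposed in this document
		"/memory/goal:bytes",          // proposed in this document
	}
	for _, name := range names {
		if v, ok := readBytesMetric(name); ok {
			fmt.Printf("%s = %d\n", name, v)
		} else {
			fmt.Printf("%s is not supported by this Go version\n", name)
		}
	}
}
```

Checking the value's kind before reading it lets the same client run against Go versions with and without the proposed metrics.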
| |
| ## Documentation |
| |
| This API has an inherent complexity as it directly influences the behavior of |
| the Go garbage collector. |
| It also deals with memory accounting, a process that is infamously (and |
| unfortunately) difficult to wrap one's head around and get right. |
| Effective use of this API will come down to having good documentation. |
| |
| The documentation will have two target audiences: software developers, and |
| systems administrators (referred to as "developers" and "operators," |
| respectively). |
| |
| For both audiences, it's incredibly important to understand exactly what's |
| included and excluded in the memory target. |
| That is why it is explicitly broken down in the most visible possible place for |
| a developer: the documentation for the API itself. |
| For the operator, the `runtime/metrics` metric definition should either |
| duplicate this documentation, or point to the API. |
| This documentation is important for immediate use and understanding, but API |
| documentation is never going to be expressive enough. |
| I propose also introducing a new document to the `doc` directory in the Go |
| source tree that explains common use-cases, extreme scenarios, and what to |
| expect in monitoring in these various situations. |
| This document should include a list of known bugs and how they might appear in |
| monitoring. |
| In other words, it should include a more formal and living version of the [GC |
| pacer meta-issue](https://golang.org/issues/42430). |
| The hard truth is that memory accounting and GC behavior are always going to |
| fall short in some cases, and it's immensely useful to be honest and up-front |
| about those cases where they're known, while always striving to do better. |
| Like every other document in this directory, it would be a living document that |
| will grow as new scenarios are discovered, bugs are fixed, and new functionality |
| is made available. |
| |
| ## Alternatives considered |
| |
| Since this is a performance optimization, it's possible to do nothing. |
| But as I mentioned in [the risks section](#risks), I think there's a solid |
| justification for doing *something*. |
| |
| Another alternative I considered was to provide better hooks into the runtime to |
| allow users to implement equivalent functionality themselves. |
| Today, we provide `debug.SetGCPercent` as well as access to a number of runtime |
| statistics. |
| Thanks to work done for the `runtime/metrics` package, that information is now |
| much more efficiently accessible. |
| By exposing just the right metric, one could imagine a background goroutine that |
| calls `debug.SetGCPercent` in response to polling for metrics. |
| The reason why I ended up discarding this alternative, however, is that it |
| forces the user to write code that relies on the implementation details of the |
| garbage collector. |
| For instance, a reasonable implementation of a memory target using the above |
| mechanism would be to make an adjustment each time the heap goal changes. |
| What if future GC implementations don't have a heap goal? Furthermore, the heap |
| goal needs to be sampled; what if GCs are occurring rapidly? Should the runtime |
| expose when a GC ends? What if the new GC design is fully incremental, and there |
| is no well-defined notion of "GC end"? Suffice it to say that in order to keep |
| Go implementations open to new possibilities, we should avoid any behavior that |
| exposes implementation details. |
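
To make the discarded alternative concrete, here is a sketch of the calculation such a user-level controller might perform before calling `debug.SetGCPercent`. The name `gcPercentForTarget` and its parameters are hypothetical, and the polling goroutine itself is omitted; note that the calculation already assumes a live heap measurement and a heap goal, which are exactly the implementation details discussed above.

```go
package main

import "fmt"

// gcPercentForTarget returns the GOGC percentage that would place the
// GOGC-based heap goal at targetGoal for the given live heap, never going
// below the user's baseline GOGC value.
func gcPercentForTarget(liveHeap, targetGoal uint64, baseline int) int {
	if liveHeap == 0 || targetGoal <= liveHeap {
		return baseline
	}
	// Solve liveHeap*(1+percent/100) = targetGoal for percent.
	percent := int((targetGoal - liveHeap) * 100 / liveHeap)
	if percent < baseline {
		return baseline
	}
	return percent
}

func main() {
	const MiB = 1 << 20
	// A 50 MiB live heap needs GOGC=300 to reach a 200 MiB goal.
	fmt.Println(gcPercentForTarget(50*MiB, 200*MiB, 100))
	// A 150 MiB live heap already exceeds the target under GOGC=100.
	fmt.Println(gcPercentForTarget(150*MiB, 200*MiB, 100))
}
```

A controller built this way would have to re-run the calculation whenever the live heap changes, which is where the sampling and "GC end" questions arise.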
| |
| ## Go 1 backwards compatibility |
| |
| This change only adds to the Go standard library's API surface, and is therefore |
| Go 1 backwards compatible. |
| |
| ## Implementation |
| |
| Michael Knyszek will implement this. |
| |
| 1. Implement in the runtime. |
| 1. Extend the pacer simulation test suite with this use-case in a variety of |
| configurations. |