| # Proposal: Go Heap Dump Viewer |
| |
| Author(s): Michael Matloob |
| |
| Last updated: 20 July 2016 |
| |
| Discussion at https://golang.org/issue/16410 |
| |
| ## Abstract |
| |
| This proposal is for a heap dump viewer for Go programs. It would provide a |
| web-based, graphical viewer as well as packages for analyzing and understanding heap |
| dumps. |
| |
| ## Background |
| |
| Sometimes Go programs use too much memory and the programmer wants to know why. Profiling |
| gives the programmer statistical information about rates of allocation, but doesn't give |
| a specific concrete snapshot that can explain why a variable is live or how many |
| instances of a given type are live. |
| |
| There currently exists a tool written by Keith Randall |
| that takes heap dumps produced by `runtime/debug.WriteHeapDump` and converts them into |
| the hprof format, which can be understood by existing Java heap analysis tools, but there |
| are some issues with the tool in its current state. First, the tool is |
| out of sync with the heap dump format currently produced by Go. In addition, that tool got its type information from |
| data structures maintained by the GC algorithm, but as the GC has advanced, it has been |
| storing less and less type information over time. Because of those issues, we'll have to |
| make major changes to the tool or perhaps rewrite the whole thing. |
| |
| Also, the process of getting a heap analysis on the screen from a running Go program involves |
| multiple tools and dependencies, and is more complicated than it needs to be. There should |
| be a simple and fast "one-click" solution to make it as easy as possible to understand |
| what's happening in a program's heap. |
| |
| ## Proposal |
| |
| TODO(matloob): Some of the details are still fuzzy, but here's the general outline of a solution: |
| |
| We'll use ELF core dumps as the source format for our heap analysis tools. We would build packages that would use the |
| debug information in the DWARF section of the dump to find the roots and reconstruct type |
| information for as much of the program as they can. Implementing this will likely involve improving |
| the DWARF data produced by the compiler. |
| |
| Windows doesn't traditionally use core files, and Darwin uses Mach-O as its core dump format, |
| so we'll have to provide a mechanism for users on those platforms to extract ELF core dumps |
| from their programs. |
| |
| We'd use those packages to build a graphical web-based tool for viewing and analyzing heap dumps. |
| The program would be pointed to a core dump and would serve a graphical web app that could be used |
| to analyze the heap. |
| |
| Ideally, there will be a 'one-click' solution to get from running program to dump. One possible way |
| to do this would be to add a library to expose a special HTTP handler. Requesting the page would |
| trigger a core dump to a user-specified location on disk while the program's running, and start |
| the heap dump viewer program. |
| |
| ## Rationale |
| |
| TODO(matloob): More thorough discussion. |
| |
| The primary rationale for this feature is that users want to understand the memory usage of their programs |
| and we don't currently provide convenient ways of doing that. Adding a heap dump viewer will allow us to |
| do that. |
| |
| ### Heap dump format |
| |
| There are three candidates for the format our tools will consume: the current format output by |
| the Go heap dumper, the hprof format, and the ELF format proposed here. |
| |
| The advantage of using the current format is that we already have tools that produce it and consume it. But the format |
| is non-standard and requires a strong dependence between the heap viewer and the runtime. That's been one |
| of the problems with the current viewer. And the format produced by the runtime has changed slightly in each |
| of the last few Go releases because it's tightly coupled with the Go runtime. |
| |
| The advantage of the hprof format is that there already exist many tools for analyzing hprof dumps. |
| It will be a good idea to consider this format more thoroughly before making a decision. On the |
| other hand many of those tools are neither polished nor easy to use. We can probably build |
| better tools tailored for Go without great effort. |
| |
| The advantage of understanding ELF is that we can use the same tools to look at cores produced when a program |
| OOMs (at least on Linux) as we do to examine heap dumps. Another benefit is that some cluster |
| environments already collect and store core files when programs fail in production. Reusing this |
| machinery would help Go programmers in those environments. And there already exist tools that grab core dumps |
| so we might be able to reduce the amount of code in the runtime for producing dumps. |
| |
| ## Compatibility |
| |
| As long as the compiler can output all the data needed to reconstruct type information for the heap |
| in the DWARF data, we won't need to have a strong dependency on the Go distribution. The code can live in a subrepo |
| not subject to the Go compatibility guarantee. |
| |
| ## Implementation |
| |
| The implementation will broadly consist of three parts: First, support in the compiler and runtime for dumping |
| all the data needed by the viewer; second, 'backend' tools that understand the format; and third, a 'frontend' |
| viewer for those tools. |
| |
| ### Compiler and Runtime Work |
| |
| TODO(matloob): more details |
| |
| The compiler work will mostly consist of filling any holes in the DWARF data that we need to recover type |
| information of data in the heap. |
| |
| If we decide to use ELF cores, we may need runtime support for dumping cores, especially on platforms that |
| don't dump cores in ELF format. |
| |
| ### Heap libraries and viewer |
| |
| We will provide a reusable library that decodes a core file as a Go object graph with partial type information. |
| Users can build their own tools based on this low-level library, but we will also provide a web-based graphical tool for |
| viewing and querying heap graphs. |
| |
| These are some of the types of queries we aim to answer with the heap viewer: |
| * Show a histogram of live variables grouped by type |
| * Which variables account for the most memory? |
| * What is a path from a GC root to this variable? |
| * How much memory would become garbage if this variable were to become unreachable |
| or this pointer were to become nil? |
| * What are the inbound/outbound pointer edges to this node (variable)? |
| * How much memory is used by a variable, considering padding, alignment, and span size? |
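| |
| To make a few of these queries concrete, here is a small sketch over an in-memory object graph. |
| The `heap` and `object` types and all method names are hypothetical and stand in for whatever the |
| decoding library would expose. The sketch covers the per-type histogram, a path from a GC root, |
| and the memory that would become garbage if an object went away. It ignores padding, alignment, |
| and span size, and it recomputes reachability for each query rather than using a faster method. |
| |
```go
package main

import "fmt"

// object is a hypothetical node in the decoded heap graph.
type object struct {
	typ  string // type name; may be unknown in a real dump
	size uint64 // bytes occupied by the object
	ptrs []int  // indices of the objects this one points to
}

// heap is a hypothetical decoded dump: objects plus the indices of GC roots.
type heap struct {
	objs  []object
	roots []int
}

const unvisited = -2

// parents walks breadth-first from the roots, ignoring object skip (pass -1
// to skip nothing). Each entry is the index the object was first reached
// from: -1 for a root, unvisited if it was never reached.
func (h *heap) parents(skip int) []int {
	par := make([]int, len(h.objs))
	for i := range par {
		par[i] = unvisited
	}
	var queue []int
	visit := func(i, from int) {
		if i != skip && par[i] == unvisited {
			par[i] = from
			queue = append(queue, i)
		}
	}
	for _, r := range h.roots {
		visit(r, -1)
	}
	for len(queue) > 0 {
		n := queue[0]
		queue = queue[1:]
		for _, p := range h.objs[n].ptrs {
			visit(p, n)
		}
	}
	return par
}

// histogram returns the total bytes of live (reachable) objects per type.
func (h *heap) histogram() map[string]uint64 {
	hist := make(map[string]uint64)
	for i, p := range h.parents(-1) {
		if p != unvisited {
			hist[h.objs[i].typ] += h.objs[i].size
		}
	}
	return hist
}

// retained returns the bytes that would become garbage if object n went
// away: everything reachable now that is unreachable once n is skipped.
func (h *heap) retained(n int) uint64 {
	before, after := h.parents(-1), h.parents(n)
	var total uint64
	for i := range h.objs {
		if before[i] != unvisited && after[i] == unvisited {
			total += h.objs[i].size
		}
	}
	return total
}

// pathFromRoot returns a shortest chain of object indices from a root to
// target, or nil if target is unreachable.
func (h *heap) pathFromRoot(target int) []int {
	par := h.parents(-1)
	if par[target] == unvisited {
		return nil
	}
	var path []int
	for n := target; n != -1; n = par[n] {
		path = append([]int{n}, path...)
	}
	return path
}

func main() {
	// Root 0 points to 1 and 3, which share buffer 2; object 4 is garbage.
	h := &heap{roots: []int{0}, objs: []object{
		{"main.Server", 64, []int{1, 3}},
		{"main.Conn", 32, []int{2}},
		{"[]uint8", 1024, nil},
		{"main.Conn", 32, []int{2}},
		{"[]uint8", 512, nil},
	}}
	fmt.Println("bytes by type:", h.histogram())
	fmt.Println("path to 2:", h.pathFromRoot(2))
	fmt.Println("retained by 1:", h.retained(1), "retained by 0:", h.retained(0))
}
```
| |
| In the example graph, removing object 1 frees only its own 32 bytes, because the shared buffer is |
| still reachable through object 3. |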
| |
| ## Open issues (if applicable) |
| |
| Most of this proposal is open at this point, including: |
| * the heap dump format |
| * the design and implementation of the backend packages |
| * the tools we use to build the frontend client |