Authors: Jerrin Shaji George, Mohit Verma, Rajesh Venkatasubramanian, Pratap Subrahmanyam
Last updated: January 20, 2021
Discussion at https://golang.org/issue/43810.
Persistent memory is a new memory technology that allows byte-addressability at DRAM-like access speed and provides disk-like persistence. Operating systems such as Linux and Windows Server already support persistent memory, and the hardware is available commercially in servers. More details on this technology can be found at pmem.io.
This is a proposal to add native support for programming persistent memory in Go. A detailed design of our approach is described in our 2020 USENIX ATC paper, go-pmem. An implementation of this design, based on the Go 1.15 release, is available here.
Persistent memory is a new type of random-access memory that offers persistence and byte-level addressability at DRAM-like access speed. Operating systems provide the capability to mmap this memory to an application's virtual address space. Applications can then use this mmap'd region just like memory. Durable data updates made to persistent memory can be retrieved by an application even after a crash/restart.
Applications using persistent memory benefit in a number of ways. Since durable data updates made to persistent memory are non-volatile, applications no longer need to marshal data between DRAM and storage devices. A significant portion of application code that used to do this heavy lifting can now be retired. Another big advantage is a significant reduction in application startup times on restart, because applications no longer need to transform their at-rest data into an in-memory representation. For example, commercial applications like SAP HANA report a 12x improvement in startup times using persistent memory.
This proposal is to provide first-class native support for persistent memory in Go. Our design modifies Go 1.15 to introduce a garbage-collected persistent heap. We also instrument the Go compiler to introduce semantics that enable transactional updates to persistent-memory data structures. We call our modified Go suite go-pmem. A Redis database developed using go-pmem offers more than 5x the throughput of Redis running on an NVMe SSD.
We propose adding native support for programming persistent memory in Go. This requires making the following features available in Go:
To support these features, we extended the Go runtime and added a new SSA pass in our implementation as discussed below.
There exist libraries such as Intel PMDK that provide C and C++ developers with support for persistent memory programming. Other programming languages such as Java and Python are exploring ways to enable efficient access to persistent memory. E.g.,
But no language provides native persistent memory programming support. We believe this is an impediment to widespread adoption of this technology. This proposal attempts to remedy this problem by making Go the first language to completely support persistent memory.
The C libraries expose a programming model that is significantly different from (and more complex than) existing programming models. In particular, memory management becomes difficult with libraries. A missed “free” call can lead to memory leaks, and persistent memory leaks are permanent: they do not vanish after application restarts. In a language with a managed runtime such as Go, giving the garbage collector visibility into a memory region managed by a library becomes very difficult. Identifying and instrumenting stores to persistent memory data to provide transactional semantics also requires programming language changes. In our implementation experience, the Go runtime and compiler were readily amenable to adding these capabilities.
Our current changes preserve the Go 1.x future compatibility promise. They do not break compatibility for programs that do not use any persistent memory features exposed by go-pmem.
Having said that, we acknowledge a few downsides with our current design:
We store memory allocator metadata in persistent memory. When a program restarts, we use this metadata to recreate the program state of the memory allocator and garbage collector. As with any persistent data, we need to maintain the data layout of this metadata. Any change to the data structure layout of the Go memory allocator can break backward compatibility with our persistent metadata. This can be fixed by developing an offline tool that performs the data format conversion, or by embedding this capability in go-pmem.
We currently add three new Go keywords: pnew, pmake, and txn. pnew and pmake are persistent memory allocation APIs, and txn is used to demarcate transactional updates to data structures. We have explored a few ways to avoid making these language changes, as described below.
a) pnew/pmake
The availability of generics support in a future version of Go can help us avoid introducing these memory allocation functions. They can instead be functions exported by a Go package.
    func Pnew[T any](_ T) *T {
    	ptr := runtime.pnew(T)
    	return ptr
    }

    func Pmake[T any](_ T, len, cap int) []T {
    	slc := runtime.pmake([]T, len, cap)
    	return slc
    }
runtime.pnew and runtime.pmake would be special functions that can take a type as an argument. They would behave very similarly to the new() and make() APIs, but allocate objects in the persistent memory heap.
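As a rough illustration of the package-level API shape, the sketch below compiles with standard Go generics (Go 1.18 or later). Because runtime.pnew and runtime.pmake do not exist in standard Go, the built-in new and make stand in for them, so the objects land in the ordinary volatile heap. The signatures that take only a type parameter (no dummy value argument) are an assumption of this sketch, not part of the proposal.

```go
package main

import "fmt"

// Pnew mirrors the proposed package-level allocator. In go-pmem the body
// would call into the persistent allocator; the built-in new is used here
// purely as a volatile stand-in (assumption for illustration).
func Pnew[T any]() *T {
	return new(T)
}

// Pmake mirrors the proposed slice allocator, again backed by the
// built-in make as a volatile stand-in.
func Pmake[T any](length, capacity int) []T {
	return make([]T, length, capacity)
}

// entry is a sample user type, borrowed from the linked list example.
type entry struct {
	id   int
	next *entry
}

func main() {
	e := Pnew[entry]()
	e.id = 7
	ids := Pmake[int](2, 8)
	fmt.Println(e.id, len(ids), cap(ids))
}
```

With this shape, callers name the type explicitly at the call site, which keeps the API close to how new and make read today.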
b) txn
An alternative approach would be to define a new Go pragma that identifies a transactional block of code. It could have the following syntax:
    //go:transactional
    {
    	// transactional data updates
    }
Another alternative approach can be to use closures with the help of a few runtime and compiler changes. For example, something like this can work:
runtime.Txn() foo()
Internally, this would be similar to how the Go compiler instruments stores when the -race or -msan flag is passed while compiling. In this case, writes inside function foo() will be instrumented and foo() will be executed transactionally.
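To make the closure-based idea concrete, here is a minimal sketch in standard Go. Since the stock compiler does not instrument stores, the sketch logs them explicitly through a helper; the names Tx, Set, and Txn are hypothetical and do not come from go-pmem, and a panic is used to stand in for a failure in the middle of a transaction.

```go
package main

import "fmt"

// Tx is a hypothetical transaction handle; it is not part of go-pmem.
type Tx struct {
	undo []func() // restore actions, run in reverse order on abort
}

// Set records the old value of *p before overwriting it. With compiler
// support, this logging would be inserted automatically for each store.
func Set[T any](tx *Tx, p *T, v T) {
	old := *p
	tx.undo = append(tx.undo, func() { *p = old })
	*p = v
}

// Txn runs fn transactionally: if fn panics, every logged store is
// undone and ok is reported as false.
func Txn(fn func(tx *Tx)) (ok bool) {
	tx := &Tx{}
	defer func() {
		if r := recover(); r != nil {
			for i := len(tx.undo) - 1; i >= 0; i-- {
				tx.undo[i]()
			}
			ok = false
		}
	}()
	fn(tx)
	return true
}

func main() {
	balance := 100
	ok := Txn(func(tx *Tx) {
		Set(tx, &balance, 50)
		panic("simulated failure mid-transaction")
	})
	fmt.Println(ok, balance) // the store to balance has been rolled back
}
```

The explicit Set call is what a compiler pass would make unnecessary: user code could keep plain assignments while the instrumentation does the logging.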
See this playground code for a complete code listing with our proposed alternatives.
Our implementation is based on a fork of the Go 1.15 source code. It adds three new keywords to Go: pnew, pmake, and txn. pnew and pmake are persistent memory allocation APIs, and txn is used to demarcate a block of transactional data updates to persistent memory.
func pnew(Type) *Type
Just like new, pnew creates a zero-value object of the Type argument in persistent memory and returns a pointer to this object.
func pmake(t Type, size ...IntType) Type
The pmake API is used to create a slice in persistent memory. The semantics of pmake are exactly the same as those of make in Go. We don't yet support creating maps and channels in persistent memory.
    txn() {
    	// transaction data updates
    }
Our code changes to Go can be broken down into two parts - runtime changes and compiler-SSA changes.
We extend the Go runtime to support persistent memory allocations. The garbage collector now works across both the persistent and volatile heaps. The mspan data structure has one additional data member, memtype, to distinguish between persistent and volatile spans. We also extend various memory allocator data structures in mcache, mcentral, and mheap to store metadata related to persistent memory and volatile memory separately. The garbage collector now understands these different span types and puts garbage-collected spans back in the appropriate data structures depending on their memtype.
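The following toy model sketches how a sweep step might route freed spans by memory type. It is not the runtime's code: the span and heap types, the constant names, and the marked set are all simplifications assumed for illustration; the proposal itself only states that mspan gains a memtype field and that the allocator keeps separate bookkeeping per memory type.

```go
package main

import "fmt"

// memType tags a span as volatile or persistent (names are assumed).
type memType int

const (
	volatileMem memType = iota
	persistentMem
)

// span is a toy stand-in for the runtime's mspan.
type span struct {
	id      int
	memtype memType
	inUse   bool
}

// heap keeps one free list per memory type.
type heap struct {
	free [2][]*span
}

// release returns an unused span to the free list matching its memtype.
func (h *heap) release(s *span) {
	s.inUse = false
	h.free[s.memtype] = append(h.free[s.memtype], s)
}

// sweep releases every in-use span that the mark phase left unmarked.
func (h *heap) sweep(spans []*span, marked map[int]bool) {
	for _, s := range spans {
		if s.inUse && !marked[s.id] {
			h.release(s)
		}
	}
}

func main() {
	spans := []*span{
		{id: 1, memtype: volatileMem, inUse: true},
		{id: 2, memtype: persistentMem, inUse: true},
		{id: 3, memtype: persistentMem, inUse: true},
	}
	h := &heap{}
	h.sweep(spans, map[int]bool{3: true}) // only span 3 is still reachable
	fmt.Println(len(h.free[volatileMem]), len(h.free[persistentMem]))
}
```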
Persistent memory is managed in arenas that are a multiple of 64MB. Each persistent memory arena has in its header section certain metadata that facilitates heap recovery in case of application crash or restart. Two kinds of metadata are stored:
We added the following APIs in the runtime package to manage persistent memory:
1. func PmemInit(fname string) (unsafe.Pointer, error)
Used to initialize persistent memory. It takes the path to a persistent memory file as input and returns the application root pointer and an error value.
2. func SetRoot(addr unsafe.Pointer) (err error)
Used to set the application root pointer. All application data in persistent memory hangs off this root pointer.
3. func GetRoot() (addr unsafe.Pointer)
Returns the root pointer set using SetRoot().
4. func InPmem(addr unsafe.Pointer) bool
Returns whether addr points to data in persistent memory.
5. func PersistRange(addr unsafe.Pointer, len uintptr)
Flushes all the cache lines covering the len bytes starting at addr, ensuring that any data updates to this memory range are persistently stored.
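Because flushing works at cache-line granularity, a range flush has to cover every line that the byte range touches, including partially covered lines at either end. The sketch below shows only that address arithmetic; it performs no actual flush. The 64-byte line size is an assumption (typical for x86-64), and cacheLines is a hypothetical helper, not a go-pmem function.

```go
package main

import "fmt"

// cacheLineSize is assumed to be 64 bytes; the proposal does not state it.
const cacheLineSize = 64

// cacheLines returns the aligned address of the first cache line touched
// by the length bytes starting at addr, and the number of lines that a
// flush of that range would need to cover.
func cacheLines(addr, length uintptr) (first, n uintptr) {
	first = addr &^ (cacheLineSize - 1)
	if length == 0 {
		return first, 0
	}
	last := (addr + length - 1) &^ (cacheLineSize - 1)
	return first, (last-first)/cacheLineSize + 1
}

func main() {
	// Bytes 100..159 straddle the lines starting at 64 and 128.
	first, n := cacheLines(100, 60)
	fmt.Println(first, n)
}
```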
We change the parser to recognize three new language tokens: pnew, pmake, and txn.
We add a new SSA pass to instrument all stores to persistent memory. Because data in persistent memory survives crashes, updates to data in persistent memory have to be transactional.
The Go AST and SSA were modified so that users can now demarcate a block of Go code as transactional by encapsulating it within a txn() block.
The new SSA pass then looks for stores (OpStore/OpMove/OpZero) to persistent memory locations within this txn() block and saves the old data at each location in an undo log. This is done before making the actual memory update.

We have developed two packages that make it easier to use go-pmem to write persistent memory applications.
The pmem package provides a simple Init(fname string) bool API that applications can use to initialize persistent memory. It returns whether this is a first-time initialization. If it is not, any incomplete transactions are reverted as well.
The pmem package also provides named objects, where names can be associated with objects in persistent memory. Users can create and retrieve these objects using string names.
The transaction package provides the implementation of undo logging that is used by go-pmem to enable crash-consistent data updates.
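The undo-logging idea can be sketched with ordinary Go data structures. In this toy model a byte slice stands in for mapped persistent memory and a Go slice stands in for an undo log that would itself live in persistent memory; the region type and its methods are hypothetical and are not the transaction package's API. A crash is simulated by skipping commit and calling rollback, as recovery would on the next initialization.

```go
package main

import "fmt"

// logEntry remembers the bytes that a store is about to overwrite.
type logEntry struct {
	off int
	old []byte
}

// region is a toy model of persistent memory plus its undo log.
type region struct {
	mem []byte
	log []logEntry
}

// store saves the old contents to the undo log before updating mem.
func (r *region) store(off int, data []byte) {
	old := append([]byte(nil), r.mem[off:off+len(data)]...)
	r.log = append(r.log, logEntry{off: off, old: old})
	copy(r.mem[off:], data)
}

// commit ends the transaction; the logged old values are discarded.
func (r *region) commit() { r.log = nil }

// rollback reverts an incomplete transaction by replaying the undo log
// in reverse order, so overlapping stores are undone correctly.
func (r *region) rollback() {
	for i := len(r.log) - 1; i >= 0; i-- {
		e := r.log[i]
		copy(r.mem[e.off:], e.old)
	}
	r.log = nil
}

func main() {
	r := &region{mem: []byte("AAAA")}
	r.store(0, []byte("BB"))
	r.store(1, []byte("CC")) // overlaps the previous store
	// Simulated crash: commit is never reached, so recovery rolls back.
	r.rollback()
	fmt.Println(string(r.mem))
}
```

Replaying in reverse matters here: the second store's log entry captured a byte already changed by the first store, so undoing in forward order would leave stale data behind.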
Below is a simple linked list application written using go-pmem:

    // A simple linked list application. On the first invocation, it creates a
    // persistent memory pointer named "dbRoot" which holds pointers to the first
    // and last element in the linked list. On each run, a new node is added to
    // the linked list.
    package main

    import (
    	"math/rand"

    	"github.com/vmware/go-pmem-transaction/pmem"
    	"github.com/vmware/go-pmem-transaction/transaction"
    )

    const (
    	// Used to identify a successful initialization of the root object
    	magic = 0x1B2E8BFF7BFBD154
    )

    // Structure of each node in the linked list
    type entry struct {
    	id   int
    	next *entry
    }

    // The root object that stores pointers to the elements in the linked list
    type root struct {
    	magic int
    	head  *entry
    	tail  *entry
    }

    // A function that populates the contents of the root object transactionally
    func populateRoot(rptr *root) {
    	txn() {
    		rptr.magic = magic
    		rptr.head = nil
    		rptr.tail = nil
    	}
    }

    // Adds a node to the linked list and updates the tail (and head if empty)
    func addNode(rptr *root) {
    	entry := pnew(entry)
    	txn() {
    		entry.id = rand.Intn(100)
    		if rptr.head == nil {
    			rptr.head = entry
    		} else {
    			rptr.tail.next = entry
    		}
    		rptr.tail = entry
    	}
    }

    func main() {
    	firstInit := pmem.Init("database")
    	var rptr *root
    	if firstInit {
    		// Create a new named object called dbRoot and point it to rptr
    		rptr = (*root)(pmem.New("dbRoot", rptr))
    		populateRoot(rptr)
    	} else {
    		// Retrieve the named object dbRoot
    		rptr = (*root)(pmem.Get("dbRoot", rptr))
    		if rptr.magic != magic {
    			// An object named dbRoot exists, but its initialization did not
    			// complete previously.
    			populateRoot(rptr)
    		}
    	}
    	addNode(rptr) // Add a new node in the linked list
    }