The authors seek to provide a user-level library which can be used to provide transactional guarantees in main memory. Some intended use cases are to make transactional updates to metadata stored in main memory for applications such as databases, distributed file systems etc.
They are motivited to do this in order to improve the performance of a distributed filesystem they maintain, Coda, which saw performance degradation due to a reliance on another recoverable virtual memory subsystem, Camelot.
The authors describe the implementation of RVM, a user-level library which provides guarantees of atomicity and permanence in the face of process failure. It leaves the disk subsystem responsible for data permanence (through mirroring etc.), and higher level systems for nesting, distibution, and serializability of transactions. In this sense, RVM seeks to do one thing well.
Following the Unix design philosophy, RVM provides a clean, simple interface for interaction, and it is quite easy to understand what it’s doing internally and how it can fail.
RVM makes use of segments, which are the unit of persistence on a raw disk partition or backing filesystem (best used with a log structured file system). Each segment can consist of multiple regions which a process can map into its virtual address space. The disk region corresponding to the virtual addresses are separate from the segments RVM maintains.
Following the design of write ahead logging in database systems, RVM supports transactional logic by allowing processes to begin a transaction in a mapped region, execute any number of updates, and then end the transaction. RVM ensures that the log will be persisted to disk and in case of a crash, the log can be used to redo operations on data stored in the external segment.
By providing this simple interface and easily understandable architecture, RVM easily enables other higher level application logic to be built such as support for nested transactions or distributed transactions involving two phase commit. Moreover, it is portable to a range of different operating systems, and poses little cognitive burden for application developers.
Due to RVM being linked as a library to applications, all processes using RVM have a separate logging infrastructure. In a system where multiple of these processes exist and the underlying filesystem is not log structured, this can lead to degradation of performance due to higher disk latencies. Instead, support could be added for a central log for processes using RVM in some file or disk partition.
This paper follows the Unix tradition of simplicity and clean interfaces, and in that respect alone is very impactful. It stands in stark contrast to the complex, monolithic HP AutoRAID system, and instead speaks to the power of simple, understandable code in systems design. Whether or not RVM is now widely used, we should always strive to design systems with the same principles used in the RVM paper.