DGAP: Efficient Dynamic Graph Analysis on Persistent Memory
Abdullah Al Raqibul Islam, Dong Dai
TL;DR
DGAP targets persistent, crash-consistent dynamic graph analysis on byte-addressable persistent memory (PM). It adopts a single mutable CSR and augments it with a per-section edge log and per-thread undo logs. It keeps the vertex metadata in DRAM to avoid costly in-place updates on PM and uses a crash-consistent Packed Memory Array (PMA) rebalancing mechanism to keep update and scan costs low as the graph grows. Across real-world graphs, DGAP achieves up to $3.2\times$ faster graph updates and up to $3.77\times$ faster graph analyses than state-of-the-art PM-based frameworks, demonstrating that persistent memory can support both persistent updates and high-performance analytics. The work highlights practical design choices for PM-based graph systems and suggests directions for future improvement, including Copy-on-Write degree caching and distributed PM techniques.
Abstract
Dynamic graphs, featuring continuously updated vertices and edges, have grown in importance for numerous real-world applications. To accommodate this, graph frameworks, particularly their internal data structures, must support both persistent graph updates and rapid graph analysis simultaneously, which leads to complex designs that orchestrate `fast but volatile' and `persistent but slow' storage devices. Emerging persistent memory technologies, such as Optane DCPMM, offer a promising alternative that simplifies these designs by providing data persistence, low latency, and high IOPS together. In light of this, we propose DGAP, a framework for efficient dynamic graph analysis on persistent memory. Unlike traditional dynamic graph frameworks, which combine multiple graph data structures (e.g., edge list or adjacency list) to achieve the required performance, DGAP builds on a single mutable Compressed Sparse Row (CSR) graph structure with new designs for persistent memory. Specifically, DGAP introduces a \textit{per-section edge log} to reduce write amplification on persistent memory; a \textit{per-thread undo log} to enable high-performance, crash-consistent rebalancing operations; and a data placement schema to minimize in-place updates on persistent memory. Our extensive evaluation results demonstrate that DGAP can achieve up to $3.2\times$ better graph update performance and up to $3.77\times$ better graph analysis performance compared to state-of-the-art dynamic graph frameworks for persistent memory, such as XPGraph, LLAMA, and GraphOne.
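
To make the idea of a mutable CSR with a per-section edge log concrete, the following Python sketch models it in ordinary volatile memory. It is an illustrative toy under stated assumptions, not DGAP's actual layout or code: the class name `ToyMutableCSR`, the fixed `slots_per_vertex` gap size, and the `section_size` parameter are all hypothetical, and the sketch omits persistence, undo logging, and rebalancing entirely. It shows only how an insertion either lands in a free slot of the gapped edge array or, when the vertex's region is full, is appended to the log of the section that owns that region instead of shifting neighboring edges.

```python
class ToyMutableCSR:
    """Toy in-memory model: a gapped edge array plus one overflow log per section.

    Hypothetical illustration only; it does not model persistent memory,
    crash consistency, or rebalancing.
    """

    def __init__(self, num_vertices, slots_per_vertex=2, section_size=4):
        self.slots_per_vertex = slots_per_vertex
        self.section_size = section_size
        # start[v] is the first slot of vertex v's region in the edge array.
        self.start = [v * slots_per_vertex for v in range(num_vertices)]
        # degree[v] counts the edges of v stored in the edge array itself.
        self.degree = [0] * num_vertices
        # None marks an empty slot (the gaps that absorb future insertions).
        self.edges = [None] * (num_vertices * slots_per_vertex)
        # section id -> list of (src, dst) pairs that did not fit in the array.
        self.section_log = {}

    def _section_of(self, v):
        # A vertex belongs to the section that contains the start of its region.
        return self.start[v] // self.section_size

    def insert_edge(self, src, dst):
        """Insert src -> dst and report where the edge was stored."""
        if self.degree[src] < self.slots_per_vertex:
            # Free slot available: write in place, no neighbors are moved.
            self.edges[self.start[src] + self.degree[src]] = dst
            self.degree[src] += 1
            return "array"
        # Region is full: append to the section's log instead of shifting edges.
        self.section_log.setdefault(self._section_of(src), []).append((src, dst))
        return "log"

    def neighbors(self, v):
        """Return v's neighbors: in-array edges first, then logged edges."""
        begin = self.start[v]
        in_array = self.edges[begin:begin + self.degree[v]]
        logged = self.section_log.get(self._section_of(v), [])
        return in_array + [d for s, d in logged if s == v]
```

In this toy, a real system would eventually merge a section's log back into the edge array by rebalancing the section; that step, and the undo logging that would make it crash-consistent, are deliberately left out.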
