LSMGraph: A High-Performance Dynamic Graph Storage System with Multi-Level CSR

Song Yu, Shufeng Gong, Qian Tao, Sijie Shen, Yanfeng Zhang, Wenyuan Yu, Pengxi Liu, Zhixin Zhang, Hongfu Li, Xiaojian Luo, Ge Yu, Jingren Zhou

TL;DR

The proposed LSMGraph is a novel dynamic graph storage system that combines the write-friendly LSM-tree and the read-friendly CSR, and significantly outperforms state-of-the-art (graph) storage systems on both graph update and graph analytical workloads.

Abstract

The growing volume of graph data may exhaust the main memory. It is crucial to design a disk-based graph storage system to ingest updates and analyze graphs efficiently. However, existing dynamic graph storage systems suffer from read or write amplification and face the challenge of optimizing both read and write performance simultaneously. To address this challenge, we propose LSMGraph, a novel dynamic graph storage system that combines the write-friendly LSM-tree and the read-friendly CSR. It leverages the multi-level structure of LSM-trees to optimize write performance while utilizing the compact CSR structures embedded in the LSM-trees to boost read performance. LSMGraph uses a new memory structure, MemGraph, to efficiently cache graph updates and uses a multi-level index to speed up reads within the multi-level structure. Furthermore, LSMGraph incorporates a vertex-grained version control mechanism to mitigate the impact of LSM-tree compaction on read performance and ensure the correctness of concurrent read and write operations. Our evaluation shows that LSMGraph significantly outperforms state-of-the-art (graph) storage systems on both graph update and graph analytical workloads.
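To make the "read-friendly CSR" half of the design concrete, the following is a minimal sketch of a textbook Compressed Sparse Row layout. It is not LSMGraph's on-disk format, which the abstract does not specify. The function names `build_csr` and `neighbors` and the sample edge list are assumptions made for illustration only.

```python
from typing import List, Tuple


def build_csr(num_vertices: int, edges: List[Tuple[int, int]]):
    """Build generic CSR arrays (offsets, targets) from a directed edge list.

    Illustrative only: this is the textbook layout, not LSMGraph's format.
    """
    # offsets[v + 1] first holds the out-degree of v ...
    offsets = [0] * (num_vertices + 1)
    for src, _ in edges:
        offsets[src + 1] += 1
    # ... then a prefix sum turns degrees into start positions.
    for v in range(num_vertices):
        offsets[v + 1] += offsets[v]

    # Place each destination in the next free slot of its source's range.
    targets = [0] * len(edges)
    cursor = offsets[:-1].copy()
    for src, dst in edges:
        targets[cursor[src]] = dst
        cursor[src] += 1
    return offsets, targets


def neighbors(offsets: List[int], targets: List[int], v: int) -> List[int]:
    """Return the out-neighbors of v as one contiguous slice."""
    return targets[offsets[v]:offsets[v + 1]]


# Hypothetical 4-vertex example graph.
edges = [(0, 1), (0, 2), (1, 2), (2, 0), (2, 3)]
offsets, targets = build_csr(4, edges)
```

Because every vertex's neighbors sit in one contiguous range, a scan is a single sequential read, which is why CSR favors reads. Inserting an edge into the middle of `targets` would shift everything after it, which is the write cost that the abstract says the LSM-tree's multi-level structure is meant to absorb.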

Paper Structure

This paper contains 23 sections, 18 figures, and 3 tables.

Figures (18)

  • Figure 1: An example of a graph storage system working in an e-commerce platform, where users or items represent vertices and the interactions between users and items are regarded as edges of the graph.
  • Figure 2: An example graph and its CSR.
  • Figure 3: A classic implementation of LSM-tree.
  • Figure 4: Overall architecture of LSMGraph.
  • Figure 5: An example of MemGraph.
  • ...and 13 more figures