GTX: A Transactional Graph Data System For HTAP Workloads
Libin Zhou, Walid Aref
TL;DR
GTX targets dynamic power-law graphs that must sustain both high update rates and analytics. It is a latch-free, in-memory transactional graph system that combines multi-version concurrency control with delta-based storage. Its key innovations are delta-chains and a delta-chains index for fast, latch-free edge operations, a hybrid commit protocol for low-latency group commits, and a latch-free block consolidation mechanism that manages overflow while preserving visibility. The approach sustains high-throughput read-write transactions and keeps graph analytics competitive under temporal locality and hotspots, as demonstrated on real-world and synthetic datasets. GTX thus advances HTAP-style graph processing by delivering million-transactions-per-second throughput while maintaining graph analytics performance, which makes it well suited to fraud detection, recommendation, and GNN training scenarios where graphs evolve rapidly.
Abstract
Processing, managing, and analyzing dynamic graphs are cornerstones of many application domains, including fraud detection, recommendation systems, and graph neural network training. This demo presents GTX, a latch-free, write-optimized transactional graph data system that supports high-throughput read-write transactions while maintaining competitive graph analytics. GTX combines a latch-free graph storage with a transaction and concurrency control protocol for dynamic power-law graphs. GTX leverages atomic operations to eliminate latches, proposes a delta-based multi-version storage, and designs a hybrid transaction commit protocol to reduce interference between concurrent operations. To further improve throughput, we design a delta-chains index that supports efficient edge lookups. GTX manages concurrency control at the delta-chain level and adapts its concurrency to the workload. Real-world graph accesses and updates exhibit temporal localities and hotspots. Unlike other transactional graph systems, which suffer significant performance degradation under such workloads, GTX is the only system that adapts to temporal localities and hotspots in graph updates and maintains million-transactions-per-second throughput. GTX is prototyped as a graph library and is evaluated with a graph library evaluation tool on real and synthetic datasets.
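
The ideas of delta-based multi-version storage, latch-free updates through atomic operations, and a delta-chains index can be illustrated with a minimal C++ sketch. It is not GTX's implementation: every name here (EdgeDelta, DeltaChain, VertexEdgeBlock, append, lookup) is hypothetical, the index is assumed to be a simple modulo hash on the destination vertex, and deltas are assumed to be appended in commit-timestamp order. Transaction state, the hybrid commit protocol, and block consolidation are omitted.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical types for illustration only; none of these names come from GTX.
enum class DeltaType : std::uint8_t { Insert, Delete };

struct EdgeDelta {
    std::uint64_t dst;        // destination vertex id
    std::uint64_t commit_ts;  // commit timestamp of the writing transaction
    DeltaType type;           // insert (or update) versus delete
    double weight;            // edge payload
    EdgeDelta* prev;          // next-older delta in the same chain
};

// One delta-chain: a singly linked list with the newest delta at the head.
class DeltaChain {
public:
    // Latch-free append: publish the delta with a compare-and-swap loop.
    // Assumption: callers append in commit-timestamp order.
    void append(EdgeDelta* d) {
        assert(d != nullptr);
        EdgeDelta* old_head = head_.load(std::memory_order_acquire);
        do {
            d->prev = old_head;  // link to the current head before publishing
        } while (!head_.compare_exchange_weak(old_head, d,
                                              std::memory_order_acq_rel,
                                              std::memory_order_acquire));
    }

    // Snapshot read: return the newest version of edge dst visible at read_ts.
    std::optional<double> lookup(std::uint64_t dst, std::uint64_t read_ts) const {
        for (const EdgeDelta* d = head_.load(std::memory_order_acquire);
             d != nullptr; d = d->prev) {
            if (d->dst == dst && d->commit_ts <= read_ts) {
                if (d->type == DeltaType::Delete) return std::nullopt;
                return d->weight;
            }
        }
        return std::nullopt;  // no version of the edge is visible
    }

private:
    std::atomic<EdgeDelta*> head_{nullptr};
};

// Per-vertex edge storage with a simple delta-chains index: the destination
// id selects one of N chains, so a lookup scans one chain, not every delta.
template <std::size_t N>
class VertexEdgeBlock {
public:
    void append(EdgeDelta* d) {
        assert(d != nullptr);
        chains_[d->dst % N].append(d);
    }

    std::optional<double> lookup(std::uint64_t dst, std::uint64_t read_ts) const {
        return chains_[dst % N].lookup(dst, read_ts);
    }

private:
    std::array<DeltaChain, N> chains_{};
};
```

In this sketch, writers that touch different chains never contend on the same atomic head pointer, which is one plausible reading of why managing concurrency at the delta-chain level helps under hotspots. Readers never block: they walk the chain and skip deltas newer than their read timestamp. The caller owns the delta memory here; a real system would need a reclamation scheme for old versions.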
