GeoLayer: Towards Low-Latency and Cost-Efficient Geo-Distributed Graph Stores with Layered Graph
Feng Yao, Xiaokang Yang, Shufeng Gong, Song Yu, Yanfeng Zhang, Ge Yu
TL;DR
GeoLayer targets low-latency, cost-efficient geo-distributed graph stores by jointly optimizing replica placement and request routing. It combines a latency-aware layered graph, an overlap-centric placement strategy, and a stepwise routing approach. The Directed Heat Diffusion (DHD) model captures pattern-driven heat propagation to guide region-level replication, while a layer-wise sinking and clustering scheme reduces optimization complexity and mitigates WAN heterogeneity. Experiments on multiple datasets and workloads show speedups of up to 3.67x for online graph pattern requests and up to 3.56x for offline graph analysis, along with notable WAN-cost reductions and near-optimal results on small instances. The work also provides theoretical guarantees for the DHD model’s convergence, covers pre-caching benefits, and offers practical implementation insights aimed at real-world geo-distributed graph analytics.
Abstract
The inherent connectivity and dependency of graph-structured data, combined with its unique topology-driven access patterns, pose fundamental challenges to conventional data replication and request routing strategies in geo-distributed cloud storage systems. In this paper, we propose GeoLayer, a geo-distributed graph storage framework that jointly optimizes graph replica placement and pattern request routing. We first construct a latency-aware layered graph architecture that decomposes the graph topology into multiple layers, aiming to reduce the decision space and computational complexity of the optimization problem, while mitigating the impact of network heterogeneity in geo-distributed environments. Building on the layered graph, we introduce an overlap-centric replica placement scheme to accommodate the diversity of graph pattern accesses, along with a directed heat diffusion model that captures heat conduction and superposition effects to guide data allocation. For request routing, we develop a stepwise layered routing strategy that performs progressive expansion over the layered graph to efficiently retrieve the required data. Experimental results show that, compared to state-of-the-art replica placement and routing schemes, GeoLayer achieves a 1.34x to 3.67x improvement in response times for online graph pattern requests and a 1.28x to 3.56x speedup in offline graph analysis performance.
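As a rough illustration of the heat conduction and superposition idea mentioned above, the following toy Python sketch spreads access "heat" along directed edges and sums the contributions that reach each vertex. It is not GeoLayer's actual DHD model: the function name diffuse_heat, the decay factor, the equal split of heat across out-edges, and the fixed hop limit are all assumptions made purely for illustration.

```python
from collections import defaultdict


def diffuse_heat(out_edges, source_heat, decay=0.5, hops=2):
    """Toy directed heat diffusion (illustrative assumption, not the paper's model).

    out_edges:   dict mapping a vertex to the list of vertices it points to.
    source_heat: dict mapping a vertex to its initial heat (e.g. access count).
    decay:       fraction of a vertex's heat passed on at each hop (assumed).
    hops:        number of propagation steps (assumed).
    """
    total = defaultdict(float)
    for vertex, heat in source_heat.items():
        total[vertex] += heat

    frontier = dict(source_heat)
    for _ in range(hops):
        incoming = defaultdict(float)
        for vertex, heat in frontier.items():
            targets = out_edges.get(vertex, [])
            if not targets:
                continue  # heat only flows along edge direction
            share = decay * heat / len(targets)
            for target in targets:
                # Superposition: contributions from several sources add up.
                incoming[target] += share
        for vertex, heat in incoming.items():
            total[vertex] += heat
        frontier = incoming
    return dict(total)


# Hypothetical three-vertex graph: a -> b, a -> c, b -> c.
example_edges = {"a": ["b", "c"], "b": ["c"], "c": []}
example_heat = diffuse_heat(example_edges, {"a": 4.0, "b": 2.0})
print(example_heat)
```

In this sketch, vertex c never receives requests directly but still accumulates heat from both a and b, which is the kind of signal a placement scheme could use when deciding which vertices to replicate together.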
