GeoLayer: Towards Low-Latency and Cost-Efficient Geo-Distributed Graph Stores with Layered Graph

Feng Yao, Xiaokang Yang, Shufeng Gong, Song Yu, Yanfeng Zhang, Ge Yu

TL;DR

GeoLayer addresses the challenge of low-latency, cost-efficient geo-distributed graph stores by jointly optimizing replica placement and request routing through a latency-aware layered graph, an overlap-centric placement strategy, and a stepwise routing approach. The Directed Heat Diffusion model captures pattern-driven heat propagation to guide region-level replication, while a layer-wise sinking and clustering scheme reduces optimization complexity and mitigates WAN heterogeneity. Empirical results on multiple datasets and workloads show substantial online speedups (up to 3.7x) and offline analytic gains (up to 3.6x) with notable WAN-cost reductions and near-optimality on small instances. The work also provides theoretical guarantees for the DHD model’s convergence, pre-caching benefits, and practical implementation insights, highlighting meaningful impact for real-world geo-distributed graph analytics.

Abstract

The inherent connectivity and dependency of graph-structured data, combined with its unique topology-driven access patterns, pose fundamental challenges to conventional data replication and request routing strategies in geo-distributed cloud storage systems. In this paper, we propose GeoLayer, a geo-distributed graph storage framework that jointly optimizes graph replica placement and pattern request routing. We first construct a latency-aware layered graph architecture that decomposes the graph topology into multiple layers, aiming to reduce the decision space and computational complexity of the optimization problem, while mitigating the impact of network heterogeneity in geo-distributed environments. Building on the layered graph, we introduce an overlap-centric replica placement scheme to accommodate the diversity of graph pattern accesses, along with a directed heat diffusion model that captures heat conduction and superposition effects to guide data allocation. For request routing, we develop a stepwise layered routing strategy that performs progressive expansion over the layered graph to efficiently retrieve the required data. Experimental results show that, compared to state-of-the-art replica placement and routing schemes, GeoLayer achieves a 1.34x - 3.67x improvement in response times for online graph pattern requests and a 1.28x - 3.56x speedup in offline graph analysis performance.

Paper Structure

This paper contains 22 sections, 3 theorems, 27 equations, 16 figures, 4 tables, 3 algorithms.

Key Result

Theorem 1

If $\alpha < \frac{\gamma}{(1-\gamma)\,\bigl\Vert L_{dir}^*\bigr\Vert}$, then the steady-state equation admits a unique non-trivial solution $\mathcal{H}^* = \beta\,(\gamma\mathbf{1} - X^*)^{-1}Q^*$, where $X^* = \alpha\,(1-\gamma)\,L_{dir}^*$, and $\bigl\Vert L_{dir}^*\bigr\Vert$
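The closed form in Theorem 1 can be checked numerically on a toy instance. The sketch below is illustrative only and is not the paper's implementation. It assumes that $\gamma\mathbf{1}$ denotes $\gamma$ times the identity matrix, that $0 < \gamma < 1$, and that the norm is the infinity norm (maximum absolute row sum); the summary does not state which norm the paper uses. The function names `inf_norm` and `steady_state`, and the 2-vertex matrix `L` and source vector `Q`, are hypothetical. Under these assumptions the condition on $\alpha$ gives $\Vert X^* \Vert < \gamma$, so the iteration $H \leftarrow (X^* H + \beta Q^*)/\gamma$ is a contraction and converges to the unique solution of $(\gamma\mathbf{1} - X^*)\,H = \beta Q^*$.

```python
def inf_norm(M):
    """Infinity norm of a square matrix: maximum absolute row sum (assumed norm)."""
    return max(sum(abs(x) for x in row) for row in M)


def steady_state(L, Q, alpha, beta, gamma, tol=1e-12, max_iter=10_000):
    """Solve (gamma*I - X) H = beta*Q with X = alpha*(1-gamma)*L by fixed-point iteration.

    L is a dense list-of-lists stand-in for L_dir^*, Q a list stand-in for Q^*.
    """
    if not 0.0 < gamma < 1.0:
        raise ValueError("this sketch assumes 0 < gamma < 1")
    norm_L = inf_norm(L)
    if norm_L == 0.0:
        raise ValueError("L must be non-zero for the bound to be defined")
    # Condition of Theorem 1: alpha < gamma / ((1 - gamma) * ||L||).
    if not alpha < gamma / ((1.0 - gamma) * norm_L):
        raise ValueError("alpha violates the convergence condition")

    n = len(L)
    scale = alpha * (1.0 - gamma)  # X = scale * L
    H = [0.0] * n
    for _ in range(max_iter):
        # One step of H <- (X H + beta Q) / gamma.
        H_next = [
            (scale * sum(L[i][j] * H[j] for j in range(n)) + beta * Q[i]) / gamma
            for i in range(n)
        ]
        if max(abs(a - b) for a, b in zip(H_next, H)) < tol:
            return H_next
        H = H_next
    raise RuntimeError("fixed-point iteration did not converge")


# Toy instance: two vertices, vertex 1 receives heat from vertex 0,
# and only vertex 0 has an external heat source.
L = [[0.0, 0.0],
     [1.0, 0.0]]
Q = [1.0, 0.0]
H = steady_state(L, Q, alpha=0.5, beta=1.0, gamma=0.5)
print(H)  # [2.0, 1.0]
```

On this instance $X^* = 0.25\,L$, so the linear system reads $0.5\,H_0 = 1$ and $0.5\,H_1 - 0.25\,H_0 = 0$, which gives $H_0 = 2$ and $H_1 = 1$, matching the iteration. For larger graphs a sparse linear solver would be the more natural choice; the plain iteration is used here only to keep the sketch dependency-free.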

Figures (16)

  • Figure 1: Geo-distributed graph data storage services.
  • Figure 2: An example of constructing a layered graph based on latency levels.
  • Figure 3: The conduction and superposition effects of access frequency along paths in pattern access. The color intensity represents the normalized access frequency of vertices, with darker colors indicating higher frequency.
  • Figure 4: Bridge subgraphs in cluster engage in overlap-centric diffusion to compete for regional placement.
  • Figure 5: Request routing under the online query mode.
  • ...and 11 more figures

Theorems & Definitions (5)

  • Definition 1: Bridge Graph
  • Definition 2: Bridge Subgraph
  • Theorem 1: Non-trivial steady-state existence
  • Lemma 1: Edge-Visit Probability Lower Bounds
  • Theorem 2: Pattern-Visit Guarantee