Graph Learning Across Data Silos

Xiang Zhang, Qiao Wang

TL;DR

The paper addresses learning graph topology from smooth signals when data are distributed across privacy-constrained clients (data silos). It proposes an auto-weighted multiple-graph learning framework that jointly learns a personalized graph for each client and a sparse consensus graph, with data-driven contribution weights. A tailored federated-style optimization algorithm is developed, with convergence guarantees and privacy considerations. Experiments on synthetic and real datasets show improved graph recovery and meaningful global structures.

Abstract

We consider the problem of inferring graph topology from smooth graph signals in a novel but practical scenario where data are located in distributed clients and prohibited from leaving local clients due to factors such as privacy concerns. The main difficulty in this task is how to exploit the potentially heterogeneous data of all clients under data silos. To this end, we first propose an auto-weighted multiple graph learning model to jointly learn a personalized graph for each local client and a single consensus graph for all clients. The personalized graphs match local data distributions, thereby mitigating data heterogeneity, while the consensus graph captures the global information. Moreover, the model can automatically assign appropriate contribution weights to local graphs based on their similarity to the consensus graph. We next devise a tailored algorithm to solve the induced problem, where all raw data are processed locally without leaving clients. Theoretically, we establish a provable estimation error bound and convergence analysis for the proposed model and algorithm. Finally, extensive experiments on synthetic and real data are carried out, and the results illustrate that our approach can learn graphs effectively in the target scenario.
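
To make two ideas from the abstract concrete, here is a minimal Python sketch. It measures how smooth a set of signals is on a graph using the standard Laplacian quadratic form $\mathrm{tr}(\mathbf{X}^\top \mathbf{L} \mathbf{X})$, and it assigns larger weights to local graphs that lie closer to a consensus graph. The weighting rule (inverse Frobenius distance, normalised to sum to one) and all function names are assumptions made for illustration; the abstract does not state the paper's actual weighting formula.

```python
import numpy as np


def laplacian(W):
    """Combinatorial Laplacian L = D - W of a symmetric adjacency matrix."""
    W = np.asarray(W, dtype=float)
    return np.diag(W.sum(axis=1)) - W


def smoothness(X, W):
    """Laplacian quadratic form tr(X^T L X).

    X has shape (num_nodes, num_signals); smaller values mean the
    signals vary less across connected nodes.
    """
    X = np.asarray(X, dtype=float)
    return float(np.trace(X.T @ laplacian(W) @ X))


def contribution_weights(local_graphs, consensus, eps=1e-8):
    """Hypothetical weighting rule (not taken from the paper):
    weight each local graph by the inverse of its Frobenius distance
    to the consensus graph, then normalise the weights to sum to one.
    """
    consensus = np.asarray(consensus, dtype=float)
    dists = np.array(
        [np.linalg.norm(np.asarray(W, dtype=float) - consensus)
         for W in local_graphs]
    )
    raw = 1.0 / (dists + eps)  # eps avoids division by zero
    return raw / raw.sum()


# Three-node path graph used as a toy example.
W_path = np.array([[0, 1, 0],
                   [1, 0, 1],
                   [0, 1, 0]], dtype=float)
x_constant = np.ones((3, 1))
x_alternating = np.array([[1.0], [-1.0], [1.0]])
```

On the three-node path, the constant signal has smoothness 0 and the alternating signal has smoothness 8, which matches the intuition that smooth signals change little along edges.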

Paper Structure

This paper contains 23 sections, 6 theorems, 47 equations, 8 figures, 5 tables, and 1 algorithm.

Key Result

Theorem 1

Under Assumptions assumption-0 and assumption-1-1, given $\delta$ and $\nu >0$, let $\lambda$ satisfy where $C_r:=\nu \sqrt{I\omega_{\max}(\mathbf{L}_m)}+ \sqrt{p}$, and $\omega_{\max}(\mathbf{L}_m)$ is the maximum eigenvalue of $\mathbf{L}_m$. Then, we have probability at least $1 -\exp\left( -\frac{1}{2}\left(\delta - pI\log\left(1 + \frac{\delta}{pI} \right)\right)\right)$ such that
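
The probability level quoted in Theorem 1, $1 -\exp\left( -\frac{1}{2}\left(\delta - pI\log\left(1 + \frac{\delta}{pI} \right)\right)\right)$, depends only on $\delta$ and the product $pI$, so it can be evaluated numerically. The following minimal Python sketch does that. The function and argument names are illustrative and not from the paper, and $p$ and $I$ are simply treated as given positive quantities.

```python
import math


def probability_lower_bound(delta, p, num_clients):
    """Evaluate 1 - exp(-0.5 * (delta - p*I*log(1 + delta/(p*I)))).

    delta, p and num_clients (the symbol I in the theorem) are assumed
    to be strictly positive.
    """
    if delta <= 0 or p <= 0 or num_clients <= 0:
        raise ValueError("delta, p and num_clients must be positive")
    k = p * num_clients
    # log1p and expm1 keep precision when delta / k is small.
    exponent = -0.5 * (delta - k * math.log1p(delta / k))
    return -math.expm1(exponent)  # equals 1 - exp(exponent)
```

Because $\log(1+u) < u$ for $u > 0$, the exponent is always negative, so the returned value lies strictly between 0 and 1 and grows as $\delta$ increases for fixed $pI$.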

Figures (8)

  • Figure 1: Illustration of the target scenario. The clients $\mathcal{C}_1,\dots,\mathcal{C}_I$ store graph signals $\mathbf{X}_1,\dots,\mathbf{X}_I$ generated from $I$ distinct but related graphs. The data are not allowed to leave their clients, but all clients can communicate with a central server.
  • Figure 2: The results of varying data sizes of different clients. The data sizes $N_1,\dots, N_5$ are 20, 40, 60, 80, and 100, respectively.
  • Figure 3: The graphs of $\mathcal{C}_1$ learned by different methods when $N_i = 100, q = 0.5$. In (a)-(b), red edges are those in the consensus (global) graph, while blue edges are only in the local graphs.
  • Figure 4: The learned $\bm{\gamma}$ of different clients.
  • Figure 5: The results of parameter sensitivity.
  • ...and 3 more figures

Theorems & Definitions (13)

  • Definition 1
  • Theorem 1
  • proof
  • Proposition 1
  • proof
  • Theorem 2
  • proof
  • Lemma 1
  • proof
  • Lemma 2
  • ...and 3 more