Privacy-Preserving Transfer Learning for Community Detection using Locally Distributed Multiple Networks

Xiao Guo, Xuming He, Xiangyu Chang, Shujie Ma

TL;DR

This work aims to improve community detection on a target network by leveraging multiple locally stored source networks that are heterogeneous and privacy-protected via randomized response. The authors propose TransNet, a three-step transfer-learning framework that (1) adaptively weights the source eigenspaces, (2) regularizes the weighted source eigenspace with the target eigenspace, and (3) clusters via k-means. They establish an error-bound-oracle property for the adaptive weights and show that regularization yields tighter eigenspace error bounds than using the target or the sources alone, enabling lower misclassification rates. In simulations and real-data analyses (the AUCS and Politics datasets), TransNet improves on single-network spectral clustering and naive distributed methods, with adaptive weighting particularly advantageous when the source networks differ in privacy level and informativeness. The framework offers scalable, privacy-preserving transfer learning for network data and opens avenues for extensions to multi-round communication and directed/weighted graphs.
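
The three steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the weights are supplied by the caller rather than computed adaptively (the paper's weight formula is not reproduced here), the aggregation step averages projection matrices to sidestep rotation ambiguity, and the regularization step takes the leading eigenvectors of a weighted sum of projections controlled by a hypothetical tuning parameter `lam`. All function names are illustrative.

```python
import numpy as np

def top_eigvecs(M, K):
    """Return the K eigenvectors of symmetric M with largest |eigenvalue|."""
    vals, vecs = np.linalg.eigh(M)
    idx = np.argsort(-np.abs(vals))[:K]
    return vecs[:, idx]

def aggregate_sources(source_U, weights):
    """Step 1 (sketch): weighted average of the projections U_l U_l^T.

    `weights` are caller-supplied placeholders, not the paper's adaptive weights.
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    K = source_U[0].shape[1]
    P_bar = sum(wl * (U @ U.T) for wl, U in zip(w, source_U))
    return top_eigvecs(P_bar, K)

def regularize(U_target, U_bar, lam):
    """Step 2 (sketch, assumed form): leading eigenvectors of
    U_target U_target^T + lam * U_bar U_bar^T."""
    K = U_target.shape[1]
    M = U_target @ U_target.T + lam * (U_bar @ U_bar.T)
    return top_eigvecs(M, K)

def kmeans(X, K, rng, n_iter=50):
    """Step 3: plain Lloyd iterations on the rows of X."""
    centers = X[rng.choice(X.shape[0], size=K, replace=False)]
    for _ in range(n_iter):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for k in range(K):
            if np.any(labels == k):
                centers[k] = X[labels == k].mean(axis=0)
    return labels
```

In this sketch, `lam = 0` recovers the target-only eigenspace and a very large `lam` approaches the source-only eigenspace, which mirrors the balance between the two that the regularization step is meant to strike.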

Abstract

This paper develops a new spectral clustering-based method called TransNet for transfer learning in community detection of network data. Our goal is to improve the clustering performance of the target network using auxiliary source networks, which are heterogeneous, privacy-preserved, and locally stored across various sources. The edges of each locally stored network are perturbed using the randomized response mechanism to achieve differential privacy. Notably, we allow the source networks to have distinct privacy-preserving and heterogeneity levels as often desired in practice. To better utilize the information from the source networks, we propose a novel adaptive weighting method to aggregate the eigenspaces of the source networks multiplied by adaptive weights chosen to incorporate the effects of privacy and heterogeneity. We propose a regularization method that combines the weighted average eigenspace of the source networks with the eigenspace of the target network to achieve an optimal balance between them. Theoretically, we show that the adaptive weighting method enjoys the error-bound-oracle property in the sense that the error bound of the estimated eigenspace only depends on informative source networks. We also demonstrate that TransNet performs better than the estimator using only the target network and the estimator using only the weighted source networks.
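
The edge perturbation step can be illustrated with a minimal sketch of randomized response on an undirected, unweighted adjacency matrix. It assumes a single symmetric flip probability: each edge indicator is kept with probability $q = e^{\epsilon}/(1+e^{\epsilon})$ and flipped otherwise, which is the standard $\epsilon$-differentially-private form of randomized response. The paper's exact parameterization may differ, and the function names `randomized_response` and `debias` are hypothetical. The debiasing step uses the identity $E[\tilde{A}_{ij}] = (2q-1)A_{ij} + (1-q)$, which follows directly from the flip rule.

```python
import numpy as np

def randomized_response(A, epsilon, rng):
    """Perturb each upper-triangular entry of a binary symmetric A.

    Each entry is kept with probability q = e^eps / (1 + e^eps) and
    flipped otherwise (assumed symmetric mechanism).
    """
    n = A.shape[0]
    q = np.exp(epsilon) / (1.0 + np.exp(epsilon))
    iu = np.triu_indices(n, k=1)          # one coin per unordered node pair
    bits = A[iu]
    keep = rng.random(bits.shape[0]) < q
    noisy = np.where(keep, bits, 1 - bits)
    A_tilde = np.zeros_like(A)
    A_tilde[iu] = noisy
    A_tilde = A_tilde + A_tilde.T         # restore symmetry, zero diagonal
    return A_tilde, q

def debias(A_tilde, q):
    """Invert E[A_tilde] = (2q - 1) A + (1 - q) entrywise (off-diagonal)."""
    A_hat = (A_tilde - (1.0 - q)) / (2.0 * q - 1.0)
    np.fill_diagonal(A_hat, 0.0)
    return A_hat
```

A smaller `epsilon` pushes `q` toward 1/2, so the perturbed network carries less signal; this is one way source networks can differ in privacy level.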

Paper Structure

This paper contains 29 sections, 12 theorems, 104 equations, 10 figures, 1 table, 1 algorithm.

Key Result

Theorem 1

Suppose Assumptions assu:sparse-assu:conrhoL hold, $\eta_n\lesssim \frac{1}{n\rho}$, $\mathcal{E}_{b,l}^2 \lesssim \mathcal{E}_{\theta,l}$ and $\lambda_{\min} (P_l) \gtrsim n\rho$ for all $l \in [L]$. If then with probability larger than $1-\frac{L}{n^{\kappa}}$ for some constant $\kappa>0$, the adaptive weights $\hat{w}_l$'s defined in (eq: adapweight) satisfy and we have

Figures (10)

  • Figure 1: The projection distance (the first row) and misclassification rate (the second row) of each method under Cases I and II of Experiment I (Private but non-heterogeneous).
  • Figure 2: The projection distance (the first row) and misclassification rate (the second row) of each method under the Cases I and II of Experiment II (Heterogeneous but non-private).
  • Figure 3: The projection distance (the first row) and misclassification rate (the second row) of each method under Cases I and II of Experiment III (Heterogeneous and private).
  • Figure 4: A visualization of the AUCS network data. Each sub-figure corresponds to one of the five relationships. The nodes are ordered according to underlying research groups.
  • Figure 5: The misclassification rate of each method for the AUCS network. (a)-(e) correspond to the scenarios where Work, Facebook, Leisure, Lunch, and Coauthor are considered as the target network, with the remaining networks serving as source networks.
  • ...and 5 more figures

Theorems & Definitions (25)

  • Definition 1: Adaptive weighting strategy
  • Remark 1
  • Definition 2: Informative networks
  • Theorem 1: Error-bound-oracle
  • Remark 2
  • Remark 3
  • Remark 4
  • Theorem 2
  • Remark 5
  • Remark 6
  • ...and 15 more