
PTOPOFL: Privacy-Preserving Personalised Federated Learning via Persistent Homology

Kelly L Vomo-Donfack, Adryel Hoszu, Grégory Ginot, Ian Morilla

TL;DR

PTOPOFL is a federated learning framework that addresses both gradient-based data reconstruction and non-IID degradation of aggregation by replacing gradient communication with topological descriptors derived from persistent homology (PH). It is supported by an information-contraction theorem showing that PH descriptors leak strictly less mutual information per sample than gradients under strongly convex loss functions.

Abstract

Federated learning (FL) faces two structural tensions: gradient sharing enables data-reconstruction attacks, while non-IID client distributions degrade aggregation quality. We introduce PTOPOFL, a framework that addresses both challenges simultaneously by replacing gradient communication with topological descriptors derived from persistent homology (PH). Clients transmit only 48-dimensional PH feature vectors (compact shape summaries whose many-to-one structure makes inversion provably ill-posed) rather than model gradients. The server performs topology-guided personalised aggregation: clients are clustered by Wasserstein similarity between their PH diagrams, intra-cluster models are topology-weighted, and clusters are blended with a global consensus. We prove an information-contraction theorem showing that PH descriptors leak strictly less mutual information per sample than gradients under strongly convex loss functions, and we establish linear convergence of the Wasserstein-weighted aggregation scheme with an error floor strictly smaller than that of FedAvg. Evaluated against FedAvg, FedProx, SCAFFOLD, and pFedMe on a non-IID healthcare scenario (8 hospitals, 2 adversarial) and a pathological benchmark (10 clients), PTOPOFL achieves AUC 0.841 and 0.910 respectively, the highest in both settings, while reducing reconstruction risk by a factor of 4.5 relative to gradient sharing. Code is publicly available at https://github.com/MorillaLab/TopoFederatedL and data at https://doi.org/10.5281/zenodo.18827595.
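
The server-side step described in the abstract (topology-weighted averaging inside each cluster, then blending with a global consensus) can be illustrated with a minimal sketch. This is not the authors' implementation. Several pieces are assumptions made only for illustration: cluster labels are taken as given, Euclidean distance between feature vectors stands in for the Wasserstein distance between PH diagrams, and the exponential weighting, the `temperature` parameter, and the blend factor `alpha` are hypothetical choices.

```python
import numpy as np

def topology_weights(distances, temperature=1.0):
    """Turn distances into normalised weights; closer clients weigh more.
    The exponential form is an illustrative assumption."""
    w = np.exp(-np.asarray(distances, dtype=float) / temperature)
    return w / w.sum()

def personalised_aggregate(models, descriptors, labels, alpha=0.5, temperature=1.0):
    """Blend a topology-weighted cluster model with a global average.

    models:      (K, d) client parameter vectors
    descriptors: (K, m) topological feature vectors (m = 48 in the abstract)
    labels:      (K,) cluster label per client (clustering assumed done)
    alpha:       hypothetical blend factor, 1.0 = cluster model only
    Returns a dict mapping cluster label -> personalised parameter vector.
    """
    models = np.asarray(models, dtype=float)
    descriptors = np.asarray(descriptors, dtype=float)
    labels = np.asarray(labels)
    global_model = models.mean(axis=0)  # plain average as the global consensus
    personalised = {}
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        centroid = descriptors[idx].mean(axis=0)
        # Euclidean stand-in for a Wasserstein distance between diagrams
        dist = np.linalg.norm(descriptors[idx] - centroid, axis=1)
        w = topology_weights(dist, temperature)
        cluster_model = w @ models[idx]
        personalised[int(c)] = alpha * cluster_model + (1.0 - alpha) * global_model
    return personalised
```

Calling `personalised_aggregate(models, descriptors, labels)` returns one parameter vector per cluster, which the server would broadcast to that cluster's clients.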

Paper Structure

This paper contains 60 sections, 9 theorems, 29 equations, 7 figures, 3 tables, 1 algorithm.

Key Result

Theorem 3.2

Let $\{\mathrm{PD}_k\}_{k=1}^K$ be persistence diagrams with finite $p$-th moment ($p \geq 1$) and let $\lambda_k \geq 0$ with $\sum_k\lambda_k = 1$. Then the Fréchet mean exists.
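
For orientation, a weighted Fréchet mean is usually defined as a minimiser of a weighted sum of powered distances. The formula below assumes that standard definition with the $p$-Wasserstein distance $W_p$ between diagrams; this summary does not reproduce the paper's exact functional, and the symbol $\widehat{\mathrm{PD}}$ is introduced here only for illustration.

$$\widehat{\mathrm{PD}} \in \arg\min_{\mathrm{PD}} \sum_{k=1}^{K} \lambda_k \, W_p(\mathrm{PD}, \mathrm{PD}_k)^p$$

Under this reading, the theorem asserts that the minimum is attained, which matches its title, Existence of Wasserstein Barycenter. The minimiser need not be unique.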

Figures (7)

  • Figure 1: pTopoFL architecture. Each client locally computes a persistence diagram from its data and transmits only the resulting topological descriptor to the server; no raw data or gradients are shared. The server groups clients by Wasserstein similarity, performs topology-weighted intra-cluster aggregation, and blends cluster models with a global consensus before broadcasting personalised updates.
  • Figure 2: AUC-ROC comparison across 15 FL rounds. (A) Healthcare scenario: 8 non-IID hospitals, 2 adversarial. (B) Benchmark scenario: 10 clients with pathological class-distribution skew. pTopoFL (green) achieves the highest final AUC in both settings. Shaded band: $\pm 0.006$ around pTopoFL.
  • Figure 3: Final AUC and convergence speed. (C) Final-round AUC for all methods (solid bars: Healthcare; faded: Benchmark). (D) Round at which each method first reaches 95% of its final AUC. SCAFFOLD oscillates under severe class imbalance, degrading its Benchmark AUC to 0.846. pFedMe converges slowly (round 5 on Healthcare). pTopoFL converges from round 1 and achieves the highest AUC in both scenarios.
  • Figure 4: Adversarial robustness under label-flip attacks. (A) Final AUC vs. attack rate (0% to 50% of clients adversarial). (B) AUC training curves at 0%, 30%, and 50% attack rates. pTopoFL's topological anomaly detector maintains consistent performance as the fraction of adversarial clients grows.
  • Figure 5: Topological signature stability over 20 FL rounds. $H_0$ and $H_1$ persistence entropy per client, coloured by client identity. Each client maintains a stable and distinct topological fingerprint throughout training, validating the round-0 clustering strategy.
  • ...and 2 more figures
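
Figure 5 tracks persistence entropy per client. As a minimal sketch, the standard definition is the Shannon entropy of the normalised bar lifetimes of a persistence diagram. The function name and the handling of infinite or zero-length bars below are assumptions for illustration, not taken from the paper's code.

```python
import numpy as np

def persistence_entropy(diagram):
    """Shannon entropy of normalised lifetimes in a persistence diagram.

    diagram: iterable of (birth, death) pairs for one homology dimension.
    Bars with infinite or non-positive lifetime are dropped (an assumption).
    """
    pairs = np.asarray(diagram, dtype=float).reshape(-1, 2)
    lifetimes = pairs[:, 1] - pairs[:, 0]
    lifetimes = lifetimes[np.isfinite(lifetimes) & (lifetimes > 0)]
    if lifetimes.size == 0:
        return 0.0
    p = lifetimes / lifetimes.sum()  # each bar's share of total persistence
    return float(-(p * np.log(p)).sum())
```

A diagram with $n$ bars of equal length gives the maximum value $\log n$, while a diagram dominated by one long bar gives a value near 0, so the quantity summarises how evenly persistence is spread across features.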

Theorems & Definitions (14)

  • Theorem 3.2: Existence of Wasserstein Barycenter
  • Theorem 3.3: Stability of Topology-Guided Clustering
  • Corollary 3.4
  • Theorem 3.5: Exponential Suppression of Adversarial Influence
  • Theorem 3.6: Heterogeneity Variance Reduction
  • Theorem 3.7: Information Contraction of Persistent Descriptors
  • Remark 3.8
  • Theorem 3.9: Convergence of Wasserstein-Weighted FL
  • Remark 3.10
  • Proposition 3.11: Reduction of Effective Heterogeneity
  • ...and 4 more