Scalable Vertical Federated Learning via Data Augmentation and Amortized Inference

Conor Hassan; Matthew Sutton; Antonietta Mira; Kerrie Mengersen

TL;DR

This paper proposes the first Bayesian framework for vertical federated learning (VFL). It uses asymptotically-exact data augmentation (AXDA) to introduce auxiliary variables that make the clients conditionally independent, which enables decentralized posterior inference. It develops two AXDA-based models for VFL, the augmented-variable model and the power-likelihood model, and pairs them with factorized amortized or mean-field variational approximations so that inference remains scalable even though the number of auxiliary variables grows with the number of observations and clients. On logistic regression, Poisson multilevel regression, and a hierarchical Bayes split neural net, the work demonstrates competitive inference with privacy-friendly, decentralized updates, and shows that the power-likelihood formulation often yields higher ELBOs than the augmented-variable approach. The results highlight the potential for privacy-preserving, decentralized Bayesian inference on vertically partitioned data, and provide a foundation for future work on asynchronous updates, model selection mechanisms, and scalable Bayesian VFL deployments.

Abstract

Vertical federated learning (VFL) has emerged as a paradigm for collaborative model estimation across multiple clients, each holding a distinct set of covariates. This paper introduces the first comprehensive framework for fitting Bayesian models in the VFL setting. We propose a novel approach that leverages data augmentation techniques to transform VFL problems into a form compatible with existing Bayesian federated learning algorithms. We present an innovative model formulation for specific VFL scenarios where the joint likelihood factorizes into a product of client-specific likelihoods. To mitigate the dimensionality challenge posed by data augmentation, which scales with the number of observations and clients, we develop a factorized amortized variational approximation that achieves scalability independent of the number of observations. We showcase the efficacy of our framework through extensive numerical experiments on logistic regression, multilevel regression, and a novel hierarchical Bayesian split neural net model. Our work paves the way for privacy-preserving, decentralized Bayesian inference in vertically partitioned data scenarios, opening up new avenues for research and applications in various domains.
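The data augmentation idea can be illustrated with a minimal sketch. It assumes a Gaussian coupling in which each client j draws an auxiliary variable z_ij centred on its own local linear predictor x_ij^T beta_j with scale rho, and the shared predictor is the sum of the z_ij over clients. The function names, the two-client layout, and the coefficient values are hypothetical and chosen only for illustration; the paper's exact model specification may differ.

```python
import random

def local_predictors(x_by_client, beta_by_client):
    """Each client j computes x_ij^T beta_j from its own covariates only."""
    return [
        [sum(x * b for x, b in zip(row, beta)) for row in rows]
        for rows, beta in zip(x_by_client, beta_by_client)
    ]

def augmented_predictor(local, rho, rng):
    """Draw z_ij ~ Normal(local_ij, rho^2) per client, then sum over clients."""
    n = len(local[0])
    return [sum(rng.gauss(client[i], rho) for client in local) for i in range(n)]

def mean_abs_gap(rho, seed=0, n=200):
    """Average |augmented predictor - exact predictor| for a given rho."""
    rng = random.Random(seed)
    # Two hypothetical clients holding 2 and 1 covariates for the same n rows.
    x_by_client = [
        [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(n)],
        [[rng.gauss(0, 1)] for _ in range(n)],
    ]
    beta_by_client = [[0.5, -1.0], [2.0]]  # illustrative values only
    local = local_predictors(x_by_client, beta_by_client)
    exact = [sum(client[i] for client in local) for i in range(n)]
    approx = augmented_predictor(local, rho, rng)
    return sum(abs(a - e) for a, e in zip(approx, exact)) / n
```

Calling `mean_abs_gap` with a small `rho` and then a large one shows the gap to the exact predictor shrinking as `rho` decreases, which is the sense in which such an augmentation is asymptotically exact. Given the z_ij, each client's term depends only on its own covariates and coefficients.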

Paper Structure (23 sections, 40 equations, 5 figures, 3 tables, 2 algorithms)


Introduction
Preliminaries
Asymptotically-Exact Data Augmentation
Structured Federated Variational Inference
Auxiliary Variable Methods for Vertical Federated Learning
Vertical Federated Learning Motivation and Setting
Augmented-Variable Model
Power-Likelihood Model
Algorithm details
Amortized Structured Variational Approximation
Mean-field Variational Approximation
Algorithm details for Augmented-variable model
Algorithm details for Power-likelihood Model
Numerical Examples
Logistic Regression
...and 8 more sections

Figures (5)

Figure 1: Graphical illustration of the dependency structure of the proposed models: augmented-variable (left), power-likelihood (right)
Figure 2: ELBO and marginal density plots for the two-client example. The ELBO panel shows the ELBO values for each combination of model and variational approximation. The density panel shows marginal densities for the various models using both MCMC and variational approximations.
Figure 3: ELBO and marginal density plots for the ten-client example. The ELBO panel shows the ELBO values for each combination of model and variational approximation. The density panel shows marginal densities for the various models using both MCMC and variational approximations.
Figure 4: The top row shows posterior density estimates for the global mean effect $\mu_1$ of the first covariate belonging to the first client. The bottom row shows posterior density estimates for one of the $\boldsymbol{\beta}_1$ levels for the first covariate on the first client. The left column shows augmented-variable models fit using $\rho=1$, and the right shows augmented-variable models fit using $\rho=2$. The blue line shows the true parameter value.
Figure 5: The split NN panel shows the split NN model; the hierarchical Bayes panel shows our proposed hierarchical Bayes split NN. In the split NN, each client learns a function $f_{\boldsymbol{\phi}_j}$, parameterized as a NN with weights $\boldsymbol{\phi}_j$, that maps that client's covariates to a vector $\boldsymbol{z}_j \in\mathbb{R}^N$. Each client communicates its $\boldsymbol{z}_j$ to the server, where the set of $\boldsymbol{z}_j$'s is used to create the linear predictor $\boldsymbol{\lambda}$, which is then used to evaluate the objective function, i.e. the likelihood, on the server. In the hierarchical Bayes variant, the final-layer weights of the function $f_{\boldsymbol{\phi}_j}$ are denoted $\boldsymbol{w}_j$. These weights are treated as random variables and are assigned a prior distribution. The dot product of the weights $\boldsymbol{w}_j$ with the hidden vector $\boldsymbol{h}_j$ parameterizes the mean of the prior distribution of an additional set of random variables $\boldsymbol{z}_j$. Each client sends its $\boldsymbol{z}_j$ to the server, and the server takes the same steps as before.
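The data flow described in the Figure 5 caption can be sketched as follows. This is a minimal illustration, not the authors' implementation: the single tanh hidden layer, the summation of the z_j's into lambda, the Bernoulli-logit likelihood, and the Gaussian prior with scale `sigma` in the hierarchical variant are all assumptions, and every function name is hypothetical.

```python
import math
import random

def client_forward(x_rows, hidden_weights, final_weights):
    """Client j: map local covariates to one scalar z_ij per observation."""
    z = []
    for x in x_rows:
        # Hidden vector h_ij from one tanh layer (architecture is an assumption).
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in hidden_weights]
        # Final layer: dot product of the weights w_j with h_ij.
        z.append(sum(w * hi for w, hi in zip(final_weights, h)))
    return z

def hierarchical_z(mean_z, sigma, rng):
    """Hierarchical variant: z_ij drawn around w_j . h_ij (Gaussian form assumed)."""
    return [rng.gauss(m, sigma) for m in mean_z]

def server_log_likelihood(z_by_client, y):
    """Server: combine the clients' z_j into lambda and score the likelihood."""
    n = len(y)
    # Summing over clients is an assumption; the caption only says the
    # z_j's are used to create the linear predictor lambda.
    lam = [sum(z[i] for z in z_by_client) for i in range(n)]
    # Bernoulli log-likelihood with a logit link (assumed for illustration).
    loglik = sum(yi * li - math.log1p(math.exp(li)) for yi, li in zip(y, lam))
    return loglik, lam
```

In this sketch only the vectors z_j leave each client, so the server never sees raw covariates or the client network weights. To mimic the hierarchical Bayes variant, a client would pass the output of `client_forward` through `hierarchical_z` before sending it.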
