FedSI: Federated Subnetwork Inference for Efficient Uncertainty Quantification

Hui Chen; Hengyu Liu; Zhangkai Wu; Xuhui Fan; Longbing Cao

TL;DR

FedSI addresses uncertainty quantification in federated learning under non-IID data by performing posterior inference on a client-specific subnetwork within the representation layers, while keeping the rest of the network deterministic. It leverages Linearized Laplace Approximation to obtain a full-covariance Gaussian posterior over a small subnetwork, identified by a Wasserstein-distance criterion that prioritizes high-variance parameters. Local updates compute a MAP estimate for the full representation, then infer a subnetwork posterior via GGN-Laplace, followed by a global aggregation that combines stochastic and deterministic components to learn a shared representation. Experiments on MNIST, FMNIST, and CIFAR-10 show that FedSI outperforms both Bayesian and non-Bayesian FL baselines in heterogeneous settings and can generalize to novel clients with low overhead.
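The GGN-Laplace step mentioned above can be illustrated on a toy model. The following is a minimal sketch, not the authors' implementation: it assumes a binary logistic-regression "network" (so the linearization is exact), a MAP estimate `w_map` that is already available, and an isotropic Gaussian prior. The function name `subnetwork_ggn_laplace` and all parameter names are hypothetical.

```python
import numpy as np

def subnetwork_ggn_laplace(X, w_map, sub_idx, prior_precision=1.0):
    """Full-covariance Gaussian posterior over a subset of weights.

    Assumption: model is f(x) = w @ x with a Bernoulli likelihood, so the
    Jacobian of f with respect to w is simply x.
    """
    logits = X @ w_map
    p = 1.0 / (1.0 + np.exp(-logits))
    lam = p * (1.0 - p)                 # Hessian of the NLL w.r.t. each logit
    J_S = X[:, sub_idx]                 # Jacobian columns of the subnetwork only
    # GGN restricted to the subnetwork, plus the prior precision
    H_S = J_S.T @ (lam[:, None] * J_S) + prior_precision * np.eye(len(sub_idx))
    cov_S = np.linalg.inv(H_S)          # posterior covariance over the subnetwork
    return w_map[sub_idx], cov_S

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
w_map = rng.normal(size=5)              # stand-in for a trained MAP estimate
mean_S, cov_S = subnetwork_ggn_laplace(X, w_map, sub_idx=[1, 3])
```

The weights outside `sub_idx` stay fixed at their MAP values, so only a small matrix whose side length equals the subnetwork size has to be inverted. This is where the cost saving of subnetwork inference comes from.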

Abstract

Personalized federated learning (PFL) based on deep neural networks (DNNs) is in demand for addressing data heterogeneity and shows promising performance, yet existing federated learning (FL) methods lack efficient, systematic uncertainty quantification. Bayesian DNN-based PFL is often criticized for either over-simplified model structures or high computational and memory costs. In this paper, we introduce FedSI, a novel Bayesian DNN-based subnetwork inference PFL framework. FedSI is simple and scalable, leveraging Bayesian methods to incorporate systematic uncertainties effectively. It implements a client-specific subnetwork inference mechanism that selects the network parameters with large variance to be inferred through posterior distributions and fixes the rest as deterministic. FedSI thereby achieves fast and scalable inference while preserving the systematic uncertainties as far as possible. Extensive experiments on three benchmark datasets demonstrate that FedSI outperforms existing Bayesian and non-Bayesian FL baselines in heterogeneous FL scenarios.
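The selection rule described in the abstract (infer the large-variance parameters, fix the rest) can be sketched as a top-k choice over per-parameter variances. This is an illustrative sketch under assumptions: it takes a vector of marginal variances as given (how FedSI estimates them is not stated here), and the function name `select_subnetwork` and the `ratio` parameter are hypothetical.

```python
import numpy as np

def select_subnetwork(marginal_variances, ratio):
    """Return a boolean mask marking the highest-variance parameters.

    Assumption: `ratio` is the fraction of parameters treated as stochastic;
    at least one parameter is always selected.
    """
    var = np.asarray(marginal_variances, dtype=float)
    k = max(1, int(round(ratio * var.size)))
    top_k = np.argsort(var)[-k:]        # indices of the k largest variances
    mask = np.zeros(var.size, dtype=bool)
    mask[top_k] = True                  # True = stochastic, False = deterministic
    return mask

variances = [0.1, 0.5, 0.05, 0.9, 0.2]
mask = select_subnetwork(variances, ratio=0.4)
```

With five parameters and a ratio of 0.4, the mask marks the two parameters with variances 0.9 and 0.5 as stochastic and leaves the other three deterministic.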

Paper Structure

This paper contains 18 sections, 26 equations, 6 figures, 2 tables, 1 algorithm.

Introduction
Preliminaries
Federated Learning
Linearized Laplace Approximation (LLA)
FedSI: Personalized Federated Learning with Subnetwork Inference
Subnetwork Inference in Bayesian Neural Networks
The FedSI Algorithm
Experiments
Experimental Setup
Effect of Size of Subnetworks
Effect of Data Size on Classification
Main Results on Uncertainty Quantification
Generalization to Novel Clients
Effect of Local Epochs
Related Work
...and 3 more sections

Figures (6)

Figure 1: Personalized Federated Learning with Subnetwork Inference. After obtaining the MAP values, each client identifies its own subnetwork $\boldsymbol{\theta}_{i,S}$ and obtains its corresponding full-covariance Gaussian posterior $q(\boldsymbol{\theta}_{i,S})$ through subnetwork inference (SI). Note that the decision parameters $\boldsymbol{\phi}_i$ are fixed at their initial random values during this phase. Then, clients send the distribution parameters of representation parameters $\boldsymbol{\mu}^{t+1}_{\theta_i}$, $\boldsymbol{\sigma}^{t+1}_{\theta_i}$ to the server, which averages them to compute the distribution parameters of common representation parameters $\boldsymbol{\mu}^{t+1}_{\theta}$, $\boldsymbol{\sigma}^{t+1}_{\theta}$ for the next communication round.
Figure 2: Global aggregation. Before averaging: Each participating client sends the server the updated distribution parameters of its stochastic parameters $\boldsymbol{\mu}^{t+1}_{\boldsymbol{\theta}_{i,S}}$, $\boldsymbol{\sigma}^{t+1}_{\boldsymbol{\theta}_{i,S}}$ and deterministic parameters $\boldsymbol{\mu}^{t+1}_{\boldsymbol{\theta}_{i,D}}$, $\boldsymbol{\sigma}^{t+1}_{\boldsymbol{\theta}_{i,D}}$. Each element of $\boldsymbol{\theta}^{t+1}_{i,D}$ can be regarded as a degenerate Gaussian distribution for model averaging. After averaging: The server obtains the distribution parameters of the stochastic common representation parameters $\boldsymbol{\mu}^{t+1}_{\boldsymbol{\theta}_S}$, $\boldsymbol{\sigma}^{t+1}_{\boldsymbol{\theta}_S}$ and the deterministic common representation parameters $\boldsymbol{\mu}^{t+1}_{\boldsymbol{\theta}_D}$, $\boldsymbol{\sigma}^{t+1}_{\boldsymbol{\theta}_D}$. Note that $\boldsymbol{\theta}^{t+1}_D$ is transformed into stochastic parameters following a Gaussian distribution with a covariance matrix of $\alpha \mathbf{I}$.
Figure 3: Test accuracy comparison with varying ratios of the subnetwork for MLP and CNN.
Figure 4: Reliability diagram and confidence histogram of FedSI on MNIST (left), FMNIST (middle) and CIFAR-10 (right). The closer the accuracy line and the average confidence line are, the better the model is calibrated.
Figure 5: Performance comparison of different DNNs-based PFL algorithms on MNIST, FMNIST and CIFAR-10.
...and 1 more figures
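The global aggregation described in the Figure 2 caption can be sketched as follows. This is a rough illustration under several assumptions that the caption does not confirm: each client reports per-parameter means and standard deviations, deterministic parameters are reported with a standard deviation of zero (the degenerate Gaussian), the server takes plain unweighted averages, and a coordinate counts as deterministic on the server only if every client reported zero spread for it. The function name `aggregate` and the default value of `alpha` are hypothetical.

```python
import numpy as np

def aggregate(client_mus, client_sigmas, alpha=1e-4):
    """Average per-parameter Gaussian parameters across clients.

    Assumption: sigma == 0 encodes a deterministic parameter, and alpha is
    the variance given to parameters that are deterministic for all clients.
    """
    mus = np.stack(client_mus)
    sigmas = np.stack(client_sigmas)
    mu_global = mus.mean(axis=0)
    sigma_global = sigmas.mean(axis=0)
    deterministic = sigma_global == 0.0          # zero spread on every client
    # Covariance alpha * I corresponds to a standard deviation of sqrt(alpha)
    sigma_global = np.where(deterministic, np.sqrt(alpha), sigma_global)
    return mu_global, sigma_global, deterministic

# Two clients, four representation parameters, different subnetworks
mu_g, sigma_g, det = aggregate(
    client_mus=[np.array([1.0, 2.0, 3.0, 4.0]), np.array([3.0, 2.0, 1.0, 0.0])],
    client_sigmas=[np.array([0.2, 0.0, 0.0, 0.0]), np.array([0.0, 0.4, 0.0, 0.0])],
)
```

In this toy run the first two parameters are stochastic on at least one client and keep an averaged spread, while the last two are deterministic everywhere and receive the `alpha`-based spread for the next round.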
