Federated Inference: Toward Privacy-Preserving Collaborative and Incentivized Model Serving

Jungwon Seo, Ferhat Ozgur Catak, Chunming Rong, Jaeyeon Jang

TL;DR

This work formalizes Federated Inference (FI) as a protected collaborative computation, analyzes its core design dimensions, and examines the structural trade-offs that arise when privacy constraints, non-IID data, and limited observability are jointly imposed at inference time.

Abstract

Federated Inference (FI) studies how independently trained and privately owned models can collaborate at inference time without sharing data or model parameters. While recent work has explored secure and distributed inference from disparate perspectives, a unified abstraction and system-level understanding of FI remain lacking. This paper positions FI as a distinct collaborative paradigm, complementary to federated learning, and identifies two fundamental requirements that govern its feasibility: inference-time privacy preservation and meaningful performance gains through collaboration. We formalize FI as a protected collaborative computation, analyze its core design dimensions, and examine the structural trade-offs that arise when privacy constraints, non-IID data, and limited observability are jointly imposed at inference time. Through a concrete instantiation and empirical analysis, we highlight recurring friction points in privacy-preserving inference, ensemble-based collaboration, and incentive alignment. Our findings suggest that FI exhibits system-level behaviors that cannot be directly inherited from training-time federation or classical ensemble methods. Overall, this work provides a unifying perspective on FI and outlines open challenges that must be addressed to enable practical, scalable, and privacy-preserving collaborative inference systems.

Paper Structure

This paper contains 69 sections, 20 equations, 6 figures, 10 tables, and 2 algorithms.

Figures (6)

  • Figure 1: System architecture and workflow of FedSEI. Models owned by different organizations are additively secret-shared across multiple SMPC parties, which jointly perform privacy-preserving inference and ensemble aggregation. Subscripts (e.g., $M_A$) denote model owners, while superscripts (e.g., $M_A^k$, $x^k$, $y^k$) denote the additive secret shares of models, inputs, and outputs held by SMPC party $k$. The abstract protected values $\llbracket x \rrbracket$, $\llbracket M_i \rrbracket$, and $\llbracket y \rrbracket$ used in the main text correspond to the collection of these shares.
  • Figure 2: Representative network deployment scenarios used in the evaluation. Node placements are schematic and do not reflect exact locations. Each link is annotated with its round-trip time (RTT) and available bandwidth (BW).
  • Figure 3: Label distribution across clients under Dirichlet-based non-IID partitioning on CIFAR-10. Each subplot shows a stacked bar chart where the x-axis denotes client indices, the y-axis indicates the number of samples, and colors represent class labels. (Top) Effect of varying the Dirichlet concentration parameter $\alpha$ with $K=5$. Smaller $\alpha$ results in more heterogeneous and imbalanced client data. (Bottom) Effect of varying the number of clients $K$ with $\alpha=0.1$. Larger $K$ yields smaller per-client datasets while preserving non-IID characteristics.
  • Figure 4: Test accuracy of a single best model and a soft-voting ensemble under different non-IID levels on CIFAR-10 ($K=5$).
  • Figure 5: Reward fairness as a function of the non-IID level $\alpha$ across different datasets. Higher values indicate fairer reward allocation. Results are shown for five clients using LeNet models trained on heterogeneous local datasets.
  • ...and 1 more figures
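
The Figure 1 and Figure 4 captions mention additive secret sharing and soft-voting aggregation. The following is a minimal sketch of how those two ideas combine, not the FedSEI protocol itself: it secret-shares only the per-model output probabilities and sums them share-wise, whereas the system described above also runs model inference on shares. The modulus, the fixed-point scale, and all function names are assumptions made for illustration.

```python
import random

MODULUS = 2**61 - 1  # assumed ring size for this illustration
SCALE = 10**6        # assumed fixed-point scale for encoding probabilities


def share(value, n_parties, rng):
    """Split an integer into n additive shares that sum to value mod MODULUS."""
    shares = [rng.randrange(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def reconstruct(shares):
    """Recombine additive shares into the original integer."""
    return sum(shares) % MODULUS


def secure_soft_vote(prob_vectors, n_parties=3, seed=0):
    """Average class probabilities from several models using share-wise sums.

    Each party only ever holds random-looking shares; only the summed
    probabilities are reconstructed at the end.
    """
    # A seeded PRNG keeps the sketch reproducible; a real deployment would
    # need a cryptographically secure source such as the `secrets` module.
    rng = random.Random(seed)
    n_classes = len(prob_vectors[0])
    # party_totals[k][c]: party k's share of the summed probability of class c
    party_totals = [[0] * n_classes for _ in range(n_parties)]
    for probs in prob_vectors:
        for c, p in enumerate(probs):
            encoded = round(p * SCALE)  # fixed-point encoding
            for k, s in enumerate(share(encoded, n_parties, rng)):
                # Addition is linear, so each party adds its shares locally.
                party_totals[k][c] = (party_totals[k][c] + s) % MODULUS
    summed = [
        reconstruct([party_totals[k][c] for k in range(n_parties)])
        for c in range(n_classes)
    ]
    avg = [v / SCALE / len(prob_vectors) for v in summed]
    predicted = max(range(n_classes), key=avg.__getitem__)
    return avg, predicted


if __name__ == "__main__":
    model_outputs = [[0.7, 0.2, 0.1], [0.1, 0.6, 0.3], [0.2, 0.5, 0.3]]
    print(secure_soft_vote(model_outputs))
```

Summation works directly on additive shares because it is a linear operation; non-linear steps such as softmax or argmax would require dedicated SMPC protocols that this sketch does not attempt.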
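
The Dirichlet-based non-IID partitioning referenced in the Figure 3 caption can be sketched as follows. This assumes the commonly used per-class scheme, in which each class's samples are divided among clients according to proportions drawn from a symmetric Dirichlet distribution with concentration $\alpha$. The paper's exact procedure is not shown in this section, so the function names and the rounding rule below are hypothetical.

```python
import random


def sample_dirichlet(alpha, k, rng):
    """Draw one sample from a symmetric Dirichlet(alpha) over k entries."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
    total = sum(draws)
    if total == 0.0:  # guard against numerical underflow for tiny alpha
        return [1.0 / k] * k
    return [d / total for d in draws]


def dirichlet_partition(labels, n_clients, alpha, seed=0):
    """Assign sample indices to clients, class by class.

    Smaller alpha concentrates each class on fewer clients, which
    produces more heterogeneous (non-IID) client datasets.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)

    clients = [[] for _ in range(n_clients)]
    for y in sorted(by_class):
        idxs = by_class[y]
        rng.shuffle(idxs)
        props = sample_dirichlet(alpha, n_clients, rng)
        start, cumulative = 0, 0.0
        for k in range(n_clients):
            cumulative += props[k]
            # The last client takes the remainder so no sample is dropped.
            if k == n_clients - 1:
                end = len(idxs)
            else:
                end = max(start, min(len(idxs), round(cumulative * len(idxs))))
            clients[k].extend(idxs[start:end])
            start = end
    return clients


if __name__ == "__main__":
    toy_labels = [i % 10 for i in range(1000)]  # 10 balanced classes
    for a in (0.1, 1.0, 100.0):
        parts = dirichlet_partition(toy_labels, n_clients=5, alpha=a)
        print(a, [len(p) for p in parts])
```

With a small `alpha` such as 0.1 the per-client sizes and class mixes become highly uneven, while a large `alpha` approaches a uniform split, which matches the qualitative trend the caption describes.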