Exploring New Frontiers in Vertical Federated Learning: the Role of Saddle Point Reformulation

Aleksandr Beznosikov; Georgiy Kormakov; Alexander Grigorievskiy; Mikhail Rudakov; Ruslan Nazykov; Alexander Rogozin; Anton Vakhrushev; Andrey Savchenko; Martin Takáč; Alexander Gasnikov

TL;DR

The paper reframes Vertical Federated Learning (VFL) as a convex–concave saddle point problem via the classical Lagrangian, enabling the use of ExtraGradient methods. It develops a deterministic baseline and a rich family of stochastic, communication-efficient extensions (compression, quantization, asynchronous participation, coordinate updates, privacy-preserving noise and encryption) with convergence guarantees. It also explores multiple reformulations (augmented Lagrangian, dual-loss, and variable-augmented forms) and extends the approach to non-convex models, accompanied by numerical experiments that demonstrate competitive performance against traditional minimization-based methods and improvements in privacy and communication efficiency. The results provide a versatile toolbox for scalable, privacy-conscious VFL, applicable to a wide range of real-world settings and data-partitioning schemes.

Abstract

The objective of Vertical Federated Learning (VFL) is to collectively train a model using features that are distributed across different devices but describe the same users. This paper focuses on the saddle point reformulation of the VFL problem via the classical Lagrangian function. We first demonstrate how this formulation can be solved using deterministic methods. More importantly, we explore various stochastic modifications to adapt to practical scenarios, such as employing compression techniques for efficient information transmission, enabling partial participation for asynchronous communication, and utilizing coordinate selection for faster local computation. We show that the saddle point reformulation plays a key role and opens up possibilities to use the mentioned extensions, which seem to be impossible in the standard minimization formulation. Convergence estimates are provided for each algorithm, demonstrating their effectiveness in addressing the VFL problem. Additionally, alternative reformulations are investigated, and numerical experiments are conducted to validate the performance and effectiveness of the proposed approach.
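
For orientation, the reformulation described in the abstract can be sketched as follows. This is an illustrative sketch, not the paper's exact statement: it assumes a linear model whose predictions are $Ax = \sum_{i} A_i x_i$, where device $i$ holds the feature block $A_i$ and the weights $x_i$, with a convex loss $\ell$, a convex regularizer $r$, and labels $b$. The symbols $A$, $\ell$, and $r$ echo those appearing in the stepsize rule of the Key Result; the auxiliary variable $z$ and the dual variable $y$ are the names used there for the averaged iterates.

```latex
\begin{align*}
  &\min_{x} \; \ell(Ax, b) + r(x)
     && \text{(minimization form)} \\
  &\min_{x,\,z} \; \ell(z, b) + r(x)
     \quad \text{s.t.} \quad z = \sum_{i=1}^{n} A_i x_i
     && \text{(split predictions into } z\text{)} \\
  &\min_{x,\,z} \, \max_{y} \; \ell(z, b) + r(x)
     + y^{\top}\Big( \sum_{i=1}^{n} A_i x_i - z \Big)
     && \text{(classical Lagrangian)}
\end{align*}
```

Under these assumptions the Lagrangian is convex in $(x, z)$ and linear (hence concave) in $y$, and the coupling between devices enters only through the bilinear term $y^{\top} \sum_i A_i x_i$.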

Paper Structure

This paper contains 36 sections, 25 theorems, 262 equations, 9 figures, 1 table, 12 algorithms.

Introduction
Our contribution
Technical preliminaries
Saddle Point Reformulation and Extragradient
Family of Modifications
Proximal modification for computational friendly losses/regularizers and constrained setting
Modification with quantization for effective communications
Modification with biased compression for more effective communications
Partial participation for asynchronous client connection
Coordinate modification for low-cost local computing
Modification with additive noise for privacy protection
Homomorphic encryption modification for coded communication
Family of Reformulations
Reformulation with additional variables
Reformulation with augmentation
...and 21 more sections

Key Result

Theorem 2

Let Assumption \ref{as:convexity_smothness} hold, and let problem (\ref{eq:vfl_lin_spp_1}) be solved by Algorithm \ref{alg:EG}. Then for $\gamma = \tfrac{1}{2} \min \left\{ 1; \tfrac{1}{\sqrt{\lambda_{\max}(A^T A)}}; \tfrac{1}{L_r}; \tfrac{1}{L_{\ell}} \right\}$ it holds that [...], where $\bar{x}^K := \tfrac{1}{K}\sum_{k=0}^{K-1} x^{k+1/2}$, $\bar{z}^K := \tfrac{1}{K}\sum_{k=0}^{K-1} z^{k+1/2}$, $\bar{y}^K := \tfrac{1}{\ldots}$ [...]
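
The stepsize rule in the theorem can be illustrated with a minimal ExtraGradient sketch in Python. Everything specific here is an assumption made for illustration, not the paper's Algorithm: the Lagrangian is taken to be $\ell(z, b) + r(x) + y^{\top}(\sum_i A_i x_i - z)$ with a quadratic loss $\ell(z, b) = \tfrac12\|z - b\|^2$ (so $L_\ell = 1$) and a ridge regularizer $r(x) = \tfrac{\lambda}{2}\|x\|^2$ (so $L_r = \lambda$). The function name `extragradient_vfl` and its parameters are hypothetical.

```python
import numpy as np

def extragradient_vfl(A_blocks, b, lam=1.0, num_iters=2000):
    """ExtraGradient sketch (assumed setup, not the paper's exact algorithm) on
    L(x, z, y) = 0.5*||z - b||^2 + 0.5*lam*||x||^2 + y^T (sum_i A_i x_i - z).
    A_blocks[i] is the feature block held by party i; lam must be positive."""
    A = np.hstack(A_blocks)               # assembled only to compute the stepsize
    sigma_max = np.linalg.norm(A, 2)      # equals sqrt(lambda_max(A^T A))
    L_r, L_ell = lam, 1.0                 # smoothness constants of r and ell
    gamma = 0.5 * min(1.0, 1.0 / sigma_max, 1.0 / L_r, 1.0 / L_ell)

    xs = [np.zeros(Ai.shape[1]) for Ai in A_blocks]  # local weights x_i
    z = np.zeros(b.shape[0])                         # auxiliary predictions
    y = np.zeros(b.shape[0])                         # dual variable

    def grads(xs, z, y):
        # Each party contributes A_i x_i; only the sum is needed for grad_y.
        Ax = sum(Ai @ xi for Ai, xi in zip(A_blocks, xs))
        gx = [lam * xi + Ai.T @ y for Ai, xi in zip(A_blocks, xs)]
        gz = (z - b) - y
        gy = Ax - z
        return gx, gz, gy

    for _ in range(num_iters):
        # Extrapolation step: descent in (x, z), ascent in y.
        gx, gz, gy = grads(xs, z, y)
        xs_h = [xi - gamma * g for xi, g in zip(xs, gx)]
        z_h, y_h = z - gamma * gz, y + gamma * gy
        # Update step: start from the old point, use gradients at the half point.
        gx, gz, gy = grads(xs_h, z_h, y_h)
        xs = [xi - gamma * g for xi, g in zip(xs, gx)]
        z, y = z - gamma * gz, y + gamma * gy

    return np.concatenate(xs), z, y

# Toy usage: 40 samples, 8 features split column-wise across 4 parties.
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 8))
b = rng.standard_normal(40)
x, z, y = extragradient_vfl(np.array_split(A, 4, axis=1), b)
```

In this toy setting the saddle point in $x$ coincides with the ridge regression solution $(A^{\top}A + \lambda I)^{-1}A^{\top}b$, which gives a simple sanity check. Note that the sketch returns the last iterate, whereas the theorem is stated for the averaged half-step iterates $\bar{x}^K$, $\bar{z}^K$, $\bar{y}^K$.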

Figures (9)

Figure 1: Comparison of methods for solving the VFL problem in different formulations: minimization (GD, Nesterov) and saddle point (ADMM, ExtraGradient/Algorithm \ref{alg:EG}). The comparison is made on the LibSVM datasets mushrooms, a9a, w8a and on MNIST.
Figure 2: Comparison of methods for solving the VFL problem in different formulations: minimization (GD, Nesterov) and saddle point (ADMM, ExtraGradient/Algorithm \ref{alg:EG}). The comparison is made on the CIFAR-10 dataset.
Figure 3: Comparison of tuned methods for solving the VFL problem in different formulations: minimization (GD, Nesterov) and saddle point (ADMM, ExtraGradient/Algorithm \ref{alg:EG}). The comparison is made on the LibSVM datasets mushrooms, a9a, w8a and on MNIST.
Figure 4: Comparison of Algorithm \ref{alg:EG} for solving the VFL problem with different stepsize tunings: according to Theorem \ref{th:EG_basic_1} and Lemma \ref{lem:matrix}. The comparison is made on the LibSVM datasets mushrooms, a9a, w8a and on MNIST.
Figure 5: Comparison of Algorithm \ref{alg:EG_quantization} for solving the VFL problem (\ref{eq:vfl_lin_spp_1}). The comparison is made on the LibSVM datasets mushrooms, a9a, w8a and on MNIST. The compression operator is $Q = \text{RandK}\%$. The criterion for comparison is the number of full vectors transmitted. The top row shows the methods on the basic problem; the bottom row shows them on the problem with the $\beta$-trick (see the discussion after Corollary \ref{cor:EG_basic_1}).
...and 4 more figures
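
The caption of Figure 5 names a RandK compression operator without defining it in this summary. A minimal sketch of the commonly used unbiased random sparsification is given below: keep `k` coordinates chosen uniformly at random and rescale them by `d / k`. The function name `rand_k`, its signature, and the `d / k` scaling are assumptions based on the standard definition, not taken from the paper.

```python
import numpy as np

def rand_k(v, k, rng):
    """Random-k sparsification (assumed standard form): keep k uniformly
    chosen coordinates of v, scaled by d/k so the output is unbiased."""
    d = v.size
    idx = rng.choice(d, size=k, replace=False)  # coordinates to transmit
    out = np.zeros(d, dtype=float)
    out[idx] = v[idx] * (d / k)                 # rescale to keep E[Q(v)] = v
    return out

# Toy usage: send 3 of 10 coordinates (i.e. "RandK 30%").
rng = np.random.default_rng(0)
v = np.arange(1.0, 11.0)
q = rand_k(v, 3, rng)
```

Only the `k` selected values and their indices need to be transmitted, which is why a comparison by the number of full vectors sent, as in the caption, is a natural criterion for such operators.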

Theorems & Definitions (44)

Definition 2
Definition 3
Theorem 2
Corollary 2
Theorem 3
Definition 4
Theorem 4
Definition 5
Theorem 5
Theorem 6
...and 34 more
