Vertical Federated Learning Hybrid Local Pre-training

Wenguo Li; Xinling Guo; Xu Jiao; Tiancheng Huang; Xiaoran Yan; Yao Yang

TL;DR

VFLHLP addresses the few-overlap bottleneck in vertical federated learning by pre-training locally at every participating party (supervised training for the active party, self-supervised learning for the passive parties) and then applying the pre-trained networks to downstream VFL. During federated fine-tuning, knowledge transfer constrains the active party's sub-model, while Scarf-based SSL provides the initialization for the passive parties' encoders, improving performance on real-world tabular advertising datasets. Empirical results on Avazu and Criteo show large gains over vanilla VFL and several baselines, and ablation studies confirm the contribution of each component. The method offers an efficient, scalable way to leverage unaligned local data in multi-party VFL deployments with heterogeneous data ownership.

Abstract

Vertical Federated Learning (VFL), which has a broad range of real-world applications, has received much attention in both academia and industry. Enterprises aspire to exploit more valuable features of the same users from diverse departments to improve their models' predictive performance. VFL addresses this demand while keeping each party from exposing its raw data. However, conventional VFL encounters a bottleneck: it leverages only aligned samples, whose number shrinks as more parties are involved, resulting in data scarcity and the waste of unaligned data. To address this problem, we propose a novel VFL Hybrid Local Pre-training (VFLHLP) approach. VFLHLP first pre-trains local networks on the local data of the participating parties. It then uses these pre-trained networks to adjust the sub-model of the labeled party or to enhance representation learning for the other parties during downstream federated learning on aligned data, boosting the performance of the federated models. Experimental results on real-world advertising datasets demonstrate that our approach outperforms the baseline methods by large margins. The ablation study further illustrates the contribution of each technique in VFLHLP to its overall performance.
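The shrinking-overlap bottleneck described in the abstract can be illustrated with a small simulation. The sketch below is not from the paper: the population size, the per-party coverage rate, and the helper names `aligned_ids` and `simulate` are all illustrative assumptions. It only shows that the set of users shared by every party can never grow, and typically shrinks, as parties are added.

```python
import random


def aligned_ids(party_ids):
    """Return the user IDs that appear in every party's local ID set."""
    return set.intersection(*party_ids)


def simulate(num_parties, population=10_000, coverage=0.5, seed=0):
    """Count aligned users when each party independently holds a random
    fraction (`coverage`) of the population. All values are hypothetical."""
    rng = random.Random(seed)
    parties = [
        {user for user in range(population) if rng.random() < coverage}
        for _ in range(num_parties)
    ]
    return len(aligned_ids(parties))


if __name__ == "__main__":
    # With a fixed seed, earlier parties are identical across runs, so the
    # aligned count is non-increasing in the number of parties.
    for k in (2, 3, 4):
        print(f"{k} parties -> {simulate(k)} aligned users")
```

Under these assumptions, each additional party roughly halves the aligned set, while every party still holds about half of the population locally. That unaligned remainder is the data that local pre-training is meant to use.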

Paper Structure (18 sections, 6 equations, 5 figures, 4 tables, 1 algorithm)

Introduction
Related Work
Vertical Federated Learning
Self (Semi)-Supervised Learning in VFL
SSL for Tabular Datasets
Preliminaries
Methods
Problem Definition
Vertical Federated Learning Hybrid Local Pre-training
Supervised Pre-training
Self-supervised Pre-training
Downstream Federated Learning
Experiments and Results
Experiments on Avazu
Experiments on Criteo
...and 3 more sections

Figures (5)

Figure 1: Illustration of data partitioning in VFL scenarios. Each party holds different features, and Party 1 owns a fixed amount of labels. The aligned samples are bounded by red dashed lines; they make up only a small portion of each party's samples, and their number shrinks as more parties join training.
Figure 2: The centralized VFL setting, illustrated with three parties. Party 1 has access to the network and labels on the Server.
Figure 3: Overview of VFLHLP. VFLHLP contains 3 steps: $\textcircled{1}$ supervised pre-training using local samples of the active party to train local encoder $\Theta^1$ and local prediction head $\Theta^0$; $\textcircled{2}$ self-supervised pre-training using local samples of the passive parties to train individual local encoders $\Theta^k (k>1)$; $\textcircled{3}$ downstream VFL using aligned samples with constraints of $\Theta^1$ and $\Theta^0$ and initialization by $\Theta^k (k>1)$.
Figure 4: Illustration of knowledge transfer for the active party. Scatter points are data samples, with red points as aligned samples and black points as unaligned samples. The green solid line is the model trained on all local data, while the dashed blue line in the left panel represents the model trained on aligned data only. After transferring knowledge from the green solid line to the blue dashed line, we obtain a more accurate model, shown by the new dashed blue line in the right panel.
Figure 5: Test AUC ($\uparrow$) comparison of VFLHLP to baselines on the Criteo dataset with a varying aligned sample size.
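Step 2 in Figure 3 relies on self-supervised pre-training for the passive parties, which the TL;DR describes as Scarf-based. A minimal sketch of the view-generation idea behind Scarf-style corruption follows: a random subset of each row's features is replaced with values drawn from the same column in other rows, and an encoder is then trained contrastively to match the original and corrupted views. This is an illustration only. The function name `scarf_corrupt`, the per-cell Bernoulli masking (a simplification of choosing a fixed number of features per row), and the corruption rate are assumptions, not the paper's configuration, and the contrastive loss and encoder are omitted.

```python
import random


def scarf_corrupt(rows, corruption_rate=0.6, seed=None):
    """Return (corrupted_rows, mask) for a list of tabular rows.

    Each cell is replaced, with probability `corruption_rate`, by the value
    of the same column taken from a uniformly sampled row, i.e. a draw from
    that feature's empirical marginal distribution.
    """
    rng = random.Random(seed)
    n = len(rows)
    corrupted, mask = [], []
    for row in rows:
        new_row, row_mask = [], []
        for j, value in enumerate(row):
            replace = rng.random() < corruption_rate
            if replace:
                # Sample a donor row and copy its value for column j.
                value = rows[rng.randrange(n)][j]
            new_row.append(value)
            row_mask.append(replace)
        corrupted.append(new_row)
        mask.append(row_mask)
    return corrupted, mask


if __name__ == "__main__":
    # Hypothetical local table of a passive party (mixed feature types).
    table = [[1, "a", 0.1], [2, "b", 0.2], [3, "c", 0.3], [4, "d", 0.4]]
    view, mask = scarf_corrupt(table, corruption_rate=0.5, seed=0)
    for original, corrupted in zip(table, view):
        print(original, "->", corrupted)
```

Because replacements are drawn per column, the corrupted view stays within each feature's observed value range, which is why this style of augmentation suits tabular data with mixed categorical and numerical columns.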
