Additive-Effect Assisted Learning
Jiawei Zhang, Yuhong Yang, Jie Ding
TL;DR
Additive-Effect Assisted Learning (AE-AL) addresses decentralized learning in which partners hold undisclosed covariates and communication is costly. It uses a two-stage framework: a privacy-aware screening stage based on a Wald-type test that requires only sketchy data from the partner, followed by an iterative training stage in which the two parties exchange linear combinations of covariates to obtain an oracle-equivalent model, with convergence that is exponentially fast in the number of transmission rounds. The approach applies to generalized linear models under standard regularity conditions, with theoretical guarantees for the screening statistic and for the asymptotic normality of the AE-AL estimator. Simulations and real-data tasks demonstrate superior convergence speed and robustness. In practice, the method enables privacy-aware, communication-efficient collaboration across institutions or devices while retaining near-centralized predictive performance.
Abstract
It is quite popular nowadays for researchers and data analysts holding different datasets to seek assistance from each other to enhance their modeling performance. We consider a scenario where different learners hold datasets with potentially distinct variables, and their observations can be aligned by a nonprivate identifier. Their collaboration faces the following difficulties: first, learners may need to keep data values or even variable names undisclosed due to, e.g., commercial interest or privacy regulations; second, there are restrictions on the number of transmission rounds between them due to, e.g., communication costs. To address these challenges, we develop a two-stage assisted learning architecture for an agent, Alice, to seek assistance from another agent, Bob. In the first stage, we propose a privacy-aware hypothesis testing-based screening method for Alice to decide on the usefulness of the data from Bob, in a way that only requires Bob to transmit sketchy data. Once Alice recognizes Bob's usefulness, Alice and Bob move to the second stage, where they jointly apply a synergistic iterative model training procedure. With limited transmissions of summary statistics, we show that Alice can achieve the oracle performance as if the training were from centralized data, both theoretically and numerically.
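To make the second stage more concrete, the following is a minimal illustrative sketch, not the authors' algorithm. It assumes the simplest generalized linear model (a Gaussian linear model with identity link) and assumes that Alice shares her current residual with Bob; the actual AE-AL protocol may exchange different quantities. The function name assisted_least_squares and all variable names are hypothetical. Each round transmits only length-n vectors (linear combinations of covariates), never the covariates themselves, and the estimates approach the centralized ("oracle") least-squares fit at a geometric rate.

```python
import numpy as np


def assisted_least_squares(X_a, X_b, y, rounds=50):
    """Alternating least squares between two parties (illustrative only).

    Alice holds (X_a, y); Bob holds X_b. Only n-vectors cross the boundary:
    Bob's fitted linear combination and Alice's residual (an assumption).
    """
    beta_a = np.zeros(X_a.shape[1])
    beta_b = np.zeros(X_b.shape[1])
    fit_b = np.zeros(len(y))  # Bob's linear combination, initially zero
    for _ in range(rounds):
        # Alice refits her coefficients, treating Bob's fit as an offset.
        beta_a, *_ = np.linalg.lstsq(X_a, y - fit_b, rcond=None)
        residual_a = y - X_a @ beta_a  # vector sent to Bob (assumption)
        # Bob refits his coefficients on Alice's residual.
        beta_b, *_ = np.linalg.lstsq(X_b, residual_a, rcond=None)
        fit_b = X_b @ beta_b  # vector sent back to Alice
    return beta_a, beta_b


# Synthetic data: Bob's covariates are correlated with Alice's.
rng = np.random.default_rng(0)
n = 500
X_a = rng.normal(size=(n, 3))
X_b = 0.5 * X_a[:, :2] + rng.normal(size=(n, 2))
y = (X_a @ np.array([1.0, -2.0, 0.5])
     + X_b @ np.array([1.5, -1.0])
     + 0.1 * rng.normal(size=n))

beta_a, beta_b = assisted_least_squares(X_a, X_b, y)

# Oracle: least squares on the pooled (centralized) design matrix.
oracle, *_ = np.linalg.lstsq(np.hstack([X_a, X_b]), y, rcond=None)
gap = np.max(np.abs(np.concatenate([beta_a, beta_b]) - oracle))
print(f"max coefficient gap vs. oracle: {gap:.2e}")
```

In this toy setting the procedure is block coordinate descent on the joint least-squares objective, which is why the gap to the pooled fit shrinks geometrically with the number of rounds when the combined design has full column rank. Extending the sketch to other generalized linear models would require replacing each least-squares step with a weighted fit that uses the other party's linear combination as an offset.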
