TransFusion: Covariate-Shift Robust Transfer Learning for High-Dimensional Regression
Zelin He, Ying Sun, Jingyuan Liu, Runze Li
TL;DR
TransFusion addresses transfer learning for high-dimensional regression with a sparse target model and diverse source tasks, while remaining robust to covariate shift. It introduces a two-step procedure built on a fused regularizer (co-training followed by local debiasing) that leverages source data despite covariate shifts, and a distributed variant (D-TransFusion) that needs only one-shot communication. The authors establish nonasymptotic error bounds and conditions for minimax optimality, showing when information from source tasks improves the target estimation rate despite covariate shifts, with D-TransFusion achieving near-centralized performance when the source information is sufficient. Empirical results on simulations and MNIST-C demonstrate robustness to covariate shift, the benefit of task diversity, and substantial communication savings in the distributed setting.
Abstract
The main challenge that sets transfer learning apart from traditional supervised learning is distribution shift, which appears both as a shift between the source and target models and as a shift between the marginal covariate distributions. In this work, we tackle model shifts in the presence of covariate shifts in the high-dimensional regression setting. Specifically, we propose a two-step method with a novel fused regularizer that effectively leverages samples from source tasks to improve the learning performance on a target task with limited samples. A nonasymptotic bound is provided for the estimation error of the target model, showing the robustness of the proposed method to covariate shifts. We further establish conditions under which the estimator is minimax-optimal. Additionally, we extend the method to a distributed setting, allowing for a pretraining-finetuning strategy that requires just one round of communication while retaining the estimation rate of the centralized version. Numerical tests validate our theory, highlighting the method's robustness to covariate shifts.
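
The two-step idea described above (co-training with a fused regularizer, then local debiasing on the target data) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the fused regularizer is an l1 penalty on the target coefficients plus weighted l1 penalties on the source-minus-target contrasts, that the co-trained estimates are pooled by a sample-size-weighted average, and that a generic proximal-gradient (ISTA) solver is acceptable. The names `weighted_lasso_ista`, `transfusion_sketch`, `lam0`, `a`, and `lam_debias` are hypothetical and chosen only for this illustration.

```python
import numpy as np

def soft_threshold(v, t):
    """Elementwise soft-thresholding, the proximal operator of the l1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def weighted_lasso_ista(X, y, penalties, n_iter=500):
    """Minimize (1/(2n))||y - X b||^2 + sum_j penalties[j] * |b_j| by ISTA."""
    n = X.shape[0]
    lipschitz = np.linalg.norm(X, 2) ** 2 / n  # largest eigenvalue of X'X / n
    b = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ b - y) / n
        b = soft_threshold(b - grad / lipschitz, penalties / lipschitz)
    return b

def transfusion_sketch(Xs, ys, lam0, a, lam_debias, n_iter=500):
    """Xs[0], ys[0] hold the target data; Xs[1:], ys[1:] hold the source data."""
    K = len(Xs) - 1
    p = Xs[0].shape[1]
    sizes = np.array([X.shape[0] for X in Xs])
    N = sizes.sum()

    # Step 1 (co-training, assumed form): write beta_k = beta_0 + d_k, so the
    # fused penalty lam0 * (||beta_0||_1 + sum_k a_k ||d_k||_1) becomes a
    # weighted lasso over the stacked parameter (beta_0, d_1, ..., d_K).
    Z = np.zeros((N, p * (K + 1)))
    row = 0
    for k, X in enumerate(Xs):
        n_k = X.shape[0]
        Z[row:row + n_k, :p] = X                      # shared beta_0 block
        if k > 0:
            Z[row:row + n_k, k * p:(k + 1) * p] = X   # contrast block d_k
        row += n_k
    y = np.concatenate(ys)
    penalties = np.concatenate(
        [np.full(p, lam0)] + [np.full(p, lam0 * a[k - 1]) for k in range(1, K + 1)]
    )
    theta = weighted_lasso_ista(Z, y, penalties, n_iter)
    beta0 = theta[:p]
    betas = [beta0] + [beta0 + theta[k * p:(k + 1) * p] for k in range(1, K + 1)]

    # Pool the task estimates (assumed: sample-size-weighted average).
    w = sum((n_k / N) * b for n_k, b in zip(sizes, betas))

    # Step 2 (local debias): sparse correction fitted on target data only.
    resid = ys[0] - Xs[0] @ w
    delta = weighted_lasso_ista(Xs[0], resid, np.full(p, lam_debias), n_iter)
    return w + delta

# Synthetic usage: one small target task and two larger source tasks whose
# covariates have different scales (a crude stand-in for covariate shift).
rng = np.random.default_rng(0)
p, s = 50, 5
beta_true = np.zeros(p)
beta_true[:s] = 1.0
Xs, ys = [], []
for k, n_k in enumerate([30, 200, 200]):
    X = (1.0 + 0.5 * k) * rng.standard_normal((n_k, p))
    shift = np.zeros(p)
    if k > 0:
        shift[s + k] = 0.3  # small sparse model shift for source task k
    Xs.append(X)
    ys.append(X @ (beta_true + shift) + 0.1 * rng.standard_normal(n_k))

beta_hat = transfusion_sketch(Xs, ys, lam0=0.05, a=[1.0, 1.0], lam_debias=0.05)
print("l2 estimation error:", np.linalg.norm(beta_hat - beta_true))
```

The tuning values here are arbitrary. In practice the penalty levels and the weights `a` would be chosen by theory-guided rates or cross-validation, and the distributed variant would replace the stacked solve in step 1 with locally computed quantities.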
