Few-shot Multi-Task Learning of Linear Invariant Features with Meta Subspace Pursuit
Chaozhi Zhang, Lin Liu, Xiaoqun Zhang
TL;DR
The paper tackles data scarcity in multi-task linear regression by positing that the task coefficients share a low-rank, task-invariant subspace: $\bm{\Theta}^{*}=\mathbf{W}^{*}\mathbf{B}^{*}$, where $\mathbf{B}^{*}$ spans a subspace of dimension $s$. It introduces Meta Subspace Pursuit (Meta-SP), an iterative rank-$s$ subspace learning algorithm that alternates per-task gradient updates with hard thresholding of the concatenated coefficient matrix to recover $\bm{\Theta}^{*}$ and, via its right singular vectors, the invariant subspace $\mathbf{B}^{*}$. The authors give guarantees based on the restricted isometry property (RIP), with convergence rates showing that the estimator converges to a noise floor of $O\left(\sqrt{\frac{dT\sigma^2}{m}}\right)$ when each task has $m=\Omega(s\log s)$ samples. Experiments on simulated data and a real PM2.5 air-quality dataset show that Meta-SP is more accurate and computationally cheaper than several baselines when data are scarce, supporting the practical value of learning a shared invariant representation for few-shot multi-task learning.
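As a rough illustration of the alternating scheme summarized above, the following NumPy sketch takes one gradient step on each task's least-squares loss and then truncates the stacked coefficient matrix to rank $s$ with an SVD. It is not the authors' implementation. The function names (`rank_s_projection`, `meta_sp_sketch`), the array layout (`xs` of shape `(T, m, d)`, `ys` of shape `(T, m)`), the zero initialization, the fixed iteration count, the default step size, and the reading of "hard thresholding" as a truncated SVD are all assumptions made for illustration.

```python
import numpy as np


def rank_s_projection(theta, s):
    """Best rank-s approximation of theta (T x d) via a truncated SVD.

    Returns the truncated matrix and the top-s right singular vectors
    (an s x d array with orthonormal rows).
    """
    u, sv, vt = np.linalg.svd(theta, full_matrices=False)
    return (u[:, :s] * sv[:s]) @ vt[:s], vt[:s]


def meta_sp_sketch(xs, ys, s, n_iters=100, step=None):
    """Illustrative gradient-step plus rank-s truncation loop (assumed form).

    xs : (T, m, d) array, one design matrix per task
    ys : (T, m) array, one response vector per task
    s  : assumed dimension of the shared subspace
    """
    T, m, d = xs.shape
    if step is None:
        # Assumption: use 1/L, where L is the largest eigenvalue of
        # X_t^T X_t / m over all tasks, so the loss cannot increase.
        step = m / np.max(np.linalg.norm(xs, ord=2, axis=(1, 2)) ** 2)

    theta = np.zeros((T, d))  # row t holds the coefficients of task t
    for _ in range(n_iters):
        # Per-task residuals y_t - X_t theta_t, shape (T, m).
        resid = ys - np.einsum("tmd,td->tm", xs, theta)
        # Per-task gradient of (1 / 2m) * ||y_t - X_t theta_t||^2.
        grad = -np.einsum("tmd,tm->td", xs, resid) / m
        # Gradient step, then hard thresholding to rank s.
        theta, _ = rank_s_projection(theta - step * grad, s)

    # Right singular vectors of the final iterate estimate the subspace.
    _, b_hat = rank_s_projection(theta, s)
    return theta, b_hat
```

The rows of the returned `b_hat` form an orthonormal basis for the estimated shared subspace, so a new task with few samples could be fit by regressing its responses on the $s$ projected features `x @ b_hat.T` instead of all $d$ raw features.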
Abstract
Data scarcity poses a serious threat to modern machine learning and artificial intelligence, as their practical success typically relies on the availability of big datasets. One effective strategy to mitigate the issue of insufficient data is to first harness information from other data sources possessing certain similarities in the study design stage, and then employ the multi-task or meta learning framework in the analysis stage. In this paper, we focus on multi-task (or multi-source) linear models whose coefficients across tasks share an invariant low-rank component, a popular structural assumption considered in the recent multi-task or meta learning literature. Under this assumption, we propose a new algorithm, called Meta Subspace Pursuit (abbreviated as Meta-SP), that provably learns this invariant subspace shared by different tasks. Under this stylized setup for multi-task or meta learning, we establish both the algorithmic and statistical guarantees of the proposed method. Extensive numerical experiments are conducted, comparing Meta-SP against several competing methods, including popular, off-the-shelf model-agnostic meta learning algorithms such as ANIL. These experiments demonstrate that Meta-SP achieves superior performance over the competing methods in various aspects.
