Online Multi-Task Learning with Recursive Least Squares and Recursive Kernel Methods
Gabriel R. Lencione, Fernando J. Von Zuben
TL;DR
This work addresses online regression for multi-task learning (MTL) with two recursive methods: MT-WRLS, which operates in the primal space, and MT-OSLSSVR, which operates in the kernel (dual) space. Both follow a graph-based MTL formulation. MT-WRLS performs exact updates at a per-instance cost of $O(d^2 \times T^2)$, where $d$ is the input dimension and $T$ the number of tasks. MT-OSLSSVR performs approximate updates at a per-instance cost of $O(d \times m_n^2)$, where $m_n$ is the size of the dictionary of instances and the quality of the approximation is controlled by a sparsity parameter. The authors derive a stacked-space representation and a multi-task kernel that align both recursions with the batch objective, so the models adapt in real time while sharing information across tasks. On a wind-speed forecasting benchmark, MT-WRLS and MT-OSLSSVR outperform online single-task methods and other online MTL methods, with statistically significant gains; combining them with Extreme Learning Machines adds nonlinearity, but the benefit varies. The work contributes scalable, provably convergent recursive methods for online MTL and points to nonlinear multi-task kernels as a direction for future work.
Abstract
This paper introduces two novel approaches for online Multi-Task Learning (MTL) regression problems. We employ a high-performance graph-based MTL formulation and develop two alternative recursive versions of it, based on the Weighted Recursive Least Squares (WRLS) and the Online Sparse Least Squares Support Vector Regression (OSLSSVR) strategies. Using task-stacking transformations, we demonstrate the existence of a single matrix that encodes the relationships among the tasks and supplies structural information to the MT-WRLS method through its initialization procedure and to the MT-OSLSSVR method through its multi-task kernel function. In contrast to the existing literature, which is mostly based on Online Gradient Descent (OGD) or on inexact approaches with cubic cost, we achieve exact and approximate recursions whose per-instance cost is quadratic in the dimension of the input space (MT-WRLS) or in the size of the dictionary of instances (MT-OSLSSVR). We compare our online MTL methods with other contenders in a real-world wind speed forecasting case study, which shows a significant performance gain for both proposed approaches.
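To make the idea of a recursion in a task-stacked space concrete, the following is a minimal sketch of a standard recursive least squares update applied to stacked features. It is not the authors' MT-WRLS algorithm. The class name `StackedWRLS`, the helper `stack_features`, and the particular initial matrix (a ridge term plus a graph Laplacian expanded with a Kronecker product) are assumptions chosen for illustration; the abstract only states that a single task-relationship matrix enters through the initialization.

```python
import numpy as np


def stack_features(x, task, n_tasks):
    """Embed x into a (n_tasks * d)-vector, nonzero only in the block of `task`."""
    d = x.shape[0]
    z = np.zeros(n_tasks * d)
    z[task * d:(task + 1) * d] = x
    return z


class StackedWRLS:
    """Hypothetical illustration: plain RLS over task-stacked features."""

    def __init__(self, d, n_tasks, laplacian, reg=1.0, coupling=1.0, forgetting=1.0):
        self.n_tasks = n_tasks
        self.forgetting = forgetting
        # Assumed regularizer: reg * I + coupling * (L kron I_d).
        # The task graph enters only here, through the initial inverse matrix.
        A = reg * np.eye(n_tasks * d) + coupling * np.kron(laplacian, np.eye(d))
        self.P = np.linalg.inv(A)
        self.w = np.zeros(n_tasks * d)

    def predict(self, x, task):
        return float(self.w @ stack_features(x, task, self.n_tasks))

    def update(self, x, y, task):
        z = stack_features(x, task, self.n_tasks)
        Pz = self.P @ z                      # O((d * T)^2) per instance
        gain = Pz / (self.forgetting + z @ Pz)
        err = y - self.w @ z                 # a priori prediction error
        self.w = self.w + gain * err
        self.P = (self.P - np.outer(gain, Pz)) / self.forgetting
        return err


# Usage: three tasks connected in a chain graph, two input features each.
L = np.array([[1.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 1.0]])
model = StackedWRLS(d=2, n_tasks=3, laplacian=L)
model.update(np.array([0.5, -1.0]), y=1.2, task=0)
print(model.predict(np.array([0.5, -1.0]), task=1))
```

With the forgetting factor set to 1, this recursion reproduces the batch regularized least squares solution for the same initial matrix, which is the sense in which such an update is exact. The matrix-vector product and the outer product both act on vectors of length $d \times T$, which matches the quadratic per-instance cost described above.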
