Online Multi-Task Learning with Recursive Least Squares and Recursive Kernel Methods

Gabriel R. Lencione; Fernando J. Von Zuben

TL;DR

This work tackles online regression for multi-task learning (MTL) by proposing two recursive approaches: MT-WRLS, which operates in the primal space, and MT-OSLSSVR, which operates in the kernel (dual) space. Both methods follow a graph-based MTL formulation. MT-WRLS provides exact updates at a per-instance cost of $O(d^2 \times T^2)$, where $d$ is the input dimension and $T$ the number of tasks, while MT-OSLSSVR provides an approximation controlled by a sparsity parameter at a cost of $O(d \times m_n^2)$, where $m_n$ is the size of the dictionary of instances. The authors establish a stacked-space representation and a multi-task kernel that align the two approaches with the batch objective, enabling real-time adaptation and information sharing across tasks. Empirical evaluation on a wind-speed forecasting benchmark shows that MT-WRLS and MT-OSLSSVR outperform online single-task and other online MTL methods with statistically significant gains, while the combination with Extreme Learning Machines yields nonlinear benefits that vary across settings. The work advances online MTL by delivering scalable, provably convergent recursive methods and points to nonlinear multi-task kernels as a direction for future work.
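To make the idea of an exact recursive update in a stacked space concrete, the following is a minimal sketch and not the authors' implementation. It assumes a regularizer of the form $(\lambda_1 I + \lambda_2 L) \otimes I_d$, with $L$ a graph Laplacian over tasks; this form, along with the names `make_regularizer`, `stack`, and `StackedRLS`, is hypothetical and chosen only for illustration. The structural matrix enters through the initial inverse matrix `P`, and each update works on a $dT \times dT$ matrix, which is where a quadratic per-instance cost in $d \times T$ comes from.

```python
import numpy as np

def make_regularizer(adjacency, lam_ridge, lam_graph, d):
    """Assumed stacked regularizer: (lam_ridge*I + lam_graph*Laplacian) kron I_d."""
    laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
    task_matrix = lam_ridge * np.eye(adjacency.shape[0]) + lam_graph * laplacian
    return np.kron(task_matrix, np.eye(d))

def stack(x, task, num_tasks):
    """Place x in the block of the given task; all other blocks stay zero."""
    z = np.zeros(num_tasks * x.size)
    z[task * x.size:(task + 1) * x.size] = x
    return z

class StackedRLS:
    """Recursive least squares on stacked inputs (illustrative sketch)."""

    def __init__(self, regularizer, forgetting=1.0):
        self.P = np.linalg.inv(regularizer)  # task structure enters at initialization
        self.w = np.zeros(regularizer.shape[0])
        self.forgetting = forgetting

    def update(self, z, y):
        Pz = self.P @ z
        gain = Pz / (self.forgetting + z @ Pz)
        self.w = self.w + gain * (y - self.w @ z)  # correct by the prediction error
        self.P = (self.P - np.outer(gain, Pz)) / self.forgetting
        return self.w

# Synthetic stream: three tasks on a chain graph sharing one linear model.
rng = np.random.default_rng(0)
num_tasks, d, n = 3, 2, 60
adjacency = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]])
R = make_regularizer(adjacency, lam_ridge=0.1, lam_graph=1.0, d=d)
model = StackedRLS(R)

Z, Y = [], []
for _ in range(n):
    task = int(rng.integers(num_tasks))
    x = rng.normal(size=d)
    y = x @ np.array([1.0, -2.0]) + 0.01 * rng.normal()
    z = stack(x, task, num_tasks)
    model.update(z, y)
    Z.append(z)
    Y.append(y)

# Batch solution of the same regularized least-squares problem, for comparison.
Z, Y = np.array(Z), np.array(Y)
w_batch = np.linalg.solve(R + Z.T @ Z, Z.T @ Y)
```

With the forgetting factor set to 1, the recursive weights in this sketch coincide with the batch solution up to floating-point error, which is the sense in which such a recursion is exact.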

Abstract

This paper introduces two novel approaches for Online Multi-Task Learning (MTL) regression problems. We employ a high-performance graph-based MTL formulation and develop two alternative recursive versions based on the Weighted Recursive Least Squares (WRLS) and the Online Sparse Least Squares Support Vector Regression (OSLSSVR) strategies. Adopting task-stacking transformations, we demonstrate the existence of a single matrix that incorporates the relationship among multiple tasks and provides structural information, which the MT-WRLS method embodies in its initialization procedure and the MT-OSLSSVR method embodies in its multi-task kernel function. In contrast to the existing literature, which is mostly based on Online Gradient Descent (OGD) or on cubic inexact approaches, we achieve exact and approximate recursions with per-instance cost that is quadratic in the dimension of the input space (MT-WRLS) or in the size of the dictionary of instances (MT-OSLSSVR). We compare our online MTL methods to other contenders in a real-world wind speed forecasting case study, which shows a significant gain in performance for both proposed approaches.
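As an illustration of how a single task-relationship matrix can appear in a multi-task kernel, the sketch below builds a product kernel from the inverse of an assumed task matrix $\lambda_1 I + \lambda_2 L$ and a linear kernel on the inputs. It then checks that plain kernel ridge regression with this kernel gives the same prediction as the stacked primal solution. The task matrix and all function names are assumptions made for this example; there is no bias term, no dictionary, and no sparsification, so this is not MT-OSLSSVR itself.

```python
import numpy as np

def multitask_kernel(x1, t1, x2, t2, coupling):
    """Product kernel: task coupling entry times a linear kernel on the inputs."""
    return coupling[t1, t2] * float(x1 @ x2)

def gram(X, tasks, coupling):
    """Gram matrix of the product kernel over all training pairs."""
    return coupling[np.ix_(tasks, tasks)] * (X @ X.T)

# Assumed task matrix built from a chain graph over three tasks.
num_tasks, d, n = 3, 2, 40
adjacency = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]])
laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
task_matrix = 0.1 * np.eye(num_tasks) + 1.0 * laplacian
coupling = np.linalg.inv(task_matrix)

rng = np.random.default_rng(1)
X = rng.normal(size=(n, d))
tasks = rng.integers(num_tasks, size=n)
y = X @ np.array([1.0, -2.0]) + 0.01 * rng.normal(size=n)

# Dual solution: kernel ridge regression with unit regularization.
alpha = np.linalg.solve(gram(X, tasks, coupling) + np.eye(n), y)

def predict_dual(x, t):
    return sum(a * multitask_kernel(xi, ti, x, t, coupling)
               for a, xi, ti in zip(alpha, X, tasks))

# Primal solution in the stacked space with regularizer task_matrix kron I_d.
Z = np.zeros((n, num_tasks * d))
for i, (x, t) in enumerate(zip(X, tasks)):
    Z[i, t * d:(t + 1) * d] = x
R = np.kron(task_matrix, np.eye(d))
w = np.linalg.solve(R + Z.T @ Z, Z.T @ y)

x_new, t_new = rng.normal(size=d), 2
primal_pred = float(w[t_new * d:(t_new + 1) * d] @ x_new)
dual_pred = float(predict_dual(x_new, t_new))
```

In this linear setting the two predictions agree up to floating-point error, so the same task matrix can be used either as a primal regularizer or, through its inverse, as the task factor of a kernel.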

Paper Structure (17 sections, 38 equations, 4 figures, 4 tables, 2 algorithms)

This paper contains 17 sections, 38 equations, 4 figures, 4 tables, 2 algorithms.

Introduction
Related Work
Multi-Task Learning via Structural Regularization
Online Multi-Task Learning
Proposed Methods
Graph-Based MTL Adoption and Reformulation
Multi-Task Weighted Recursive Least Squares
Multi-Task Online Sparse Least Squares Support Vector Regression
Experimental Setup
Online MTL Contenders
Nonlinear Online MTL with Extreme Learning Machines
Online Regression Benchmark
Optimization of Hyperparameters
Procedures for Comparing our Proposals with Contenders
Results and Discussion
...and 2 more sections

Figures (4)

Figure 1: Differentiated wind speed series 1, 5 and 10, each one representing a prediction task $t$, for subsets (a) $C_{1}$ (b) $C_{13}$.
Figure 2: Differentiated wind speed series 1, 5 and 10, each one representing a prediction task $t$, for subsets (a) $C_{23}$ (b) $C_{29}$.
Figure 3: Actual and Predicted values for time series $1$ and subset $C_{13}$ of Experiment I.
Figure 4: Actual and Predicted values for time series $1$ and subset $C_{23}$ of Experiment I.
