A Multi-Task Learning Approach to Linear Multivariate Forecasting

Liran Nochumsohn, Hedi Zisling, Omri Azencot

TL;DR

This paper reframes multivariate time series forecasting (TSF) as a multi-task learning problem in order to exploit inter-variate relationships. An analysis of the gradients of linear TSF models shows that task directions align with variate directions and that gradient magnitudes scale with prediction error, which motivates a correlation-based variate grouping and an error-aware gradient scaling. The proposed MTLinear framework uses a multi-head linear model with one head per variate group, adds a simple, constant-time gradient penalty, and reports competitive results against state-of-the-art baselines across several benchmarks. The work highlights practical gains from grouping related variates and balancing their influence. It also notes limitations: cross-variate information is not fully leveraged and the framework relies on specific linear heads. Future directions include integration with more architectures and dynamic clustering.

Abstract

Accurate forecasting of multivariate time series data is important in many engineering and scientific applications. Recent state-of-the-art works ignore the inter-relations between variates, applying their models to each variate independently. This raises several research questions related to proper modeling of multivariate data. In this work, we propose to view multivariate forecasting as a multi-task learning problem, facilitating the analysis of forecasting by considering the angle between task gradients and their balance. To do so, we analyze linear models to characterize the behavior of tasks. Our analysis suggests that tasks can be defined by grouping similar variates together, which we achieve via a simple clustering that depends on correlation-based similarities. Moreover, to balance tasks, we scale gradients with respect to their prediction error. Then, each task is solved with a linear model within our MTLinear framework. We evaluate our approach on challenging benchmarks in comparison to strong baselines, and we show that it obtains on-par or better results on multivariate forecasting problems. The implementation is available at: https://github.com/azencot-group/MTLinear
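
The grouping step described in the abstract can be illustrated with a minimal sketch. The greedy rule, the `threshold` value, and the names `pearson` and `group_variates` are assumptions made for illustration only; this summary does not specify the paper's actual clustering procedure, which may differ. In a multi-head setup, each resulting group would then be assigned its own linear head.

```python
import math
import random

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def group_variates(series, threshold=0.8):
    """Greedy grouping (hypothetical rule, not the paper's algorithm):
    a variate joins the first group whose first member it correlates
    with at |r| >= threshold; otherwise it starts a new group."""
    groups = []
    for idx, s in enumerate(series):
        for g in groups:
            if abs(pearson(series[g[0]], s)) >= threshold:
                g.append(idx)
                break
        else:
            groups.append([idx])
    return groups

# Toy data: four variates observed over 200 time steps.
random.seed(0)
t = [i / 10 for i in range(200)]
base = [math.sin(v) for v in t]
series = [
    base,                                               # variate 0
    [2 * v + 0.01 * random.gauss(0, 1) for v in base],  # strongly correlated with 0
    [-v for v in base],                                 # anti-correlated with 0
    [random.gauss(0, 1) for _ in t],                    # unrelated noise
]

groups = group_variates(series)
print(groups)  # [[0, 1, 2], [3]]
```

Using the absolute correlation places the anti-correlated variate in the same group as variate 0, in line with the observation that pairs with a higher absolute correlation tend to conflict less during training.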

Paper Structure

This paper contains 38 sections, 12 equations, 11 figures, 11 tables.

Figures (11)

  • Figure 1: The total number of conflicts as a function of epochs. Colored lines represent variate pairs. Pairs with a higher absolute correlation (shown in legend) tend to have fewer conflicts during training.
  • Figure 2: Our pipeline consists of three steps: estimating variate correlations, clustering the variates, and assigning a linear module per group. The resulting framework, MTLinear, solves multivariate TSF effectively.
  • Figure 3: MSE results for different lookback lengths with a forecast horizon of $96$.
  • Figure 4: MSE measures for different clustering $\bar{\alpha}$ values. The red dashed line is the mean for iTransformer. The results suggest that MTLinear is comparable to or better than iTransformer.
  • Figure 5: We plot the error $e_{i,j}$ of a given loss vs. its gradient's magnitude. These results highlight the clear positive correlation between the two for both DLinear and NLinear.
  • ...and 6 more figures
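
The quantities in the captions of Figures 1 and 5 can be illustrated on a single-output linear model. This is a minimal sketch under stated assumptions: a "conflict" is taken to be a negative dot product between two task gradients (a common multi-task learning convention; the paper's exact definition is not given in this summary), and the toy weights, windows, and helper names are hypothetical. For the squared error $(w \cdot x - y)^2$, the gradient with respect to $w$ is $2 e x$ with $e = w \cdot x - y$, so its direction follows the input window and its norm is $2 |e| \, \|x\|$.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def grad_linear_mse(w, x, y):
    """Return (gradient of (w.x - y)^2 w.r.t. w, prediction error)."""
    err = dot(w, x) - y
    return [2 * err * xi for xi in x], err

def is_conflict(g1, g2):
    """Assumed definition: two gradients conflict if their dot product is negative."""
    return dot(g1, g2) < 0.0

# Shared weights and three toy tasks, each a (lookback window, target) pair.
w = [0.5, -0.2, 0.1]
x_a, y_a = [1.0, 2.0, 3.0], 1.0
x_b, y_b = [1.1, 2.1, 2.9], 1.2   # window and target similar to task a
x_c, y_c = [1.0, 2.0, 3.0], 0.0   # same window, error of opposite sign

g_a, e_a = grad_linear_mse(w, x_a, y_a)
g_b, e_b = grad_linear_mse(w, x_b, y_b)
g_c, e_c = grad_linear_mse(w, x_c, y_c)

print(is_conflict(g_a, g_b))  # False: similar tasks pull the weights the same way
print(is_conflict(g_a, g_c))  # True: the errors have opposite signs

# Gradient magnitude grows linearly with the absolute prediction error.
for g, e, x in [(g_a, e_a, x_a), (g_b, e_b, x_b), (g_c, e_c, x_c)]:
    assert abs(norm(g) - 2 * abs(e) * norm(x)) < 1e-9
```

Because the gradient norm is proportional to the absolute error, a task with a large error can dominate a shared update, which is the imbalance that an error-aware gradient scaling is meant to counter.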