Mixture of Low Rank Adaptation with Partial Parameter Sharing for Time Series Forecasting
Licheng Pan, Zhichao Chen, Haoxuan Li, Guangyi Liu, Zhijian Xu, Zhaoran Liu, Hao Wang, Ying Wei
TL;DR
This work identifies an expressiveness bottleneck in multi-task time-series forecasting (TSF), where a single shared representation limits step-specific predictions. It introduces a two-stage approach that mitigates the bottleneck: pre-train a one-step-ahead foundation model, then adapt it to multiple horizons using step-specific LoRA modules. Building on this, the Mixture-of-LoRA (MoLA) model employs segment-based adaptation and a mixture of LoRA experts to enable partial parameter sharing across horizons, improving both efficiency and accuracy. Experiments across diverse datasets and backbones show that MoLA consistently outperforms state-of-the-art TSF methods and generalizes to other models and fine-tuning techniques.
Abstract
Multi-task forecasting has become the standard approach for time-series forecasting (TSF). However, we show that it suffers from an Expressiveness Bottleneck, where predictions at different time steps share the same representation, leading to unavoidable errors even with optimal representations. To address this issue, we propose a two-stage framework: first, pre-train a foundation model for one-step-ahead prediction; then, adapt it using step-specific LoRA modules. This design enables the foundation model to handle any number of forecast steps while avoiding the expressiveness bottleneck. We further introduce the Mixture-of-LoRA (MoLA) model, which employs adaptively weighted LoRA experts to achieve partial parameter sharing across steps. This approach enhances both efficiency and forecasting performance by exploiting interdependencies between forecast steps. Experiments show that MoLA significantly improves model expressiveness and outperforms state-of-the-art time-series forecasting methods. Code is available at https://anonymous.4open.science/r/MoLA-BC92.
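To make the idea of adaptively weighted LoRA experts concrete, the following is a minimal, dependency-free sketch of one linear layer whose frozen base weight is augmented by a step-dependent mixture of low-rank updates. It is an illustration under stated assumptions, not the authors' implementation: the class name MoLALinear, the per-step gate_logits table, the softmax gating, and all shapes are hypothetical choices, since the abstract does not specify how the expert weights are computed.

```python
import math
import random


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def softmax(z):
    """Numerically stable softmax over a list of logits."""
    top = max(z)
    exps = [math.exp(x - top) for x in z]
    total = sum(exps)
    return [e / total for e in exps]


class MoLALinear:
    """Frozen base weight plus a step-weighted mixture of LoRA experts.

    Hypothetical layout for illustration only; the gating form is an assumption.
    """

    def __init__(self, d_in, d_out, rank, n_experts, n_steps, seed=0):
        rng = random.Random(seed)

        def rand(rows, cols):
            return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

        # Stands in for a weight pre-trained on one-step-ahead prediction (kept frozen).
        self.W = rand(d_out, d_in)
        # Each expert i holds a low-rank pair: A_i (rank x d_in) and B_i (d_out x rank).
        self.A = [rand(rank, d_in) for _ in range(n_experts)]
        # Standard LoRA initialization: B starts at zero, so the layer initially
        # reproduces the base model exactly.
        self.B = [[[0.0] * rank for _ in range(d_out)] for _ in range(n_experts)]
        # Assumed gating: one learnable logit per (forecast step, expert).
        self.gate_logits = [[0.0] * n_experts for _ in range(n_steps)]

    def forward(self, x, step):
        """Return W x + sum_i g_i(step) * B_i (A_i x)."""
        y = matvec(self.W, x)
        gates = softmax(self.gate_logits[step])
        for g, A, B in zip(gates, self.A, self.B):
            delta = matvec(B, matvec(A, x))
            y = [yi + g * di for yi, di in zip(y, delta)]
        return y
```

In this sketch, every forecast step reuses the same pool of experts and differs only in its gate weights, which is one way to realize partial parameter sharing: steps with similar gates share most of their adaptation, while a one-hot gate per step would recover fully step-specific LoRA modules. Calling MoLALinear(d_in=4, d_out=3, rank=2, n_experts=2, n_steps=3).forward(x, step=0) with a length-4 input returns the base prediction until the B matrices are trained away from zero.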
