MixLinear: Extreme Low Resource Multivariate Time Series Forecasting with 0.1K Parameters
Aitian Ma, Dongsheng Luo, Mo Sha
TL;DR
This paper introduces MixLinear, an ultra-lightweight, dual-domain framework for multivariate time series forecasting under extreme resource constraints. It models local trends with a segment-based, factorized linear pathway in the time domain and global patterns with an adaptive low-rank spectral filtering pathway in the frequency domain, achieving $O(n)$ time and $O(n)$ space complexity, in contrast to attention-based baselines whose cost grows quadratically. The model uses only about $0.1K$ parameters and achieves competitive or superior forecasting accuracy across four benchmark LTSF datasets, with substantial improvements in inference speed on both low- and high-dimensional data. This makes long-horizon forecasting practical on edge devices and suggests a general design principle: process each time-series pattern in its most natural domain (time for local, frequency for global) to maximize efficiency without sacrificing accuracy.
Abstract
Interest in Long-term Time Series Forecasting (LTSF) has grown recently. LTSF predicts values far into the future by analyzing a large amount of historical time-series data to identify patterns and trends. It poses significant challenges because of its complex temporal dependencies and high computational demands. Although Transformer-based models offer high forecasting accuracy, they are often too compute-intensive to deploy on devices with hardware constraints. Linear models, on the other hand, aim to reduce the computational overhead by employing either decomposition methods in the time domain or compact representations in the frequency domain. In this paper, we propose MixLinear, an ultra-lightweight multivariate time series forecasting model specifically designed for resource-constrained devices. MixLinear captures both time-domain and frequency-domain features by modeling intra-segment and inter-segment variations in the time domain and by extracting frequency variations from a low-dimensional latent space in the frequency domain. By reducing the parameter scale of a one-layer linear model with downsampled $n$-length input and output from $O(n^2)$ to $O(n)$, MixLinear achieves efficient computation without sacrificing accuracy. Extensive evaluations on four benchmark datasets show that MixLinear attains forecasting performance comparable to, or surpassing, state-of-the-art models with significantly fewer parameters ($0.1K$), which makes it well-suited for deployment on devices with limited computational capacity.
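As a rough illustration of how a segment-based factorization can reduce a one-layer linear map from $O(n^2)$ to $O(n)$ parameters, the NumPy sketch below replaces one dense $n \times n$ weight matrix with an intra-segment matrix and an inter-segment matrix. It is not the authors' implementation: the function names, the segment length of $\sqrt{n}$, the absence of bias terms, and the order of the two mixing steps are all assumptions made for illustration.

```python
import numpy as np


def dense_param_count(n: int) -> int:
    """Weights in a bias-free one-layer linear map from n inputs to n outputs."""
    return n * n


def factorized_param_count(n: int, seg_len: int) -> int:
    """Weights in the two small matrices of the (assumed) segment factorization."""
    num_segs = n // seg_len
    return seg_len * seg_len + num_segs * num_segs


def factorized_forward(x, w_intra, w_inter):
    """Apply intra-segment mixing, then inter-segment mixing, to a length-n series."""
    seg_len = w_intra.shape[0]
    num_segs = w_inter.shape[0]
    segs = x.reshape(num_segs, seg_len)  # one row per segment
    segs = segs @ w_intra                # mix the points inside each segment
    segs = w_inter @ segs                # mix corresponding points across segments
    return segs.reshape(-1)


# Hypothetical sizes: n = 16 downsampled points, segments of length sqrt(n) = 4.
n, seg_len = 16, 4
rng = np.random.default_rng(0)
x = rng.standard_normal(n)
w_intra = rng.standard_normal((seg_len, seg_len))
w_inter = rng.standard_normal((n // seg_len, n // seg_len))
y = factorized_forward(x, w_intra, w_inter)

print(dense_param_count(n))                # 256
print(factorized_param_count(n, seg_len))  # 32
print(y.shape)                             # (16,)
```

With a segment length of $\sqrt{n}$, each small matrix holds $n$ weights, so the total is $2n$ rather than $n^2$. The combined map is equivalent to a Kronecker-structured $n \times n$ matrix, so it is less expressive than a dense layer; that restriction is what makes the parameter saving possible.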
