MixLinear: Extreme Low Resource Multivariate Time Series Forecasting with 0.1K Parameters

Aitian Ma; Dongsheng Luo; Mo Sha

TL;DR

This paper introduces MixLinear, an ultra-lightweight, dual-domain framework for multivariate time series forecasting under extreme resource constraints. It models local trends with a segment-based, factorized linear pathway in the time domain and global patterns with an adaptive low-rank spectral filtering pathway in the frequency domain, giving $O(n)$ time and $O(n)$ space complexity, in contrast to the quadratic cost of attention-based baselines. The model uses only about $0.1K$ parameters and delivers competitive or superior forecasting accuracy across eight benchmark LTSF datasets, with substantially faster inference on both low- and high-dimensional data. This makes long-horizon forecasting practical on edge devices and suggests a general design principle: process each time-series pattern in its most natural domain (time for local, frequency for global) to maximize efficiency without sacrificing accuracy.

Abstract

Recently, there has been growing interest in Long-term Time Series Forecasting (LTSF), which predicts long-term future values by analyzing large amounts of historical time-series data to identify patterns and trends. LTSF poses significant challenges because of its complex temporal dependencies and high computational demands. Although Transformer-based models offer high forecasting accuracy, they are often too compute-intensive to deploy on devices with hardware constraints. Linear models, on the other hand, aim to reduce computational overhead by employing either decomposition methods in the time domain or compact representations in the frequency domain. In this paper, we propose MixLinear, an ultra-lightweight multivariate time series forecasting model specifically designed for resource-constrained devices. MixLinear captures both time-domain and frequency-domain features by modeling intra-segment and inter-segment variations in the time domain and extracting frequency variations from a low-dimensional latent space in the frequency domain. By reducing the parameter scale of a one-layer linear model with downsampled $n$-length input and output from $O(n^2)$ to $O(n)$, MixLinear achieves efficient computation without sacrificing accuracy. Extensive evaluations on four benchmark datasets show that MixLinear attains forecasting performance comparable to, or surpassing, state-of-the-art models with significantly fewer parameters ($0.1K$), which makes it well-suited for deployment on devices with limited computational capacity.
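To make the $O(n^2)$ to $O(n)$ reduction concrete, the sketch below compares the parameter count of a dense $n \to n$ linear map with one possible factorization. It assumes the $n$ downsampled points are reshaped into a $\sqrt{n} \times \sqrt{n}$ grid and mixed by two small square matrices, one across segments and one within a segment. That layout, and the names `dense_params`, `factorized_params`, and `factorized_apply`, are illustrative assumptions and not necessarily the exact construction used in the paper.

```python
import math
import numpy as np

def dense_params(n: int) -> int:
    """Weights in a single bias-free n -> n linear layer."""
    return n * n

def factorized_params(n: int) -> int:
    """Weights when n points form an s x s grid (s = sqrt(n)) mixed by
    two s x s matrices. Assumed layout, for illustration only."""
    s = math.isqrt(n)
    if s * s != n:
        raise ValueError("n must be a perfect square in this sketch")
    return 2 * s * s  # = 2n, i.e. O(n)

def factorized_apply(x, w_intra, w_inter):
    """Apply the two small maps to a length-n vector.
    Rows of the grid are segments; columns are positions within a segment."""
    s = w_intra.shape[0]
    grid = x.reshape(s, s)
    # w_inter mixes across segments (rows), w_intra mixes within a segment (columns).
    return (w_inter @ grid @ w_intra).reshape(-1)

if __name__ == "__main__":
    for n in (16, 64, 256):
        print(n, dense_params(n), factorized_params(n))
```

The factorized map is equivalent to a dense matrix with Kronecker structure, `np.kron(w_inter, w_intra.T)`, so it spans a restricted family of $n \times n$ maps while storing only $2n$ weights.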

Paper Structure (43 sections, 9 equations, 14 figures, 6 tables, 1 algorithm)

Introduction
MixLinear Design
Preliminary
Framework Overview
Segment-based Trend Extraction
Adaptive Low-Rank Spectral Filtering
Complexity Analysis
Experiment
Experiment Setup
Datasets.
Baselines.
Environment.
Main Results
Runtime Efficiency
Ablation Study
...and 28 more sections

Figures (14)

Figure 1: MixLinear Architecture Overview. Our dual-pathway framework efficiently processes time series data. The Segment-based pathway (top) downsamples input $X \in \mathbb{R}^L$ into segments $X_{seg} \in \mathbb{R}^{L/\pi}$, applies linear transformations for intra-segment (blue) and inter-segment (orange) correlations, then upsamples to $X_T \in \mathbb{R}^H$. The Frequency-domain pathway (bottom) transforms segments via FFT ($X_S \in \mathbb{C}^{L/\pi}$), compresses trends through adaptive low-rank filtering to latent space $Z_S \in \mathbb{C}^{n_z}$, reconstructs via iFFT, and outputs $X_F \in \mathbb{R}^H$. Final predictions $Y \in \mathbb{R}^H$ combine both outputs, achieving competitive forecasting with only 0.1K parameters.
Figure 2: Parameter count comparison across different look-back windows on the Electricity dataset. MixLinear is consistently more parameter-efficient than SparseTSF and FITS across all configurations.
Figure 3: MSE comparison among various LTSF models at forecast horizon 720.
Figure 4: Inference time of efficient LTSF models in low- and high-dimensional scenarios.
Figure 5: Impact of segment length on MACs and MSE of MixLinear at forecast horizon 720.
...and 9 more figures
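
The data flow in the Figure 1 caption can be sketched in NumPy as follows. This is a toy illustration under several assumptions that the caption does not specify: downsampling averages each window of length `period` (the caption's $\pi$), upsampling repeats each value, the forecast horizon equals the look-back length, the downsampled length is a perfect square, and the two pathway outputs are combined by addition. The function name `dual_path_forecast` and the weight names are hypothetical.

```python
import numpy as np

def dual_path_forecast(x, period, w_intra, w_inter, w_enc, w_dec):
    """Toy dual-pathway forecast for one channel; horizon == len(x) is assumed.

    w_intra, w_inter: real (s, s) matrices for the time pathway.
    w_enc: complex (n_z, n) map into the latent spectrum.
    w_dec: complex (n, n_z) map back to n frequency bins.
    """
    n = x.shape[0] // period          # downsampled length L / pi
    s = w_intra.shape[0]              # grid side; this sketch needs s * s == n
    if s * s != n:
        raise ValueError("downsampled length must equal s * s")

    # Downsample by averaging each period-length window (assumed choice).
    x_seg = x[: n * period].reshape(n, period).mean(axis=1)

    # Time pathway: mix within segments (columns) and across segments (rows).
    grid = x_seg.reshape(s, s)
    x_t = (w_inter @ grid @ w_intra).reshape(-1)

    # Frequency pathway: FFT, low-rank filter through n_z latent bins, iFFT.
    x_s = np.fft.fft(x_seg)           # complex spectrum of length n
    z_s = w_enc @ x_s                 # latent spectrum of length n_z
    x_f = np.fft.ifft(w_dec @ z_s).real

    # Combine both pathways and upsample back to the horizon.
    return np.repeat(x_t + x_f, period)
```

With `n_z` fixed and much smaller than `n`, the spectral filter stores `2 * n * n_z` complex weights, which grows linearly in `n`, in line with the $O(n)$ scaling stated in the abstract.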
