Inner-Instance Normalization for Time Series Forecasting
Zipo Jibao, Yingyi Fu, Xinyang Chen, Guoting Chen
TL;DR
The paper identifies inner-instance distribution shifts as an overlooked challenge in time-series forecasting and introduces two point-level normalization methods, Learning Distribution (LD) and Learning Conditional Distribution (LCD), to address shifts within a single sequence. LD uses a z-score pipeline plus a learnable matrix $A$ (and optionally $B$) to model per-time-step internal distributions, with denormalization via $P$ and $Q$ to produce final predictions. LCD directly learns the conditional distribution by predicting horizon means $\hat{\mu}_y$ and per-time-step scales $s_n$ from the centered lookback, yielding $\hat{y}_n = \tilde{y}_n s_n + \hat{\mu}_y$; it offers linear and attention-based variants. Across multiple backbones and diverse datasets, LD and LCD consistently improve forecasting accuracy and outperform state-of-the-art normalization methods, highlighting the practical value of fine-grained, time-point-level normalization for non-stationary time series.
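The LCD output step $\hat{y}_n = \tilde{y}_n s_n + \hat{\mu}_y$ can be sketched in a few lines of NumPy. This is only an illustration of the formula for a single univariate series, not the authors' implementation: the function name `lcd_denormalize` and the hand-picked values for the backbone output $\tilde{y}_n$, the scales $s_n$, and the mean $\hat{\mu}_y$ are assumptions. In the paper the scales and the mean are predicted by a linear or attention-based network from the centered lookback.

```python
import numpy as np


def lcd_denormalize(y_tilde, s, mu_hat_y):
    """Apply y_hat_n = y_tilde_n * s_n + mu_hat_y at each horizon step n.

    y_tilde  : backbone output, one value per horizon step
    s        : per-time-step scales, same shape as y_tilde
    mu_hat_y : predicted mean of the horizon (a scalar in this sketch)
    """
    y_tilde = np.asarray(y_tilde, dtype=float)
    s = np.asarray(s, dtype=float)
    if y_tilde.shape != s.shape:
        raise ValueError("y_tilde and s need one entry per horizon step")
    return y_tilde * s + mu_hat_y


# Hypothetical values for a horizon of three steps.
y_tilde = np.array([-1.0, 0.0, 2.0])   # stand-in for the backbone output
s = np.array([0.5, 1.0, 2.0])          # stand-in for the predicted scales s_n
mu_hat_y = 10.0                        # stand-in for the predicted horizon mean

y_hat = lcd_denormalize(y_tilde, s, mu_hat_y)
print(y_hat)  # values: 9.5, 10.0, 14.0
```

Because each step has its own scale $s_n$, two steps with the same normalized value can map to different final predictions, which is what makes the scheme point-level rather than instance-level.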
Abstract
Real-world time series are influenced by numerous factors and exhibit complex non-stationary characteristics. Non-stationarity can lead to distribution shifts, where the statistical properties of time series change over time, negatively impacting model performance. Several instance normalization techniques have been proposed to address distribution shifts in time series forecasting. However, existing methods fail to account for shifts within individual instances, leading to suboptimal performance. To tackle inner-instance distribution shifts, we propose two novel point-level methods: Learning Distribution (LD) and Learning Conditional Distribution (LCD). LD eliminates internal discrepancies by fitting the internal distribution of input and output with different parameters at different time steps, while LCD utilizes neural networks to predict scaling coefficients of the output. We evaluate the performance of the two methods with various backbone models across public benchmarks and demonstrate the effectiveness of the point-level paradigm through comparative experiments.
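To make the notion of an inner-instance shift concrete, the toy example below builds one synthetic window whose level jumps halfway through and applies ordinary instance-level z-score normalization to it. The data, the window length of 96, and the size of the jump are all assumptions chosen for illustration; none of this comes from the paper's experiments.

```python
import numpy as np

rng = np.random.default_rng(0)

# One synthetic instance: the mean jumps from 0 to 5 halfway through.
half = 48
first_half = rng.normal(loc=0.0, scale=1.0, size=half)
second_half = rng.normal(loc=5.0, scale=1.0, size=half)
window = np.concatenate([first_half, second_half])

# Instance-level z-score: a single mean and a single std for the whole window.
z = (window - window.mean()) / window.std()

mean_all = z.mean()            # approximately 0 by construction
mean_first = z[:half].mean()   # clearly negative
mean_second = z[half:].mean()  # clearly positive

print(mean_all, mean_first, mean_second)
```

The normalized window has zero mean overall, yet its two halves still sit at clearly different levels. A single pair of statistics per instance cannot remove that internal discrepancy, which is the gap that point-level methods such as LD and LCD are meant to address.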
