IN-Flow: Instance Normalization Flow for Non-stationary Time Series Forecasting

Wei Fan, Shun Zheng, Pengyang Wang, Rui Xie, Kun Yi, Qi Zhang, Jiang Bian, Yanjie Fu

TL;DR

Non-stationarity in real-world time series causes distribution shifts that degrade forecasting. The authors propose a model-agnostic, decoupled framework that treats shift removal as a distribution transformation separate from forecasting, and implement it with IN-Flow, a novel invertible network built from instance normalization and coupling layers. They formalize learning as a bi-level optimization that jointly trains the transformation and the forecasting model, enabling robust forecasting under shifting distributions. Extensive experiments on synthetic and six real-world datasets demonstrate consistent, substantial improvements over state-of-the-art non-stationary forecasting methods across multiple backbones, highlighting the practical impact of decoupled transformation learning for time series prediction.

Abstract

Due to the non-stationarity of time series, the distribution shift problem largely hinders the performance of time series forecasting. Existing solutions either rely on certain statistics to specify the shift, or develop specific mechanisms for certain network architectures. However, the former fail for unknown shifts beyond simple statistics, while the latter have limited compatibility with different forecasting models. To overcome these problems, we first propose a decoupled formulation for time series forecasting, with no reliance on fixed statistics and no restriction on forecasting architectures. This formulation regards the shift-removal procedure as a special transformation between a raw distribution and a desired target distribution, and separates it from the forecasting. The formulation is further formalized into a bi-level optimization problem, to enable the joint learning of the transformation (outer loop) and forecasting (inner loop). Moreover, the special requirements of expressiveness and bidirectionality for the transformation motivate us to propose instance normalization flow (IN-Flow), a novel invertible network for time series transformation. Unlike classic "normalizing flow" models, IN-Flow does not aim to normalize the input to a prior distribution (e.g., a Gaussian distribution) for generation; instead, it transforms the time series distribution by stacking normalization layers and flow-based invertible networks, and is therefore named "normalization" flow. Finally, we have conducted extensive experiments on both synthetic data and real-world data, which demonstrate the superiority of our method.
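
As a rough illustration of the shape such a bi-level problem can take, the block below writes the outer problem over the transformation parameters and the inner problem over the forecasting parameters. The symbols $g_\phi$ (transformation) and $f_\theta$ (forecasting model) follow the Figure 1 caption; the loss names $\mathcal{L}_{\text{outer}}$, $\mathcal{L}_{\text{inner}}$, the pointwise loss $\ell$, and the idea that the two levels may be evaluated on different data are assumptions made for this sketch, not details taken from the paper.

```latex
\begin{aligned}
\min_{\phi}\quad & \mathcal{L}_{\text{outer}}\big(\theta^{*}(\phi),\, \phi\big) \\
\text{s.t.}\quad & \theta^{*}(\phi) = \arg\min_{\theta}\ \mathcal{L}_{\text{inner}}(\theta,\, \phi), \\
\text{where}\quad & \mathcal{L}_{\cdot}(\theta, \phi) = \mathbb{E}_{(x,\,y)}\ \ell\Big(g_\phi^{-1}\big(f_\theta(g_\phi(x))\big),\ y\Big).
\end{aligned}
```

Read this way, the inner loop fits the forecaster for a fixed transformation, and the outer loop adjusts the transformation according to how well the resulting forecaster performs.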

Paper Structure (30 sections, 8 equations, 8 figures, 5 tables)

Figures (8)

  • Figure 1: Framework overview. The upper part shows the proposed decoupled formulation (Section \ref{sec:method_formu}), which separates time series forecasting into a transformation module ($g_\phi$) and a forecasting module ($f_\theta$): $g_\phi$ maps the input series from the raw data space into the transformed space, $f_\theta$ forecasts in that space, and the inverse of $g_\phi$ maps the predictions back to the raw data space. The lower part shows the bi-level optimization (Section \ref{sec:method_optim}) for this decoupled formulation.
  • Figure 2: The architectures of RealNVP \cite{dinh2016density} and our IN-Flow. We discuss the difference between the pre-norm and post-norm variants in Section \ref{sec:exp_ablation} and take the pre-norm version as IN-Flow.
  • Figure 3: Evaluation of forecasting on the synthetic data. NonST means non-stationary transformer.
  • Figure 4: Learning loss curves with and without IN-Flow.
  • Figure 5: Visualization of long-term (336-step) forecasting results for a test sample from the ETTm2 dataset.
  • ...and 3 more figures
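
The pipeline described in the Figure 1 caption (transform with $g_\phi$, forecast with $f_\theta$, recover with the inverse of $g_\phi$) can be sketched in a few lines of NumPy. This is a minimal toy under stated assumptions, not the authors' implementation: `AffineCoupling` is a hypothetical RealNVP-style layer with fixed random linear maps instead of trained networks, `persistence_forecaster` is a placeholder for $f_\theta$, the instance statistics are taken from the input window, and the forecast horizon is assumed equal to the lookback length so the same coupling layer can be inverted on the prediction.

```python
import numpy as np


def instance_norm(x, eps=1e-5):
    """Normalize each series (row) by its own mean/std; return stats for inversion."""
    mean = x.mean(axis=-1, keepdims=True)
    std = x.std(axis=-1, keepdims=True) + eps
    return (x - mean) / std, (mean, std)


def instance_denorm(z, stats):
    """Undo instance_norm using the stored per-series statistics."""
    mean, std = stats
    return z * std + mean


class AffineCoupling:
    """Toy RealNVP-style affine coupling over the time axis (hypothetical)."""

    def __init__(self, length, rng):
        self.split = length // 2
        rest = length - self.split
        # Fixed random linear maps stand in for the learned scale/shift networks.
        self.w_s = 0.1 * rng.standard_normal((self.split, rest))
        self.w_t = 0.1 * rng.standard_normal((self.split, rest))

    def _scale_shift(self, a):
        # tanh keeps the log-scale bounded so exp() stays well-behaved.
        return np.tanh(a @ self.w_s), a @ self.w_t

    def forward(self, x):
        a, b = x[..., :self.split], x[..., self.split:]
        log_s, t = self._scale_shift(a)
        return np.concatenate([a, b * np.exp(log_s) + t], axis=-1)

    def inverse(self, y):
        a, b = y[..., :self.split], y[..., self.split:]
        log_s, t = self._scale_shift(a)
        return np.concatenate([a, (b - t) * np.exp(-log_s)], axis=-1)


def g_forward(x, layer):
    """Pre-norm ordering: instance normalization first, then the coupling layer."""
    z, stats = instance_norm(x)
    return layer.forward(z), stats


def g_inverse(y, layer, stats):
    """Exact inverse of g_forward: invert the coupling, then de-normalize."""
    return instance_denorm(layer.inverse(y), stats)


def persistence_forecaster(z):
    """Placeholder for f_theta: repeat the last transformed value over the horizon."""
    return np.repeat(z[..., -1:], z.shape[-1], axis=-1)


rng = np.random.default_rng(0)
# Four random-walk series of length 16 with an offset, as toy non-stationary input.
x = np.cumsum(rng.standard_normal((4, 16)), axis=-1) + 50.0
layer = AffineCoupling(16, rng)

z, stats = g_forward(x, layer)                               # raw -> transformed space
x_back = g_inverse(z, layer, stats)                          # round trip recovers the input
y_hat = g_inverse(persistence_forecaster(z), layer, stats)   # forecast mapped back to raw space
```

The point of the coupling structure is that the inverse is available in closed form: the first half of the series passes through unchanged, so the scale and shift applied to the second half can be recomputed and undone exactly.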