IPatch: A Multi-Resolution Transformer Architecture for Robust Time-Series Forecasting

Aymane Harkati, Moncef Garouani, Olivier Teste, Julien Aligon, Mohamed Hamlich

Abstract

Accurate forecasting of multivariate time series remains challenging because models must capture both short-term fluctuations and long-range temporal dependencies. Transformer-based models have emerged as a powerful approach, but their performance depends critically on how the temporal data are represented. Traditional point-wise representations preserve individual time-step information and enable fine-grained modeling, yet they tend to be computationally expensive and less effective at modeling broader contextual dependencies, which limits their scalability to long sequences. Patch-wise representations aggregate consecutive steps into compact tokens to improve efficiency and model local temporal dynamics, but they often discard fine-grained temporal details that are critical for accurate predictions in volatile or complex time series. We propose IPatch, a multi-resolution Transformer architecture that integrates both point-wise and patch-wise tokens to model temporal information at multiple resolutions. Experiments on seven benchmark datasets demonstrate that IPatch consistently improves forecasting accuracy, robustness to noise, and generalization across prediction horizons compared to single-representation baselines.
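
To make the point-wise versus patch-wise distinction concrete, the following minimal sketch tokenizes a toy univariate series both ways and compares the number of token pairs that self-attention would have to score. The helper names (`pointwise_tokens`, `patchify`), the look-back length of 96, and the patch length and stride are illustrative assumptions, not settings taken from the paper.

```python
def pointwise_tokens(series):
    """One token per time step: L tokens for a series of length L."""
    return [[x] for x in series]


def patchify(series, patch_len, stride):
    """Group consecutive steps into patches (hypothetical helper).

    Yields floor((L - patch_len) / stride) + 1 patches and drops any
    trailing steps that do not fill a whole patch.
    """
    if patch_len <= 0 or stride <= 0:
        raise ValueError("patch_len and stride must be positive")
    last_start = len(series) - patch_len
    return [series[s:s + patch_len] for s in range(0, last_start + 1, stride)]


# Toy look-back window; L = 96 is an assumed value.
series = [float(t) for t in range(96)]

points = pointwise_tokens(series)
patches = patchify(series, patch_len=16, stride=8)  # assumed values

# Self-attention scores every pair of tokens, so its cost grows with
# the square of the token count.
pairs_pointwise = len(points) ** 2
pairs_patchwise = len(patches) ** 2
```

With these assumed values the series yields 96 point-wise tokens but only 11 patches, which is the efficiency gain the abstract attributes to patching. Each patch, however, is then treated as a single token, so the within-patch detail must be recovered by some other means.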

Paper Structure

This paper contains 21 sections, 18 equations, 4 figures, and 5 tables.

Figures (4)

  • Figure 1: (a) The proposed dual-stream architecture. The input sequence $X = (x_{i,j})_{1 \leq i \leq L,\; 1 \leq j \leq M}$ is first divided into $N$ patches and combined with positional encoding. It is then passed into a Transformer Encoder block for patch attention and an Autocorrelation block for inner-patch autocorrelation. The outputs of both blocks are concatenated and flattened, then passed through a Linear layer to generate the final prediction $\hat{X} = (\hat{x}_{i,j})_{L+1 \leq i \leq L+H,\; 1 \leq j \leq M}$. (b) Detailed view of the Transformer Encoder attention and the Autocorrelation block.
  • Figure 2: Visual comparison of 96-step forecasts on the Electricity dataset. IPatch (red) tracks the ground truth (black) with markedly higher fidelity than PatchTST, TimeMixer, Crossformer, and DLinear.
  • Figure 3: Predictive effectiveness vs. efficiency on the ILI dataset (forecast horizon 60). IPatch (red circle) offers the best performance-efficiency tradeoff among all evaluated architectures.
  • Figure 4: Qualitative comparison of forecasting models (IPatch, TimeMixer, Crossformer, DLinear, and PatchTST) on different datasets. IPatch shows consistently stronger temporal modeling across diverse time series characteristics.
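
The data flow described in the Figure 1 caption (patching, a patch-attention stream, an inner-patch autocorrelation stream, concatenation, flattening, and a linear head) can be sketched roughly as follows. This is a simplified illustration under several assumptions that are not confirmed by the text above: it handles a single channel, uses non-overlapping patches, omits positional encoding and learned projections, estimates autocorrelation directly in the time domain, and uses random, untrained weights for the linear head. All function names are hypothetical.

```python
import math
import random


def split_patches(x, patch_len):
    """Non-overlapping patches; assumes len(x) is a multiple of patch_len."""
    return [x[i:i + patch_len] for i in range(0, len(x), patch_len)]


def patch_attention(patches):
    """Single-head scaled dot-product attention across patches.

    Queries, keys, and values are the raw patches (no learned
    projections), a simplification made for illustration only.
    """
    d = len(patches[0])
    out = []
    for q in patches:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
                  for k in patches]
        top = max(scores)  # subtract the max for a stable softmax
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append([sum(w / total * v[j] for w, v in zip(weights, patches))
                    for j in range(d)])
    return out


def inner_patch_autocorr(patch):
    """Mean-removed autocorrelation at lags 0..P-1, normalised by lag 0."""
    p = len(patch)
    mean = sum(patch) / p
    c = [v - mean for v in patch]
    var = sum(v * v for v in c)
    if var == 0.0:
        return [0.0] * p  # constant patch: defined as all zeros here
    return [sum(c[t] * c[t + lag] for t in range(p - lag)) / var
            for lag in range(p)]


def forecast(x, patch_len, weights):
    """Run both streams, concatenate per patch, flatten, apply linear head."""
    patches = split_patches(x, patch_len)
    attn = patch_attention(patches)
    acf = [inner_patch_autocorr(p) for p in patches]
    features = [v for a, c in zip(attn, acf) for v in a + c]
    return [sum(w * f for w, f in zip(row, features)) for row in weights]


# Assumed toy sizes: look-back L, patch length P, horizon H.
random.seed(0)
L, P, H = 96, 16, 24
x = [math.sin(2 * math.pi * t / 24) for t in range(L)]
n_features = (L // P) * 2 * P  # two streams of P values per patch
W = [[random.gauss(0.0, 0.01) for _ in range(n_features)] for _ in range(H)]
y_hat = forecast(x, P, W)  # H values; meaningless until W is trained
```

The sketch is only meant to show how the two streams see the same patches at different resolutions: attention mixes information across patches, while the autocorrelation stream summarises structure inside each patch before both are merged for the final projection.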