Perceiver-based CDF Modeling for Time Series Forecasting

Cat P. Le, Chris Cannella, Ali Hasan, Yuting Ng, Vahid Tarokh

TL;DR

This work introduces perceiver-CDF, a scalable framework for multimodal time-series forecasting that handles missing data and irregular sampling. A perceiver-based encoder compresses high-dimensional inputs into a compact latent space, and a copula-based decoder models the conditional and joint distributions, giving sub-quadratic complexity. Midpoint inference for local attention reduces cost further, and an output-variance testing mechanism mitigates error propagation during prediction. Across unimodal and multimodal benchmarks, perceiver-CDF improves on state-of-the-art methods by about 20% while using less than half the computational resources.

Abstract

Transformers have demonstrated remarkable efficacy in forecasting time series data. However, their extensive dependence on self-attention mechanisms demands significant computational resources, thereby limiting their practical applicability across diverse tasks, especially in multimodal problems. In this work, we propose a new architecture, called perceiver-CDF, for modeling cumulative distribution functions (CDF) of time series data. Our approach combines the perceiver architecture with a copula-based attention mechanism tailored for multimodal time series prediction. By leveraging the perceiver, our model efficiently transforms high-dimensional and multimodal data into a compact latent space, thereby significantly reducing computational demands. Subsequently, we implement a copula-based attention mechanism to construct the joint distribution of missing data for prediction. Further, we propose an output variance testing mechanism to effectively mitigate error propagation during prediction. To enhance efficiency and reduce complexity, we introduce midpoint inference for the local attention mechanism. This enables the model to efficiently capture dependencies within nearby imputed samples without considering all previous samples. The experiments on the unimodal and multimodal benchmarks consistently demonstrate a 20% improvement over state-of-the-art methods while utilizing less than half of the computational resources.
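To make the efficiency argument concrete, the sketch below counts pairwise attention scores for one layer of full self-attention versus a perceiver-style layer, where a small set of latents cross-attends to the inputs and then self-attends. The function name and the token and latent counts are illustrative assumptions, not values from the paper, and the counts are not the paper's reported memory measurements.

```python
def attention_score_counts(n_tokens, n_latents):
    """Count pairwise attention scores for one layer (illustrative only).

    Full self-attention compares every token with every token.
    A perceiver-style layer compares each latent with every token
    (cross-attention), then each latent with every latent (self-attention).
    """
    full = n_tokens * n_tokens
    perceiver = n_tokens * n_latents + n_latents * n_latents
    return full, perceiver


# Hypothetical sizes: 1000 input tokens compressed into 64 latents.
full, perceiver = attention_score_counts(1000, 64)
print(full, perceiver)
```

With a fixed number of latents, the perceiver-style count grows linearly in the number of input tokens, whereas the full self-attention count grows quadratically.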

Paper Structure (19 sections, 6 equations, 7 figures, 4 tables)

Figures (7)

  • Figure 1: Overview of the perceiver-CDF architecture. The pre-processor includes input embedding and positional encoding layers to capture temporal dependencies in the input data. The encoder uses cross-attention to map the embedding to a lower-dimensional latent space. The decoder constructs the joint distribution of missing data using a copula-based structure.
  • Figure 2: (a) Visualization of the midpoint inference mechanism: blue-filled points are those earmarked for inference at a particular depth, black points are those already observed or inferred at that depth, and white points are unobserved. (b) Comparison between the global attention mechanism and the local attention mechanism, which uses a local window containing only the nearest tokens: green-filled points indicate the currently sampled variable, while red points indicate the variables that the sampled token attends to during sampling.
  • Figure 3: Memory consumption of the perceiver-CDF model (ours), the TACTiS model, the TACTiS model with a perceiver-based encoder (TACTiS-PE), and the TACTiS model with midpoint imputation (TACTiS-MI) on a synthetic dataset with varying prediction length (left) and varying conditioning length (right).
  • Figure 4: Samples predicted by perceiver-CDF (left) and TACTiS (right) for two-year forecasts, corresponding to $24$ time steps, conditioned on two years of historical data from the fred-md dataset.
  • Figure 5: Samples predicted by perceiver-CDF (left) and TACTiS (right) for two-day forecasts, corresponding to $48$ time steps, conditioned on two days of historical data from the traffic dataset.
  • ...and 2 more figures
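
The midpoint inference and local attention ideas described in the Figure 2 caption can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes integer time indices, assumes that each depth infers the midpoint of every gap between adjacent known positions, and assumes the local window is simply the `k` nearest known positions. The names `midpoint_schedule` and `local_window` are hypothetical.

```python
def midpoint_schedule(observed):
    """Group unobserved positions between observed ones by inference depth.

    Assumption: at each depth, the midpoint of every gap between adjacent
    known positions is inferred and then treated as known at the next depth.
    """
    known = sorted(set(observed))
    depths = []
    while True:
        # Midpoints of all gaps that still contain unobserved positions.
        mids = [(a + b) // 2 for a, b in zip(known, known[1:]) if b - a > 1]
        if not mids:
            return depths
        depths.append(mids)
        known = sorted(known + mids)


def local_window(target, known, k):
    """Return the k known positions nearest to target (ties favor earlier)."""
    nearest = sorted(known, key=lambda p: (abs(p - target), p))[:k]
    return sorted(nearest)


# Positions 0 and 8 are observed; positions 1..7 are filled in by depth.
print(midpoint_schedule([0, 8]))
# When sampling position 5, attend only to the 2 nearest known positions.
print(local_window(5, [0, 2, 4, 6, 8], 2))
```

Under these assumptions, each inferred point only needs its nearest known neighbours rather than every previously sampled point, which is the intuition behind pairing midpoint inference with a local attention window.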