ContiFormer: Continuous-Time Transformer for Irregular Time Series Modeling
Yuqi Chen, Kan Ren, Yansen Wang, Yuchen Fang, Weiwei Sun, Dongsheng Li
TL;DR
ContiFormer is a continuous-time Transformer for irregular time series that integrates Neural ODE dynamics into the attention mechanism. It defines a latent trajectory for each observation and a continuous-time attention mechanism over those trajectories, so it captures evolving input-output relationships while still allowing parallel computation. The authors characterize its expressive power and show that, with suitable choices of its functions, the vanilla Transformer and many Transformer variants designed for irregular time series are special cases of ContiFormer. Experiments on interpolation, classification, event sequence prediction, and regular forecasting show state-of-the-art or competitive performance, at the price of higher computational cost from continuous-time processing. The work enables more flexible modeling of continuous-time dynamics in irregular time series, which is relevant to any domain with asynchronous data.
Abstract
Modeling continuous-time dynamics on irregular time series is critical to account for data evolution and correlations that occur continuously. Traditional methods, including recurrent neural networks and Transformer models, leverage inductive bias via powerful neural architectures to capture complex patterns. However, because of their discrete nature, they generalize poorly to continuous-time data paradigms. Though neural ordinary differential equations (Neural ODEs) and their variants have shown promising results in dealing with irregular time series, they often fail to capture the intricate correlations within these sequences. It is challenging yet necessary to concurrently model the relationship between input data points and capture the dynamic changes of the continuous-time system. To tackle this problem, we propose ContiFormer, which extends the relation modeling of the vanilla Transformer to the continuous-time domain and explicitly combines the continuous-dynamics modeling of Neural ODEs with the attention mechanism of Transformers. We mathematically characterize the expressive power of ContiFormer and show that, with carefully designed function hypotheses, many Transformer variants specialized in irregular time series modeling can be covered as special cases of ContiFormer. A wide range of experiments on both synthetic and real-world datasets demonstrates the superior modeling capacity and prediction performance of ContiFormer on irregular time series data. The project link is https://seqml.github.io/contiformer/.
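The idea of combining ODE-driven latent trajectories with attention can be illustrated with a small toy sketch. It is not the authors' implementation, and several pieces are assumptions made only for illustration: each key follows the hypothetical linear ODE dk/dt = -decay * k starting at its observation time, the query is held constant, values are left static, and the attention score is taken to be the time average of the query-key inner product over the interval between observation and query time. All function names below are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def trajectory(x0, t0, t1, steps, decay):
    """Explicit Euler integration of the toy ODE dx/dt = -decay * x on [t0, t1].

    The linear dynamics are an assumption; a learned ODE function would
    normally take their place.
    """
    dt = (t1 - t0) / steps
    states = [list(x0)]
    for _ in range(steps):
        prev = states[-1]
        states.append([x - dt * decay * x for x in prev])
    return states  # steps + 1 states on a uniform grid

def continuous_score(query, key0, t_obs, t_query, steps=50, decay=0.5):
    """Time-averaged inner product of a constant query and an evolving key."""
    if t_query == t_obs:
        return dot(query, key0)  # zero-length interval: plain dot product
    keys = trajectory(key0, t_obs, t_query, steps, decay)
    vals = [dot(query, k) for k in keys]
    # Trapezoidal rule on a uniform grid; dividing the integral by the
    # interval length reduces to the mean of the panel averages.
    return sum((vals[i] + vals[i + 1]) / 2 for i in range(steps)) / steps

def continuous_attention(query, t_query, keys, values, times, steps=50, decay=0.5):
    """Softmax attention over observations made at irregular times."""
    scale = math.sqrt(len(query))
    scores = [continuous_score(query, k, t, t_query, steps, decay) / scale
              for k, t in zip(keys, times)]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(values[0])
    output = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return weights, output

# Three observations at irregular times, queried at t = 2.0.
times = [0.0, 0.7, 1.5]
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[1.0], [2.0], [3.0]]
weights, output = continuous_attention([1.0, 0.5], 2.0, keys, values, times)
```

With `decay=0.0` the keys stay constant and the score reduces to the ordinary dot product, which gives a rough intuition for how discrete attention can appear as a special case of a continuous-time formulation.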
