Context Matters: Leveraging Contextual Features for Time Series Forecasting
Sameep Chattopadhyay, Pulkit Paliwal, Sai Shankar Narasimhan, Shubhankar Agarwal, Sandeep P. Chinchali
TL;DR
ContextFormer is a plug-and-play framework that fuses multimodal contextual metadata into pre-trained time-series forecasters via cross-attention. It couples historical data with heterogeneous metadata (categorical, continuous, time-varying, and textual) and uses a fine-tuning scheme that freezes the base model, which guarantees performance at least as good as the original forecaster while achieving substantial accuracy gains across diverse domains. The approach is supported by an information-theoretic motivation and by empirical results showing improvements of up to 30% in MSE/MAE over strong baselines, on synthetic data and on real-world datasets from the energy, traffic, and finance domains. This work offers a scalable path to context-aware forecasting without retraining foundation models from scratch, with potential extensions to additional modalities and to metadata forecasting.
Abstract
Time series forecasts are often influenced by exogenous contextual features in addition to their corresponding history. For example, in financial settings, it is hard to accurately predict a stock price without considering public sentiment and policy decisions as expressed in news articles, tweets, and similar sources. Though this is common knowledge, current state-of-the-art (SOTA) forecasting models fail to incorporate such contextual information, owing to its heterogeneous and multimodal nature. To address this, we introduce ContextFormer, a novel plug-and-play method to surgically integrate multimodal contextual information into existing pre-trained forecasting models. ContextFormer effectively distills forecast-specific information from rich multimodal contexts, including categorical, continuous, time-varying, and even textual information, to significantly enhance the performance of existing base forecasters. ContextFormer outperforms SOTA forecasting models by up to 30% on a range of real-world datasets spanning energy, traffic, environmental, and financial domains.
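The plug-and-play idea described above (a frozen base forecaster whose hidden representation attends over context embeddings) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the class names `FrozenBaseForecaster` and `ContextCrossAttention`, the linear base model, the tensor shapes, and in particular the zero-initialized output projection are all assumptions made for illustration. The zero initialization is one common way to make an adapter start out identical to the base model; the summary does not say how ContextFormer obtains its "at least as good" property.

```python
import numpy as np

rng = np.random.default_rng(0)


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


class FrozenBaseForecaster:
    """Hypothetical stand-in for a pre-trained forecaster (weights are never updated)."""

    def __init__(self, lookback, horizon, d_model):
        self.W_embed = rng.normal(size=(lookback, d_model)) / np.sqrt(lookback)
        self.W_head = rng.normal(size=(d_model, horizon)) / np.sqrt(d_model)

    def embed(self, history):
        # (batch, lookback) -> (batch, 1, d_model)
        return (history @ self.W_embed)[:, None, :]

    def head(self, hidden):
        # (batch, 1, d_model) -> (batch, horizon)
        return hidden[:, 0, :] @ self.W_head


class ContextCrossAttention:
    """Hypothetical adapter: queries from the history, keys/values from the context."""

    def __init__(self, d_model, d_ctx):
        self.W_q = rng.normal(size=(d_model, d_model)) / np.sqrt(d_model)
        self.W_k = rng.normal(size=(d_ctx, d_model)) / np.sqrt(d_ctx)
        self.W_v = rng.normal(size=(d_ctx, d_model)) / np.sqrt(d_ctx)
        # Assumption: zero-initialized output projection, so the adapter
        # contributes nothing until it has been fine-tuned.
        self.W_o = np.zeros((d_model, d_model))

    def __call__(self, hidden, context):
        q = hidden @ self.W_q                      # (batch, 1, d_model)
        k = context @ self.W_k                     # (batch, n_ctx, d_model)
        v = context @ self.W_v                     # (batch, n_ctx, d_model)
        scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
        attn = softmax(scores, axis=-1)            # (batch, 1, n_ctx)
        # Residual connection: base representation plus context update.
        return hidden + (attn @ v) @ self.W_o


def forecast(base, adapter, history, context):
    """Run the base model, optionally routing its hidden state through the adapter."""
    hidden = base.embed(history)
    if adapter is not None:
        hidden = adapter(hidden, context)
    return base.head(hidden)


# Illustrative shapes: 4 series, 96-step lookback, 24-step horizon,
# 5 context tokens (e.g. embedded metadata fields) of width 16.
base = FrozenBaseForecaster(lookback=96, horizon=24, d_model=32)
adapter = ContextCrossAttention(d_model=32, d_ctx=16)
history = rng.normal(size=(4, 96))
context = rng.normal(size=(4, 5, 16))

base_pred = forecast(base, None, history, context)
ctx_pred = forecast(base, adapter, history, context)
print(np.allclose(base_pred, ctx_pred))
```

Under these assumptions the two forecasts coincide before any fine-tuning, because the adapter's residual branch is exactly zero. Fine-tuning would then update only the adapter's weights (`W_q`, `W_k`, `W_v`, `W_o`) while the base forecaster stays frozen.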
