Vision-Enhanced Time Series Forecasting via Latent Diffusion Models

Weilin Ruan; Siru Zhong; Haomin Wen; Yuxuan Liang

TL;DR

This work tackles uncertainty-aware long-horizon time series forecasting by reframing forecasting as image reconstruction in a latent diffusion framework. It introduces LDM4TS, which transforms time series into multi-view visual representations (segmentation, SEG; Gramian Angular Field, GAF; recurrence plot, RP), encodes them via a frozen latent diffusion model guided by cross-modal conditioning (frequency and text signals), and fuses global and local temporal cues through a temporal projection module. The authors report state-of-the-art results across diverse datasets in long-term, few-shot, and zero-shot settings, with lower MSE than competitive baselines. By combining vision encoders with probabilistic diffusion in latent space, LDM4TS supports uncertainty quantification and scalable forecasting, and offers a new route to cross-modal temporal modeling in real-world applications.

Abstract

Diffusion models have recently emerged as powerful frameworks for generating high-quality images. While recent studies have explored their application to time series forecasting, these approaches face significant challenges in cross-modal modeling and in transforming visual information effectively enough to capture temporal patterns. In this paper, we propose LDM4TS, a novel framework that leverages the powerful image reconstruction capabilities of latent diffusion models for vision-enhanced time series forecasting. Instead of introducing external visual data, we are the first to use complementary transformation techniques to convert time series into multi-view visual representations, allowing the model to exploit the rich feature extraction capabilities of the pre-trained vision encoder. Subsequently, these representations are reconstructed using a latent diffusion model with a cross-modal conditioning mechanism as well as a fusion module. Experimental results demonstrate that LDM4TS outperforms various specialized models on time series forecasting tasks.
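
To make the idea of multi-view visual representations concrete, the following is a minimal sketch of the three transformations named in the TL;DR (SEG, GAF, RP), written from their common textbook definitions. It is not the authors' implementation: the function names, the `period` and `eps` parameters, the choice of the summation form of the Gramian Angular Field, and the thresholded form of the recurrence plot are all assumptions made for illustration.

```python
import math

def rescale(x):
    """Min-max rescale to [-1, 1]; a constant series maps to zeros."""
    lo, hi = min(x), max(x)
    if hi == lo:
        return [0.0 for _ in x]
    return [2 * (v - lo) / (hi - lo) - 1 for v in x]

def gramian_angular_field(x):
    """Summation GAF (assumed variant): G[i][j] = cos(phi_i + phi_j)."""
    # Clamp before acos to guard against floating-point overshoot.
    phi = [math.acos(max(-1.0, min(1.0, v))) for v in rescale(x)]
    return [[math.cos(a + b) for b in phi] for a in phi]

def recurrence_plot(x, eps=0.1):
    """Binary recurrence plot: 1 where |x_i - x_j| <= eps (eps is hypothetical)."""
    return [[1.0 if abs(a - b) <= eps else 0.0 for b in x] for a in x]

def segment_image(x, period):
    """Segmentation view: stack consecutive windows of length `period` as rows.

    Trailing samples that do not fill a whole row are dropped.
    """
    rows = len(x) // period
    return [list(x[r * period:(r + 1) * period]) for r in range(rows)]

# Toy input: two cycles of a sine wave sampled at 48 points.
series = [math.sin(4 * math.pi * t / 47) for t in range(48)]

views = {
    "SEG": segment_image(series, period=12),
    "GAF": gramian_angular_field(series),
    "RP": recurrence_plot(series, eps=0.1),
}
for name, img in views.items():
    print(name, len(img), "x", len(img[0]))
```

Each view is a 2D array that could be resized and stacked as image channels before being passed to a vision encoder. How LDM4TS actually normalizes, resizes, and combines the views is not specified in this section.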
