Self-attention-based Diffusion Model for Time-series Imputation in Partial Blackout Scenarios

Mohammad Rafid Ul Islam, Prasad Tadepalli, Alan Fern

TL;DR

This paper tackles missing data in multivariate time series by introducing partial blackout, a general missing-data pattern in which a subset of features is unavailable over consecutive time steps. It presents SADI, a diffusion-based imputation model that jointly models feature dependencies with a Feature Dependency Encoder and temporal dependencies with Gated Temporal Attention, combined with a two-stage imputation process and a learned weighting scheme. SADI is trained to handle incomplete data via masking and employs the RM and MPB training strategies, achieving lower MSE and CRPS on several real-world datasets while using less GPU memory than competing diffusion-based methods. The approach yields robust, scalable imputations that capture complex inter-feature and temporal relationships, making it practical for high-dimensional, partially observed time series in real-world settings.

Abstract

Missing values in multivariate time series data can harm machine learning performance and introduce bias. These gaps arise from sensor malfunctions, blackouts, and human error, and are typically addressed by data imputation. Previous work has tackled the imputation of missing data in random-missing, complete-blackout, and forecasting scenarios. The current paper addresses a more general missing pattern, which we call "partial blackout," where a subset of features is missing for consecutive time steps. We introduce a two-stage imputation process using self-attention and diffusion processes to model feature and temporal correlations. Notably, our model handles missing data during training, which improves adaptability and supports reliable imputation even when the training data are incomplete. Our experiments on benchmark datasets and two real-world time series datasets demonstrate that our model outperforms the state of the art in partial blackout scenarios and scales better.
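To make the "partial blackout" pattern concrete, the following is a minimal sketch of how such a missingness mask could be constructed with NumPy. The function name `partial_blackout_mask` and its parameters are hypothetical and chosen for illustration only; they are not taken from the paper, and the paper's own masking strategies (RM, MPB) may be implemented differently.

```python
import numpy as np

def partial_blackout_mask(num_steps, num_features, blackout_features, start, length):
    """Build an observation mask of shape (num_steps, num_features).

    True marks an observed entry, False a missing one. The features in
    `blackout_features` are marked missing for `length` consecutive time
    steps beginning at `start` (clipped to the end of the series).
    All names here are illustrative assumptions, not the authors' code.
    """
    mask = np.ones((num_steps, num_features), dtype=bool)
    end = min(start + length, num_steps)
    # Black out only the chosen subset of features over the time window.
    mask[start:end, list(blackout_features)] = False
    return mask

# Example: 8 time steps, 5 features; features 1 and 3 go dark for steps 2..5.
mask = partial_blackout_mask(num_steps=8, num_features=5,
                             blackout_features=[1, 3], start=2, length=4)
print(mask.astype(int))
```

Under this sketch, a complete blackout corresponds to passing every feature index in `blackout_features`, and a forecasting scenario corresponds to a complete blackout whose window runs to the end of the series, which is one way to see why partial blackout is the more general pattern.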

Paper Structure

This paper contains 12 sections, 16 equations, 2 figures, 3 tables, 2 algorithms.

Figures (2)

  • Figure 1: Partial blackouts. Multiple features are missing in consecutive time steps as shown by the empty cells.
  • Figure 2: Architectures of SADI and CSDI models. CSDI uses consecutive transformers to capture feature and temporal dependencies, partitioning features into $K$ instances and time steps into $L$ instances. In contrast, SADI models feature and time dependencies jointly.