Glocal Information Bottleneck for Time Series Imputation

Jie Yang, Kexin Zhang, Guibin Zhang, Philip S. Yu, Kaize Ding

TL;DR

Time series imputation often fails to generalize under high missingness because models overfit local noise and distort the latent structure. The authors propose Glocal-IB, an IB-inspired framework that adds a Global Alignment loss to align the latent representations of masked inputs with those of their originally observed counterparts, while simultaneously minimizing $I(Z;X^{\text{o}})$ and maximizing $I(X;Z)$ through a trio of losses: $\mathcal{L}_{\text{Reg}}$, $\mathcal{L}_{\text{Loc}}$, and $\mathcal{L}_{\text{Glo}}$. The Global Alignment term uses a lightweight, contrastive-style objective with a density-ratio proxy and a single MLP, enabling model-agnostic integration with encoder–decoder backbones. Extensive experiments on nine real-world datasets across missingness levels (10%–90%) show that Glocal-IB delivers substantial imputation gains and more coherent latent spaces, with consistent improvements across backbones and robust performance under challenging missing patterns.
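
One way to read the trade-off described above is as a Lagrangian-style IB objective with a weighted sum of the three named losses as its training surrogate. The block below is only a sketch under stated assumptions: the trade-off weight $\beta$, the loss weights $\lambda_{1}$ and $\lambda_{2}$, and the additive form itself are not given in this summary, and the pairing of each loss with a role (regularization, local reconstruction, global alignment) is inferred from the loss names.

```latex
\begin{aligned}
\text{IB trade-off (assumed form):}\quad
  & \min_{\theta}\; I(Z; X^{\text{o}}) \;-\; \beta\, I(X; Z), \\
\text{training surrogate (assumed form):}\quad
  & \mathcal{L} \;=\; \mathcal{L}_{\text{Loc}}
      \;+\; \lambda_{1}\,\mathcal{L}_{\text{Reg}}
      \;+\; \lambda_{2}\,\mathcal{L}_{\text{Glo}}.
\end{aligned}
```

Under this reading, $\mathcal{L}_{\text{Reg}}$ would serve the compression term $I(Z;X^{\text{o}})$, while $\mathcal{L}_{\text{Loc}}$ and $\mathcal{L}_{\text{Glo}}$ would both serve the informativeness term $I(X;Z)$, at the point-wise and representation level respectively.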

Abstract

Time Series Imputation (TSI), which aims to recover missing values in temporal data, remains a fundamental challenge due to the complex and often high-rate missingness in real-world scenarios. Existing models typically optimize the point-wise reconstruction loss, focusing on recovering numerical values (local information). However, we observe that under high missing rates, these models still perform well in the training phase yet produce poor imputations and distorted latent representation distributions (global information) in the inference phase. This reveals a critical optimization dilemma: current objectives lack global guidance, leading models to overfit local noise and fail to capture global information of the data. To address this issue, we propose a new training paradigm, Glocal Information Bottleneck (Glocal-IB). Glocal-IB is model-agnostic and extends the standard IB framework by introducing a Global Alignment loss, derived from a tractable mutual information approximation. This loss aligns the latent representations of masked inputs with those of their originally observed counterparts. It helps the model retain global structure and local details while suppressing noise caused by missing values, giving rise to better generalization under high missingness. Extensive experiments on nine datasets confirm that Glocal-IB leads to consistently improved performance and aligned latent representations under missingness. Our code implementation is available at https://github.com/Muyiiiii/NeurIPS-25-Glocal-IB.
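
To make the idea of aligning masked and originally observed representations concrete, here is a minimal, dependency-free sketch of a contrastive (InfoNCE-style) alignment loss. It is not the authors' implementation: the function name `alignment_loss`, the `temperature` parameter, and the use of cosine similarity as the critic are all assumptions made for illustration (the summary mentions a single MLP, which is replaced here by a fixed similarity function to keep the example short).

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    """Cosine similarity, with a small epsilon to avoid division by zero."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)) + 1e-12)


def alignment_loss(z_masked, z_full, temperature=0.1):
    """InfoNCE-style loss (hypothetical stand-in for a global alignment term).

    z_masked[i] is the latent vector of sample i encoded from its masked input;
    z_full[i] is the latent vector of the same sample encoded from its
    originally observed input. Pair (i, i) is the positive; every (i, j != i)
    pair in the batch acts as a negative.
    """
    n = len(z_masked)
    total = 0.0
    for i in range(n):
        logits = [cosine(z_masked[i], z_full[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp over the batch.
        m = max(logits)
        log_norm = m + math.log(sum(math.exp(x - m) for x in logits))
        # Negative log-probability of picking the true counterpart.
        total += log_norm - logits[i]
    return total / n


# Toy latents: three samples with orthogonal "full" representations.
z_full = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
z_aligned = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.0], [0.0, 0.1, 0.9]]
z_shuffled = z_aligned[1:] + z_aligned[:1]  # each row paired with the wrong sample

print("aligned loss :", alignment_loss(z_aligned, z_full))
print("shuffled loss:", alignment_loss(z_shuffled, z_full))
```

The loss is small when each masked representation is closest to its own observed counterpart and large when the pairing is wrong, which is the behavior an alignment term of this kind is meant to reward.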

Paper Structure

This paper contains 28 sections, 22 equations, 15 figures, 2 tables.

Figures (15)

  • Figure 1: Illustration of the optimization dilemma in TSI. We visualize the latent space of two representative models, TimesNet (a-b) and GPVAE (c), trained under different missing rates and training epochs. Training and test losses are shown in green and orange boxes, respectively.
  • Figure 2: Framework comparison of three TSI training paradigms. The three paradigms differ in how they handle the latent representations and in how the key encoder and decoder are updated. (a): The encoder and decoder are updated end-to-end by back-propagation of the reconstruction loss. (b): The latent representations are aligned with those of a frozen time series foundation model applied to the original data. (c): Glocal-IB utilizes the encoder itself and a KL divergence to regularize the latent representations.
  • Figure 3: Imputation performance on the ETTh1 dataset of Transformer, TimesNet, and SAITS with four different training methods.
  • Figure 4: Latent space of SAITS and TimesNet with Glocal-IB on the ETTh1 dataset. Comparison with the original models is in Appendix \ref{app:FullComparisonResults}.
  • Figure 5: Different Missing Pattern Imputation and Efficiency results. (a, b): Comparison of imputation performance on the ETTh2 dataset with 50% Point and Block missing rates. Additional results for various missing rates are presented in Appendix \ref{app:MissingPattern}. (c, d): Efficiency comparison of four representative models on the ETTh1 dataset, evaluating the original models against their variants with Glocal-IB and foundation model alignment (Time-MoE). The radial axes are on a logarithmic scale.
  • ...and 10 more figures