Identifiable Autoregressive Variational Autoencoders for Nonlinear and Nonstationary Spatio-Temporal Blind Source Separation

Mika Sipilä, Klaus Nordhausen, Sara Taskinen

TL;DR

This work tackles nonlinear spatio-temporal blind source separation under nonstationarity by introducing iVAEar, an identifiable autoregressive variational autoencoder. The model treats the latent components as nonstationary autoregressive processes and uses previous observations as auxiliary information to achieve identifiability within an exponential-family latent framework. Theoretical results establish conditions for affine and block-affine identifiability, with stronger guarantees for Gaussian AR latents. Simulations and a real-data air-quality study show improved latent recovery and multivariate forecasting over state-of-the-art baselines. The approach benefits environmental and meteorological applications by providing interpretable latent structure and accurate predictions, though it is limited by its autoregressive and Gaussian assumptions; extending it to nonseparable models is a possible direction. Overall, iVAEar advances latent component estimation and prediction for complex, nonstationary spatio-temporal data.

Abstract

The modeling and prediction of multivariate spatio-temporal data involve numerous challenges. Dimension reduction methods can significantly simplify this process, provided that they account for the complex dependencies between variables and across time and space. Nonlinear blind source separation has emerged as a promising approach, particularly following recent advances in identifiability results. Building on these developments, we introduce the identifiable autoregressive variational autoencoder, which ensures the identifiability of latent components consisting of nonstationary autoregressive processes. The blind source separation efficacy of the proposed method is showcased through a simulation study, where it is compared against state-of-the-art methods, and the spatio-temporal prediction performance is evaluated against several competitors on air pollution and weather datasets.
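To make the setting concrete, the following is a minimal sketch of the kind of data the abstract describes: latent components that follow autoregressive processes with time-varying noise scale, observed only through a nonlinear invertible mixing. It is an illustration under stated assumptions, not the paper's simulation design. The function names, the AR(1) order, the sinusoidal variance schedule, the leaky-ReLU mixing, and the matrix `MIXING` are all hypothetical, and the spatial dimension is omitted for brevity.

```python
import math
import random


def simulate_ar1_latents(n_steps, ar_coefs, seed=0):
    """Simulate independent AR(1) latents with a time-varying noise scale.

    Assumed form: z_t[i] = a_i * z_{t-1}[i] + sigma_i(t) * eps, eps ~ N(0, 1).
    The changing sigma_i(t) is what makes each component nonstationary.
    """
    rng = random.Random(seed)
    z_prev = [0.0] * len(ar_coefs)
    path = []
    for t in range(n_steps):
        z_t = []
        for i, a in enumerate(ar_coefs):
            # Hypothetical variance schedule; always positive (>= 0.2).
            sigma = 1.0 + 0.8 * math.sin(2.0 * math.pi * t / n_steps + i)
            z_t.append(a * z_prev[i] + sigma * rng.gauss(0.0, 1.0))
        path.append(z_t)
        z_prev = z_t
    return path


def leaky_relu(v, slope=0.2):
    """Elementwise invertible nonlinearity."""
    return v if v >= 0.0 else slope * v


def mix(z, matrix):
    """One mixing layer: linear map followed by leaky ReLU."""
    return [leaky_relu(sum(w * zj for w, zj in zip(row, z))) for row in matrix]


# Hypothetical invertible 2x2 mixing matrix (determinant 1.25).
MIXING = [[1.0, 0.5], [-0.5, 1.0]]

latents = simulate_ar1_latents(200, ar_coefs=[0.3, 0.8], seed=1)
observations = [mix(z, MIXING) for z in latents]
```

Blind source separation then amounts to recovering `latents` from `observations` alone, up to the indeterminacies the identifiability results allow. In the described approach, the previous observations would play the role of auxiliary information for each time step.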

Paper Structure

This paper contains 19 sections, 11 theorems, 49 equations, 4 figures, 2 tables.

Key Result

Proposition

Assume that the set $(\boldsymbol{f}, \boldsymbol{T}, \boldsymbol{\lambda})$ is identifiable up to block-affine transformation and that the autoregressive order $R=0$. Further assume: Then we have that $\tilde{\boldsymbol{f}}^{-1}(\boldsymbol{x}) = \tilde{\boldsymbol{z}} = \boldsymbol{P} (g_1(z_1), \dots, g_P(z_P))^\top$, where $\boldsymbol{P}$ is a $P \times P$ permutation matrix and $g_1, \dots$

Figures (4)

  • Figure 1: Schematic representation of the iVAEar method for the $R=1$ case.
  • Figure 2: Mean correlation coefficients from 500 trials for Settings 1-6. The y-axis shows MCC (optimal value = 1), while the x-axis represents different methods. Box colors indicate the number of mixing layers in the mixing function.
  • Figure 3: Mean correlation coefficients from 500 trials for Setting 1 (top) and Setting 5 (bottom) with $R=1$ and $R=3$. The y-axis shows MCC (optimal value = 1), while the x-axis represents different methods. Box colors indicate the number of mixing layers in the mixing function.
  • Figure 4: ELBO for different latent dimensions.
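
The captions above report the mean correlation coefficient (MCC) without defining it. A minimal sketch of the convention common in the nonlinear blind source separation literature follows: compute the absolute Pearson correlation between every true and estimated component, then average over the best one-to-one matching. This assumes the paper follows that convention. The function names and toy data are hypothetical, and the brute-force search over permutations is only practical for a small number of components.

```python
import math
from itertools import permutations


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def mean_correlation_coefficient(true_comps, est_comps):
    """Average absolute correlation under the best component matching.

    Each argument is a list of components; each component is a list of
    samples. Absolute values make the score invariant to sign flips,
    and the search over permutations makes it invariant to ordering.
    """
    p = len(true_comps)
    corr = [[abs(pearson(t, e)) for e in est_comps] for t in true_comps]
    return max(
        sum(corr[i][perm[i]] for i in range(p)) / p
        for perm in permutations(range(p))
    )


# Toy data: the estimates are the true components swapped, rescaled,
# and (for one of them) sign-flipped, so the recovery is perfect.
true_comps = [[1.0, 2.0, 3.0, 4.0], [1.0, -1.0, 1.0, -1.0]]
est_comps = [[-2.0, 2.0, -2.0, 2.0], [2.0, 4.0, 6.0, 8.0]]
mcc = mean_correlation_coefficient(true_comps, est_comps)
```

For more than a handful of components, the permutation search would normally be replaced by an assignment solver such as `scipy.optimize.linear_sum_assignment` applied to the negated correlation matrix.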

Theorems & Definitions (15)

  • Definition
  • Proposition
  • Theorem
  • Theorem
  • Proposition
  • Proposition
  • Theorem
  • Definition: Autoregressive models
  • Definition: Autoregressive exponential family
  • Lemma
  • ...and 5 more