Identifiable Autoregressive Variational Autoencoders for Nonlinear and Nonstationary Spatio-Temporal Blind Source Separation
Mika Sipilä, Klaus Nordhausen, Sara Taskinen
TL;DR
This work addresses nonlinear blind source separation for nonstationary spatio-temporal data by introducing iVAEar, an identifiable autoregressive variational autoencoder. The model treats the latent components as nonstationary autoregressive processes and uses previous observations as auxiliary information to achieve identifiability within an exponential-family latent framework. Theoretical results establish conditions for affine and block-affine identifiability, with stronger guarantees for Gaussian AR latents. Simulations and a real-data air-quality study show improved latent recovery and multivariate forecasting relative to state-of-the-art baselines. The approach is relevant to environmental and meteorological applications, where it offers interpretable latent structure and accurate predictions. Its limitations stem from the autoregressive and Gaussian assumptions, and extensions to nonseparable models are a potential next step.
Abstract
The modeling and prediction of multivariate spatio-temporal data involve numerous challenges. Dimension reduction methods can significantly simplify this process, provided that they account for the complex dependencies between variables and across time and space. Nonlinear blind source separation has emerged as a promising approach, particularly following recent advances in identifiability results. Building on these developments, we introduce the identifiable autoregressive variational autoencoder, which ensures the identifiability of latent components consisting of nonstationary autoregressive processes. The blind source separation efficacy of the proposed method is showcased through a simulation study, where it is compared against state-of-the-art methods, and the spatio-temporal prediction performance is evaluated against several competitors on air pollution and weather datasets.
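To make the setting concrete, the following is a minimal sketch of the kind of data the abstract describes: latent components that follow nonstationary autoregressive processes and are observed only through a nonlinear mixing. It is not the authors' simulation design and does not implement iVAEar. The function names, the AR(1) order, the sinusoidal innovation scale, and the leaky-ReLU mixing are all illustrative assumptions.

```python
import numpy as np


def simulate_nonstationary_ar_sources(n_locations=50, n_times=200, n_sources=3, seed=0):
    """Simulate latent AR(1) sources whose innovation scale varies over space and time.

    Hypothetical setup: returns an array of shape (n_locations, n_times, n_sources).
    """
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, n_times)
    # Assumed nonstationarity: the innovation scale drifts smoothly in time,
    # with a random phase per location and source. Values lie in [0.5, 1.5].
    phase = rng.uniform(0.0, 2.0 * np.pi, size=(n_locations, 1, n_sources))
    scale = 1.0 + 0.5 * np.sin(2.0 * np.pi * t[None, :, None] + phase)
    # One AR(1) coefficient per source, kept inside (-1, 1).
    phi = rng.uniform(0.3, 0.8, size=n_sources)

    z = np.zeros((n_locations, n_times, n_sources))
    z[:, 0, :] = scale[:, 0, :] * rng.standard_normal((n_locations, n_sources))
    for k in range(1, n_times):
        eps = rng.standard_normal((n_locations, n_sources))
        z[:, k, :] = phi * z[:, k - 1, :] + scale[:, k, :] * eps
    return z


def mix_nonlinearly(z, n_layers=2, seed=1):
    """Apply an invertible nonlinear mixing: orthogonal maps with leaky ReLU in between."""
    rng = np.random.default_rng(seed)
    p = z.shape[-1]
    x = z
    for layer in range(n_layers):
        # The Q factor of a QR decomposition is orthogonal, hence invertible.
        q, _ = np.linalg.qr(rng.standard_normal((p, p)))
        x = x @ q
        if layer < n_layers - 1:
            # Leaky ReLU is strictly increasing, so the overall map stays invertible.
            x = np.where(x > 0, x, 0.2 * x)
    return x


if __name__ == "__main__":
    sources = simulate_nonstationary_ar_sources()
    observations = mix_nonlinearly(sources)
    print(sources.shape, observations.shape)
```

A blind source separation method sees only `observations` and aims to recover `sources` up to the indeterminacies permitted by its identifiability result.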
