Temporally Disentangled Representation Learning under Unknown Nonstationarity
Xiangchen Song, Weiran Yao, Yewen Fan, Xinshuai Dong, Guangyi Chen, Juan Carlos Niebles, Eric Xing, Kun Zhang
TL;DR
This work tackles unsupervised learning of latent causal representations from nonstationary time series with time-delayed influences when the domain shifts are unobserved. It develops identifiability theory by combining nonlinear ICA with Markov-switching nonstationarity, then introduces NCTRL, a framework with three components (an Autoregressive Hidden Markov Module, a Prior Network, and an Encoder-Decoder) that recovers time-delayed latent causal variables and their relations from observed data alone. The approach comes with theoretical identifiability guarantees under mild conditions and shows strong empirical performance on synthetic and real video datasets, surpassing baselines that cannot adequately exploit nonstationarity. In practice, this allows interpretable, causally meaningful latent factors to be learned from complex sequential data without labeled domain information, which may support more transparent downstream analysis in areas such as video understanding and behavior analysis.
Abstract
In unsupervised causal representation learning for sequential data with time-delayed latent causal influences, strong identifiability results for the disentanglement of causally related latent variables have been established in stationary settings by leveraging temporal structure. In nonstationary settings, however, existing work has only partially addressed the problem, either by using observed auxiliary variables (e.g., class labels and/or domain indices) as side information or by assuming simplified latent causal dynamics. Both restrict the methods to a limited range of scenarios. In this study, we further explore the Markov assumption for time-delayed, causally related processes in the nonstationary setting and show that, under mild conditions, the independent latent components can be recovered from their nonlinear mixture up to a permutation and a component-wise transformation, without observing auxiliary variables. We then introduce NCTRL, a principled estimation framework that reconstructs time-delayed latent causal variables and identifies their relations from measured sequential data only. Empirical evaluations demonstrate reliable identification of time-delayed latent causal influences, with our method substantially outperforming existing baselines that fail to exploit the nonstationarity adequately and consequently cannot distinguish distribution shifts.
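To make the setting concrete, the following is a minimal toy sketch, not the authors' implementation. It simulates an unobserved domain index that follows a Markov chain, latent variables with lag-1 dynamics that depend on that index, and a fixed nonlinear mixing into observations. It then scores a candidate recovery with a mean-correlation-style measure that is invariant to permutation and to monotonic component-wise transformations. The names `simulate` and `mcc`, the transition probabilities, the tanh dynamics, and the leaky-ReLU mixing are all illustrative assumptions that do not come from the paper.

```python
import itertools
import numpy as np


def simulate(T=500, n_latent=3, n_domains=2, seed=0):
    """Toy Markov-switching, time-delayed latent process (assumes n_domains >= 2)."""
    rng = np.random.default_rng(seed)
    # Sticky transition matrix for the unobserved domain index c_t (assumed values).
    A = np.full((n_domains, n_domains), 0.05 / (n_domains - 1))
    np.fill_diagonal(A, 0.95)
    # One lag-1 weight matrix per domain (hypothetical dynamics).
    W = rng.normal(scale=0.5, size=(n_domains, n_latent, n_latent))
    # Mixing: leaky ReLU applied to a random linear map of the latents.
    M = rng.normal(size=(n_latent, n_latent))

    c = np.zeros(T, dtype=int)
    z = np.zeros((T, n_latent))
    z[0] = rng.normal(size=n_latent)
    for t in range(1, T):
        c[t] = rng.choice(n_domains, p=A[c[t - 1]])  # domain switch
        z[t] = np.tanh(W[c[t]] @ z[t - 1]) + 0.1 * rng.normal(size=n_latent)
    h = z @ M.T
    x = np.where(h > 0, h, 0.2 * h)  # observed nonlinear mixture
    return x, z, c


def _ranks(a):
    """Column-wise ranks, so Pearson correlation on ranks equals Spearman."""
    return np.argsort(np.argsort(a, axis=0), axis=0).astype(float)


def mcc(z_true, z_est):
    """Mean absolute rank correlation under the best one-to-one matching.

    Brute-forces all permutations, so it is only suitable for a few latents.
    """
    d = z_true.shape[1]
    corr = np.abs(np.corrcoef(_ranks(z_true).T, _ranks(z_est).T)[:d, d:])
    return max(
        np.mean([corr[i, p[i]] for i in range(d)])
        for p in itertools.permutations(range(d))
    )


x, z, c = simulate()
# A permuted, sign-flipped, cubed copy of the latents is "the same" solution
# up to permutation and component-wise transformation.
z_equiv = -(z[:, [2, 0, 1]] ** 3)
print("score for an equivalent solution:", mcc(z, z_equiv))
print("score for the raw observations:  ", mcc(z, x))
```

The first score should be very close to 1, which illustrates why recovery "up to a permutation and a component-wise transformation" counts as success. The brute-force matching is chosen only to keep the sketch dependency-free; with many latent dimensions an assignment solver would be the more practical choice.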
