CaRiNG: Learning Temporal Causal Representation under Non-Invertible Generation Process
Guangyi Chen, Yifan Shen, Zhenhao Chen, Xiangchen Song, Yuewen Sun, Weiran Yao, Xiao Liu, Kun Zhang
TL;DR
This work tackles learning temporal causal representations from time series data when the data-generation process is non-invertible. It introduces CaRiNG, a principled approach that uses temporal context and a normalizing-flow-based transition prior to recover lost latent information and achieve identifiability up to permutation and component-wise invertible transforms. The authors establish an identifiability theorem for non-invertible generation and demonstrate CaRiNG’s effectiveness on synthetic datasets and a real-world traffic video QA task, showing improved latent identifiability, disentanglement, and reasoning capabilities. The results suggest that CaRiNG enables more transparent and causally meaningful representations in complex, real-world sequences where traditional nonlinear ICA assumptions fail.
Abstract
Identifying the underlying time-delayed latent causal processes in sequential data is vital for understanding temporal dynamics and supporting downstream reasoning. While some recent methods can robustly identify these latent causal variables, they rely on the strict assumption that the generation process from latent variables to observed data is invertible. This assumption is often hard to satisfy in real-world applications that involve information loss. For instance, visual perception projects a 3D space onto 2D images, and persistence of vision blends historical data into current perceptions. To address this challenge, we establish an identifiability theory that allows for the recovery of independent latent components even when they are mixed by a nonlinear and non-invertible function. Using this theory as a foundation, we propose a principled approach, CaRiNG, to learn the CAusal RepresentatIon of Non-invertible Generative temporal data with identifiability guarantees. Specifically, we use temporal context to recover lost latent information and apply the conditions in our theory to guide the training process. Through experiments on synthetic datasets, we validate that CaRiNG reliably identifies the causal process even when the generation process is non-invertible. Moreover, we demonstrate that our approach considerably improves temporal understanding and reasoning in practical applications.
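To make the idea of recovering lost latent information from temporal context concrete, the following is a minimal toy sketch, not the CaRiNG method itself. It assumes a linear, noise-free, two-dimensional latent process with a known rotation as its transition and an observation that keeps only the first latent coordinate; the names `simulate`, `A`, `C`, and `O` are illustrative and do not come from the paper. CaRiNG addresses the much harder nonlinear, stochastic case with a learned model.

```python
import numpy as np

def simulate(A, C, z0, steps):
    """Roll out z_t = A z_{t-1} and observe x_t = C z_t (noise-free toy)."""
    zs, xs = [], []
    z = z0
    for _ in range(steps):
        z = A @ z
        zs.append(z)
        xs.append(C @ z)
    return np.array(zs), np.array(xs)

# Assumed toy dynamics: a rotation of the 2D latent state by a fixed angle.
theta = 0.3
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# Assumed toy observation: keep only the first coordinate (2D -> 1D),
# so a single observation cannot determine the latent state.
C = np.array([[1.0, 0.0]])

zs, xs = simulate(A, C, np.array([1.0, 0.5]), steps=5)

# A single frame sees a rank-1 view of a 2D latent: information is lost.
rank_single = np.linalg.matrix_rank(C)

# Two consecutive frames: x_{t-1} = C A^{-1} z_t and x_t = C z_t.
# Stacking both rows gives a full-rank linear map from z_t to the context.
O = np.vstack([C @ np.linalg.inv(A), C])
rank_context = np.linalg.matrix_rank(O)

# Recover z_t from the temporal context (x_{t-1}, x_t).
t = 3
z_hat = np.linalg.solve(O, np.concatenate([xs[t - 1], xs[t]]))

print("rank from one frame:", rank_single)
print("rank from two frames:", rank_context)
print("recovery matches true latent:", np.allclose(z_hat, zs[t]))
```

In this toy setting the latent state is recovered exactly only because the transition is known, linear, and deterministic. The sketch is meant solely to illustrate why conditioning on past observations can compensate for a non-invertible mixing at a single time step.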
