CaRiNG: Learning Temporal Causal Representation under Non-Invertible Generation Process

Guangyi Chen; Yifan Shen; Zhenhao Chen; Xiangchen Song; Yuewen Sun; Weiran Yao; Xiao Liu; Kun Zhang

TL;DR

This work tackles learning temporal causal representations from time series data when the data-generation process is non-invertible. It introduces CaRiNG, a principled approach that uses temporal context and a normalizing-flow-based transition prior to recover lost latent information and achieve identifiability up to permutation and component-wise invertible transforms. The authors establish an identifiability theorem for non-invertible generation and demonstrate CaRiNG’s effectiveness on synthetic datasets and a real-world traffic video QA task, showing improved latent identifiability, disentanglement, and reasoning capabilities. The results suggest that CaRiNG enables more transparent and causally meaningful representations in complex, real-world sequences where traditional nonlinear ICA assumptions fail.

Abstract

Identifying the underlying time-delayed latent causal processes in sequential data is vital for grasping temporal dynamics and supporting downstream reasoning. While some recent methods can robustly identify these latent causal variables, they rely on the strict assumption that the generation process from latent variables to observed data is invertible. However, this assumption is often hard to satisfy in real-world applications that involve information loss. For instance, visual perception projects a 3D space onto 2D images, and the persistence of vision blends historical data into current perceptions. To address this challenge, we establish an identifiability theory that allows for the recovery of independent latent components even when they come from a nonlinear and non-invertible mixing. Using this theory as a foundation, we propose a principled approach, CaRiNG, to learn the CAusal RepresentatIon of Non-invertible Generative temporal data with identifiability guarantees. Specifically, we utilize temporal context to recover lost latent information and apply the conditions in our theory to guide the training process. Through experiments conducted on synthetic datasets, we validate that our CaRiNG method reliably identifies the causal process, even when the generation process is non-invertible. Moreover, we demonstrate that our approach considerably improves temporal understanding and reasoning in practical applications.
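The idea that temporal context can recover information lost by a non-invertible generation step can be illustrated with a toy example. The sketch below is not the paper's model: the two-dimensional latent (position, velocity), the noise-free linear transition, and the names `step`, `observe`, and `recover_velocity` are all assumptions made purely for illustration.

```python
def step(z):
    """Toy noise-free transition: position advances by velocity; velocity stays constant."""
    position, velocity = z
    return (position + velocity, velocity)


def observe(z):
    """Non-invertible mixing: emit only the position and drop the velocity."""
    position, _velocity = z
    return position


def recover_velocity(x_prev, x_curr):
    """Use one step of temporal context to recover the dropped latent."""
    return x_curr - x_prev


# Two latent states that share a position but differ in velocity.
z_a = (0.0, 1.0)
z_b = (0.0, -2.0)

# A single observation cannot distinguish them ...
single_frame_ambiguous = observe(z_a) == observe(z_b)

# ... but two consecutive observations can.
v_a = recover_velocity(observe(z_a), observe(step(z_a)))
v_b = recover_velocity(observe(z_b), observe(step(z_b)))
```

In this toy case the dropped latent is recovered in closed form; the abstract describes a learned approach for the nonlinear setting, which this sketch does not attempt to reproduce.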

Paper Structure (44 sections, 8 theorems, 44 equations, 6 figures, 9 tables)

This paper contains 44 sections, 8 theorems, 44 equations, 6 figures, 9 tables.

Introduction
Problem Setup
Non-invertible Temporal Generative Process
Identification of the Latent Causal Processes
Illustrations of the Problem Setup
Identifiability Theory
Identifiability under Non-Invertible Generative Process
Continuity for Permutation Invariance
Approach
Experiments
Simulation Experiments
Real-world Experiments
Conclusion
Identifiability Theory
Proof for Theorem 1
...and 29 more sections

Key Result

Theorem 1

For a series of observations $\mathbf{x}_t\in\mathbb{R}^d$ and estimated latent variables $\mathbf{\hat{z}}_{t}\in\mathbb{R}^n$, suppose there exist functions $\mathbf{\hat{g}},\mathbf{\hat{m}}$ that are subject to observational equivalence. If the assumptions are satisfied, then $\mathbf{z}_{t}$ must be a component-wise transformation of a permuted version of $\mathbf{\hat{z}}_{t}$ with regard to c
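The conclusion of Theorem 1 can be written out component by component. In the block below, the permutation $\pi$ and the scalar maps $T_i$ are notation assumed here for illustration; the invertibility of each $T_i$ follows the TL;DR's wording ("component-wise invertible transforms"), and the theorem's precise conditions are not reproduced.

```latex
% Assumed notation: \pi is a permutation of \{1,\dots,n\},
% and each T_i : \mathbb{R} \to \mathbb{R} is an invertible scalar function.
\mathbf{z}_t = T\big(\pi(\mathbf{\hat{z}}_t)\big)
\quad\Longleftrightarrow\quad
z_{t,i} = T_i\big(\hat{z}_{t,\pi(i)}\big), \qquad i = 1,\dots,n .
```

Read this way, each true latent depends on exactly one estimated latent, so the estimate is disentangled up to reordering and per-coordinate reparameterization.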

Figures (6)

Figure 1: Motivations for the non-invertible generation process. (a) Occlusions cause non-invertibility because the measured observation cannot cover the obstructed objects. (b) Vision persistence, illustrated with the high-speed movement of a crashing car, describes a generation process that jointly involves the current and previous states, which causes non-invertibility. (c) The identifiability of conventional methods, such as TDRL (yao2022temporally) (blue), drops drastically as non-invertibility increases, while the identifiability of our method (orange) still holds. The levels of non-invertibility are defined by removing $0$, $1/3$, and $2/3$ of the dimensions of $\mathbf{z}_t$ when generating $\mathbf{x}_t$. For example, when the dimension of $\mathbf{z}_t$ is 6, $2/3$ non-invertibility means that we remove 4 variables of $\mathbf{z}_t$ and use only 2 variables to generate $\mathbf{x}_t$.
Figure 2: An intuitive illustration of a moving football with a visual persistence effect. Considering the generating process $\mathbf{x}_t=\mathbf{g}(\mathbf{z}_{t:t-r})$, $\mathbf{x}_t$ denotes the observed football with motion blur, and $\mathbf{z}_t$ denotes the position and phase of the ball. Recovering the latent variables from a single observation will be difficult, which introduces non-invertibility.
Figure 3: The overall framework of CaRiNG. It consists of three main modules: the sequence-to-step encoder, the step-to-step decoder, and the transition prior module, denoted $\text{SeqEnc}$, $\text{StepDec}$, and $\hat{\mathbf{f}}^{-1}_{\mathbf{z}}$, respectively, and each shown in a different color. The model is trained with both $\mathcal{L}_{Recon}$ and $\mathcal{L}_{KLD}$.
Figure 4: Qualitative comparisons between baselines (especially TDRL) and CaRiNG in the setting of non-invertible generation. (a) MCC matrix for all 3 latent variables; (b) scatter plots between the estimated and ground-truth latent variables (only the aligned variables are plotted); (c) validation MCC curves of CaRiNG and other baselines.
Figure A1: Qualitative results on the SUTD-TrafficQA dataset. We provide some positive examples and also failure cases to analyze our model.
...and 1 more figures
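The MCC reported in Figure 4 is a mean correlation coefficient between ground-truth and estimated latents after aligning them by the best permutation. The sketch below shows one common way to compute such a score for a small number of latents. The use of absolute Pearson correlation, the brute-force search over permutations, and the names `pearson` and `mcc` are assumptions for illustration; the paper's exact evaluation code may differ (for example, a rank correlation or an assignment solver).

```python
from itertools import permutations
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two equal-length, non-constant sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def mcc(true_latents, est_latents):
    """Return (score, perm), where perm[i] is the estimated index matched to true latent i.

    Each argument is a list of n series, one per latent dimension.
    Brute force over n! permutations, so only suitable for small n.
    """
    n = len(true_latents)
    corr = [[abs(pearson(t, e)) for e in est_latents] for t in true_latents]
    best = max(
        permutations(range(n)),
        key=lambda p: sum(corr[i][p[i]] for i in range(n)),
    )
    return sum(corr[i][best[i]] for i in range(n)) / n, best


# Hypothetical data: the estimates are permuted, rescaled copies of the truth.
z1 = [0.0, 1.0, 2.0, 3.0, 4.0]
z2 = [1.0, -1.0, 1.0, -1.0, 2.0]
z3 = [2.0, 0.0, 1.0, 5.0, 3.0]
true_z = [z1, z2, z3]
est_z = [
    [-2.0 * v + 1.0 for v in z2],
    [3.0 * v for v in z3],
    [0.5 * v for v in z1],
]

score, perm = mcc(true_z, est_z)
```

Taking the absolute value makes the score insensitive to sign flips, and the permutation search accounts for the ordering ambiguity. Pearson correlation only captures affine relationships, so a rank-based correlation may be a better fit when the component-wise transformation is nonlinear.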

Theorems & Definitions (16)

Definition 1: Identifiable Latent Causal Process
Theorem 1: Identifiability under Non-invertible Generative Process
Definition 2: Permutation Invariance
Definition 3: Partial Invertibility
Definition 4: Non-degeneracy Condition of Partially Invertible Functions
Lemma 1: Disentanglement with Continuity
Lemma 2: Disentanglement with Continuity under Side Information
Proposition 1
Theorem A1: Identifiability under Non-invertible Generative Process
proof
...and 6 more
