SSL Framework for Causal Inconsistency between Structures and Representations

Hang Chen, Xinyu Yang, Keqing Du, Wenya Wang

TL;DR

This work identifies causal inconsistency that arises when learning causal structure and representation from Indefinite Data, where data exhibit multiple causal structures and complex variables. It proposes an intervention-based self-supervised learning framework that treats different interventions as views to jointly align the learned structure $\hat{\mathcal{G}}$ and representation $\hat{X}$ by minimizing discrepancies across views with learnable augmentations. The framework demonstrably improves causal consistency and downstream task performance on CDID benchmarks and is extended to emotion recognition, emotion-cause extraction, and temporal action segmentation, with qualitative evidence showing more coherent adjacency graphs. It also explores LLM-based causal inference via iterative instructions, achieving substantial gains in causal relation identification, suggesting practical impact for robust causal learning on complex, real-world data.

Abstract

The cross-pollination between causal discovery and deep learning has led to increasingly extensive interactions. As a result, many deep learning data types (such as images and text) have been brought into the field of causal discovery, and many deep learning tasks now use causal discovery to explore the internal causal structure and causal representation of data. In this paper, we first identify that a complex data type, "Indefinite Data", exhibits conflicts between the causal relationships expressed by the causal structure and those expressed by the causal representation generated by deep learning models, a phenomenon we refer to as causal inconsistency. We analyze related work to explain why only Indefinite Data exhibits causal inconsistency while other data types do not. To alleviate causal inconsistency, we propose an intervention-based self-supervised learning (SSL) framework, which aims to supply more causal information from different intervention views and thereby promote consistency between structure and representation. Extensive experiments show that the SSL framework enhances causal consistency and further improves causal structure and representation learning performance. Additionally, we extend the SSL framework to three downstream tasks and to LLM instructions. The quantitative results of these applications all reflect the performance improvement brought about by causal consistency.

Paper Structure

This paper contains 51 sections, 2 theorems, 15 equations, 13 figures, 12 tables, 1 algorithm.

Key Result

Proposition 1

General intervention is represented by the $do_{g}$ operator, whose objective is to set the parent set of the observed variable $x$ to $\emptyset$, that is, $Pa(x) = \emptyset$ after the intervention, where $Pa(x)$ denotes the parent set of $x$ in the causal graph.
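To make the effect of the proposition concrete, the following is a minimal sketch of an intervention that empties a node's parent set, written against a plain adjacency matrix. It is an illustration only, not the paper's implementation: the convention that `adj[i][j] == 1` encodes an edge $i \to j$ and the function names `parents` and `do_g` are assumptions introduced here.

```python
from typing import List, Set

# Assumed convention (not from the paper): adj[i][j] == 1 means an edge i -> j.
Graph = List[List[int]]


def parents(adj: Graph, node: int) -> Set[int]:
    """Return Pa(node): every i that has an edge i -> node."""
    return {i for i, row in enumerate(adj) if row[node] != 0}


def do_g(adj: Graph, node: int) -> Graph:
    """Return a copy of adj in which all incoming edges of node are removed,
    so that Pa(node) becomes the empty set. The input graph is left unchanged."""
    out = [row[:] for row in adj]
    for row in out:
        row[node] = 0
    return out


# Toy graph: 0 -> 1, 0 -> 2, 1 -> 2
adj = [
    [0, 1, 1],
    [0, 0, 1],
    [0, 0, 0],
]
intervened = do_g(adj, 2)
```

In this toy graph, node 2 starts with parents {0, 1}; after `do_g(adj, 2)` its parent set is empty, while the edge 0 -> 1 is untouched. Returning a copy rather than mutating in place makes it easy to keep several intervened graphs side by side, which is the setting the TL;DR describes when it treats different interventions as views.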

Figures (13)

  • Figure 1: Input-output Framework of interest.
  • Figure 2: Comparison between single-structure data and multi-structure data.
  • Figure 3: Generative framework to solve multi-structure and simple-variable data.
  • Figure 4: Generative framework to solve single-structure and complex-variable data.
  • Figure 5: Sample from Causalogue dataset, applied in our input-output framework.
  • ...and 8 more figures

Theorems & Definitions (5)

  • Definition 1: Indefinite Data
  • Proposition 1: General Intervention
  • Definition 2: $\tau$-transformation
  • Definition 3: Causal Model
  • Theorem 1: Causal Consistency Condition (CCC)