Learning a Structural Causal Model for Intuition Reasoning in Conversation

Hang Chen, Bingyu Liao, Jing Luo, Wenjing Zhu, Xinyu Yang

TL;DR

This work addresses intuitive conversation reasoning by introducing a Conversation Cognitive Model (CCM) that integrates perception, mental state, and plans to explain how utterances are generated. It algebraically transforms the CCM into a Structural Causal Model (SCM), simplifying it through latent projection and mediator omission so that only observable utterances and latent mental states remain, which yields a causal representation that can be learned with a variational framework. A graph-attention encoder infers the implicit causes, and a decoder with an autoregressive SCM propagates those causes to reconstruct utterances; the whole model is optimized through an evidence lower bound (ELBO). The authors provide synthetic and simulated datasets with complete causal structures, enabling evaluation beyond datasets that are agnostic to implicit causes, and report state-of-the-art performance on explicit cause extraction (ECE) and implicit cause extraction (ICE) tasks across real, synthetic, and simulated data. They also discuss latent confounding and intervention-based analysis. The approach advances interpretable, causally grounded conversation reasoning, with potential applications in affective reasoning, dialogue generation, and robust causal discovery in language tasks.

Abstract

Reasoning, a crucial aspect of NLP research, has not been adequately addressed by prevailing models, including large language models. Conversation reasoning, a critical component of reasoning, remains largely unexplored due to the absence of a well-designed cognitive model. In this paper, inspired by intuition theory on conversation cognition, we develop a conversation cognitive model (CCM) that explains how each utterance receives and activates channels of information recursively. We also algebraically transform the CCM into a structural causal model (SCM) under some mild assumptions, rendering it compatible with various causal discovery methods. We further propose a probabilistic implementation of the SCM for utterance-level relation reasoning. By leveraging variational inference, it explores substitutes for implicit causes, addresses the issue of their unobservability, and reconstructs the causal representations of utterances through the evidence lower bound. Moreover, we construct synthetic and simulated datasets incorporating implicit causes and complete cause labels, alleviating the current situation in which all available datasets are agnostic to implicit causes. Extensive experiments demonstrate that our proposed method significantly outperforms existing methods on synthetic, simulated, and real-world datasets. Finally, we analyze the performance of CCM under latent confounders and propose theoretical ideas for addressing this currently unresolved issue.

Paper Structure (32 sections, 16 equations, 6 figures, 5 tables)

Figures (6)

  • Figure 1: Conversation Cognitive Model (CCM) of an intuitive theory of dialogue. "Perception" represents the speaker's understanding of previous utterances. "Mental State" represents desires, memory, experience or emotion of the speaker. "Plan" stands for the reaction to the latest perception and mental state and is expressed by two external outcomes: "Utterance" and "Action."
  • Figure 2: SCM transformed from the CCM. The top half of the image shows the cognitive model with causally irrelevant variables omitted, consisting only of explicit utterances and implicit mental states. The bottom half parameterizes this model with a five-utterance conversation example, where $U$ represents utterances, $E$ represents mental states, and subscripts denote the natural sequential index of utterances in the conversation.
  • Figure 3: A probabilistic framework of our method. $q_{\varphi}(z|\mathcal{X})$ predicts the implicit causes $E$ from the input $x$ (e.g., the exogenous variable matrix in the SCM). The decoder $p_{\theta}(x|(I-A)^{-1}E)$ learns to reconstruct $\widehat{X}$ given $E$ and the inverse of the predicted $z$.
  • Figure 4: Four causal structures in the simulation dataset.
  • Figure 5: Visualization of $E$ (subfigures a-d) and implicit causes (subfigure e), shown in color, for the synthetic datasets. The gray cluster denotes padding utterances in each dialogue, the blue cluster corresponds to non-emotion utterances, and the red cluster corresponds to emotion utterances.
  • ...and 1 more figures
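
The decoder term $(I-A)^{-1}E$ in the Figure 3 caption can be illustrated with a small numerical sketch. The code below is not the authors' implementation: it assumes a linear SCM of the form $X = AX + E$, a strictly lower-triangular $A$ (each utterance depends only on earlier utterances, as in the five-utterance example of Figure 2), and random placeholder values. The function names `propagate` and `recover_implicit_causes` and the embedding size are hypothetical.

```python
import numpy as np


def propagate(A, E):
    """Solve X = A @ X + E for X, which is X = (I - A)^{-1} E."""
    n = A.shape[0]
    return np.linalg.solve(np.eye(n) - A, E)


def recover_implicit_causes(A, X):
    """Invert the propagation: E = (I - A) X."""
    return (np.eye(A.shape[0]) - A) @ X


rng = np.random.default_rng(0)
n_utterances, dim = 5, 4  # five utterances; the embedding size is an assumption

# Assumed structure: strictly lower-triangular A, so utterance i is
# influenced only by utterances j < i (no cycles, I - A is invertible).
A = np.tril(rng.uniform(0.1, 0.9, size=(n_utterances, n_utterances)), k=-1)

# Placeholder implicit causes (one row per utterance). In the paper these
# are inferred by an encoder, which this sketch does not model.
E = rng.normal(size=(n_utterances, dim))

X = propagate(A, E)                    # utterance representations
E_hat = recover_implicit_causes(A, X)  # should match E up to rounding
```

Because the first row of $A$ is all zeros under this assumption, the first utterance is determined entirely by its own implicit cause, while later utterances mix in contributions from earlier ones.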

Theorems & Definitions (2)

  • Definition 1: Structural Causal Model
  • Definition 2: SCM in the conversation
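
For orientation, a linear SCM with additive exogenous terms is commonly written as below. Whether Definitions 1 and 2 use exactly this form is an assumption, suggested by the decoder term $(I-A)^{-1}E$ in the Figure 3 caption. The objective is the generic variational ELBO, with a prior $p(E)$ that is likewise assumed rather than taken from the paper.

```latex
% Assumed linear SCM: X stacks utterance representations, A holds the
% utterance-to-utterance causal strengths, E holds the implicit causes.
X = A X + E
\quad\Longrightarrow\quad
X = (I - A)^{-1} E

% Generic ELBO for an encoder q_\varphi and a decoder p_\theta
\mathcal{L}_{\mathrm{ELBO}}
  = \mathbb{E}_{q_{\varphi}(E \mid X)}
      \left[ \log p_{\theta}\!\left( X \mid (I - A)^{-1} E \right) \right]
  - \mathrm{KL}\!\left( q_{\varphi}(E \mid X) \,\|\, p(E) \right)
```

The inverse $(I-A)^{-1}$ exists whenever $A$ is strictly triangular under some ordering of the utterances, which holds when each utterance is influenced only by earlier ones.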