Generative Framework for Personalized Persuasion: Inferring Causal, Counterfactual, and Latent Knowledge

Donghuo Zeng, Roberto Legaspi, Yuewen Sun, Xinshuai Dong, Kazushi Ikeda, Peter Spirtes, Kun Zhang

TL;DR

The paper proposes a generative framework that combines causal discovery, counterfactual inference, and latent knowledge estimation to personalize persuasive dialogue. By representing dialogues as state-action sequences, recovering latent OCEAN traits with TP3M, and uncovering strategy-level causality via GRaSP, it generates principled counterfactual actions through BiCoGAN and CI-KQR, then optimizes dialogue policies with D3QN. Key findings show that incorporating causal structure and latent factors substantially improves persuasive outcomes (donations) and Q-values, with strong latent-trait predictions (R^2 ≈ 0.83) and robust causal graphs. The work advances socially beneficial persuasive AI, demonstrating improved data efficiency, interpretability via counterfactuals, and adaptable personalization, while addressing ethics and reproducibility considerations for real-world deployment.

Abstract

We hypothesize that optimal system responses emerge from adaptive strategies grounded in causal and counterfactual knowledge. Counterfactual inference allows us to create hypothetical scenarios to examine the effects of alternative system responses. We enhance this process through causal discovery, which identifies the strategies informed by the underlying causal structure that govern system behaviors. Moreover, we consider the psychological constructs and unobservable noises that might be influencing user-system interactions as latent factors. We show that these factors can be effectively estimated. We employ causal discovery to identify strategy-level causal relationships among user and system utterances, guiding the generation of personalized counterfactual dialogues. We model the user utterance strategies as causal factors, enabling system strategies to be treated as counterfactual actions. Furthermore, we optimize policies for selecting system responses based on counterfactual data. Our results using a real-world dataset on social good demonstrate significant improvements in persuasive system outcomes, with increased cumulative rewards validating the efficacy of causal discovery in guiding personalized counterfactual inference and optimizing dialogue policies for a persuasive dialogue system.

Paper Structure

This paper contains 20 sections, 8 equations, 8 figures, 1 table.

Figures (8)

  • Figure 1: Schema of our hypothesis. An observed state transition $\{(s_{t}, a_{t}, s_{t+1})\}^{T-1}_{t=0}$, where $s_t$ is the user state at time $t$, $a_t$ is the action taken by the system, and $s_{t+1}$ is the next user state, follows the rules of a structural causal model (Pearl, 2009) while accounting for hidden psychological constructs ($\mathcal{L}$) and unobserved noises ($\varepsilon$). Identified causal relations among strategies make it easier to draw well-founded counterfactual states and actions to achieve the desired persuasion outcomes.
  • Figure 2: Generative framework for personalized causality-driven counterfactual persuasions. We represent the persuadEE and persuadER utterances as BERT-embedded state-action sequences. TP3M recovers the latent personality dimensions using the state-action sequences and the associated EE OCEAN values in every dialogue. The strategies associated with the utterances are obtained by fine-tuning two GPT-2 models, and the causal relationships between EE and ER strategies are discovered by GRaSP, which then guide the construction of counterfactual actions. The counterfactual data are created using BiCoGAN, or via kernel quantile regression (not shown here but in Fig. 4), by utilizing the estimated latent personality dimensions and the counterfactual actions obtained by the retrieval-based (RB) model. Finally, D3QN learns the optimal system response policies, with the donation amounts as reward values, to improve persuasion outcomes.
  • Figure 3: Novel extension of BiCoGAN. (a) Training with TP3M, generator $G$, encoder $E$, and discriminator $D$. (b) The discovered causal graph links the GPT-2 and retrieval-based models to generate, via the trained generator $G_{T}$, the counterfactual action $a^{'}_{t}$ and next state $s^{'}_{t+1}$.
  • Figure 4: KQR employs two networks: ParentNet optimizes $\tau$, yielding $\hat{\tau}$, which controls the prediction of counterfactual next state $S'_{t+1}$ by ChildNet based on ($S_{t}$, $A'_{t}$, $\mathcal{\hat{L}}_{+1}$) as input.
  • Figure 5: Top four CCA correlations, along with their distributions, verify the strong correlation between the ground truth and the TP3M-predicted OCEAN values.
  • ...and 3 more figures
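
The last step in Figure 2, D3QN policy learning with donation amounts as rewards, can be illustrated with a minimal sketch. It assumes D3QN refers to the standard dueling double deep Q-network, and it shows only the two textbook ingredients of that method: the dueling aggregation of a state value and per-action advantages, and the double-DQN target. The function names, the discount factor `gamma`, and the toy numbers are illustrative assumptions, not values or code from the paper.

```python
from typing import List, Sequence


def dueling_q_values(state_value: float, advantages: Sequence[float]) -> List[float]:
    """Combine V(s) and A(s, a) into Q(s, a) with the mean-advantage baseline.

    Q(s, a) = V(s) + A(s, a) - mean_a' A(s, a')
    """
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + adv - mean_advantage for adv in advantages]


def double_dqn_target(
    reward: float,
    next_q_online: Sequence[float],
    next_q_target: Sequence[float],
    gamma: float = 0.99,  # assumed discount factor, not taken from the paper
    done: bool = False,
) -> float:
    """Double-DQN target: the online net picks the action, the target net scores it."""
    if done:
        # Terminal transition: no bootstrapping from the next state.
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


if __name__ == "__main__":
    # Toy example: three candidate system strategies, reward = donation amount.
    q_next_online = dueling_q_values(1.0, [0.5, -0.5, 0.0])
    q_next_target = dueling_q_values(0.8, [0.2, 0.1, -0.3])
    print(double_dqn_target(2.0, q_next_online, q_next_target, gamma=0.9))
```

Decoupling action selection (online network) from action evaluation (target network) is what distinguishes the double-DQN target from the plain max-based target; in a setting like the one described here, the transitions fed to such an update would include the generated counterfactual ones.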