Estimating Causal Effects of Text Interventions Leveraging LLMs

Siyi Guo, Myrl G. Marmarelis, Fred Morstatter, Kristina Lerman

TL;DR

This paper tackles the challenge of estimating causal effects when the treatment is a text intervention, a problem amplified by high-dimensional language and latent attributes like anger. It introduces CausalDANN, which uses LLM-driven text transformations to define interventions and employs a domain-adversarial neural network to predict outcomes across intervened and non-intervened text, mitigating distribution shift. The authors validate the approach on three semi-synthetic datasets (Amazon Reviews, Reddit AITA, Anger in AITA), showing that CausalDANN often yields more accurate ATE and CATE estimates than vanilla BERT or IPW, and that TextCause provides a useful upper-bound reference. The work advances causal inference for text by enabling direct text interventions and robust outcome prediction under domain shift, with important caveats about biases in LLM-generated data and the need for careful validation in real-world settings.

Abstract

Quantifying the effects of textual interventions in social systems, such as reducing anger in social media posts to see its impact on engagement, is challenging. Real-world interventions are often infeasible, necessitating reliance on observational data. Traditional causal inference methods, typically designed for binary or discrete treatments, are inadequate for handling complex, high-dimensional textual data. This paper addresses these challenges by proposing CausalDANN, a novel approach to estimate causal effects using text transformations facilitated by large language models (LLMs). Unlike existing methods, our approach accommodates arbitrary textual interventions and leverages text-level classifiers with domain adaptation ability to produce effect estimates that are robust to domain shift, even when only the control group is observed. This flexibility in handling various text interventions is a key advancement in causal estimation for textual data, offering opportunities to better understand human behaviors and develop effective interventions within social systems.
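
As a rough guide to the quantities named in the TL;DR, the ATE and CATE for a text intervention can be written in standard potential-outcome form. The notation here is an assumption for illustration and is not taken from the paper: X is the observed text, g(X) is the same text after the LLM transformation, Y(·) is the outcome under a given version of the text, and C is a covariate used to define subgroups.

```latex
% Assumed notation (not the paper's): X = observed text, g(X) = LLM-transformed text,
% Y(\cdot) = potential outcome for a given text, C = subgroup covariate.
\begin{align}
  \tau_{\mathrm{ATE}} &= \mathbb{E}\big[\, Y(g(X)) \,\big] - \mathbb{E}\big[\, Y(X) \,\big] \\
  \tau_{\mathrm{CATE}}(c) &= \mathbb{E}\big[\, Y(g(X)) - Y(X) \,\big|\, C = c \,\big]
\end{align}
```

Because only the non-intervened text has observed outcomes, the first expectation has to be filled in by an outcome predictor, which is where robustness to domain shift matters.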

Paper Structure

This paper contains 37 sections, 8 equations, 3 figures, 8 tables.

Figures (3)

  • Figure 1: The causal diagram of the problem setup. We aim to estimate the effect from the treatment T to the outcome Y, accounting for confounding and/or non-confounding covariates.
  • Figure 2: We first apply an LLM transformation or sampling to the observed text and outcome (non-intervened group) to generate text data for the intervened group. The outcomes for the transformed data remain unobserved. To predict the outcomes, we use (a) the BERT-based baseline predictor or (b) the proposed CausalDANN with domain adaptation. We then predict outcomes for both groups and compute the causal effects.
  • Figure 3: GPT-generated AITA verdicts in different (a) age and (b) gender groups. We use regex to capture these.
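
The last step described in the Figure 2 caption, computing causal effects from predicted outcomes for both groups, can be sketched as below. This is a minimal illustration and not the authors' implementation: the function names `estimate_ate` and `estimate_cate`, the `groups` labels, and all numbers are hypothetical. It assumes an outcome predictor (the BERT baseline or CausalDANN) has already produced one predicted outcome per unit for the non-intervened text and for its LLM-transformed counterpart.

```python
from collections import defaultdict
from statistics import mean


def estimate_ate(pred_control, pred_intervened):
    """Average effect: mean predicted outcome of the intervened group
    minus mean predicted outcome of the non-intervened group."""
    return mean(pred_intervened) - mean(pred_control)


def estimate_cate(pred_control, pred_intervened, groups):
    """Conditional effect per subgroup, where groups[i] is the
    (hypothetical) covariate value of unit i."""
    by_group = defaultdict(lambda: ([], []))
    for y0, y1, g in zip(pred_control, pred_intervened, groups):
        by_group[g][0].append(y0)
        by_group[g][1].append(y1)
    return {g: mean(y1s) - mean(y0s) for g, (y0s, y1s) in by_group.items()}


# Toy predicted outcomes (made-up numbers, not results from the paper).
pred_control = [0.2, 0.4, 0.6, 0.8]      # predictions on the observed text
pred_intervened = [0.3, 0.5, 0.9, 0.9]   # predictions on the transformed text
groups = ["a", "a", "b", "b"]            # hypothetical subgroup labels

ate = estimate_ate(pred_control, pred_intervened)
cate = estimate_cate(pred_control, pred_intervened, groups)
print(f"ATE estimate: {ate:.3f}")
print({g: round(v, 3) for g, v in cate.items()})
```

The sketch takes the difference of means over predictions for both groups, following the caption's statement that outcomes are predicted for both groups before effects are computed. Whether observed outcomes are substituted for the non-intervened group is a design choice this example does not address.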