Estimating Causal Effects of Text Interventions Leveraging LLMs

Siyi Guo; Myrl G. Marmarelis; Fred Morstatter; Kristina Lerman

TL;DR

This paper tackles the challenge of estimating causal effects when the treatment is a text intervention, a problem made harder by the high dimensionality of language and by latent attributes such as anger. It introduces CausalDANN, which uses LLM-driven text transformations to define interventions and a domain-adversarial neural network to predict outcomes for both intervened and non-intervened text, mitigating distribution shift. The authors validate the approach on three semi-synthetic datasets (Amazon Reviews, Reddit AITA, Anger in AITA), showing that CausalDANN often yields more accurate estimates of the average treatment effect (ATE) and the conditional average treatment effect (CATE) than vanilla BERT or inverse propensity weighting (IPW), and that TextCause provides a useful upper-bound reference. The work advances causal inference for text by enabling direct text interventions and robust outcome prediction under domain shift, with important caveats about biases in LLM-generated data and the need for careful validation in real-world settings.

Abstract

Quantifying the effects of textual interventions in social systems, such as reducing anger in social media posts to see its impact on engagement, is challenging. Real-world interventions are often infeasible, necessitating reliance on observational data. Traditional causal inference methods, typically designed for binary or discrete treatments, are inadequate for handling complex, high-dimensional textual data. This paper addresses these challenges by proposing CausalDANN, a novel approach to estimate causal effects using text transformations facilitated by large language models (LLMs). Unlike existing methods, our approach accommodates arbitrary textual interventions and leverages text-level classifiers with domain adaptation ability to produce effect estimates that are robust to domain shifts, even when only the control group is observed. This flexibility in handling various text interventions is a key advancement in causal estimation for textual data, offering opportunities to better understand human behaviors and develop effective interventions within social systems.
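
To make the estimands concrete, here is a minimal sketch of how an ATE and a CATE could be computed once an outcome predictor has scored each original (control) text and its LLM-transformed counterpart. It is an illustration under stated assumptions, not the authors' implementation: the function names `estimate_ate` and `estimate_cate`, the grouping variable `topics`, and all numbers are hypothetical, and the predicted outcomes are assumed to come from some already-trained model (in the paper's setting, a classifier with domain adaptation).

```python
from collections import defaultdict
from statistics import mean


def estimate_ate(pred_control, pred_treated):
    """Average difference between predicted outcomes of paired texts.

    pred_control[i] is the predicted outcome for the i-th original text,
    pred_treated[i] the predicted outcome for its transformed version.
    """
    if len(pred_control) != len(pred_treated):
        raise ValueError("predictions must be paired one-to-one")
    return mean(t - c for c, t in zip(pred_control, pred_treated))


def estimate_cate(pred_control, pred_treated, covariates):
    """Same difference, averaged separately within each covariate value."""
    diffs_by_group = defaultdict(list)
    for c, t, g in zip(pred_control, pred_treated, covariates):
        diffs_by_group[g].append(t - c)
    return {g: mean(diffs) for g, diffs in diffs_by_group.items()}


# Hypothetical predicted outcome probabilities for four posts.
pred_control = [0.25, 0.5, 0.75, 0.5]   # original text
pred_treated = [0.5, 0.5, 1.0, 0.25]    # after the text transformation
topics = ["family", "work", "family", "work"]  # hypothetical covariate

ate = estimate_ate(pred_control, pred_treated)
cate = estimate_cate(pred_control, pred_treated, topics)
print("ATE:", ate)
print("CATE by topic:", cate)
```

The sketch only covers the final averaging step. The quality of the estimate depends on how well the outcome predictor transfers from observed text to transformed text, which is the domain-shift problem the abstract describes.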
