Causal Inference for Human-Language Model Collaboration
Bohan Zhang, Yixin Wang, Paramveer S. Dhillon
TL;DR
This work addresses causal inference in human-LM collaboration, where text-based edits are high-dimensional treatments to which the conventional average treatment effect (ATE) estimand does not apply. It introduces the Incremental Stylistic Effect (ISE), a local, style-focused estimand with non-parametric identification, and develops CausalCollab, which estimates the ISE in dynamic interactions by combining conditional variational autoencoders (CVAEs) with G-estimation. Across three datasets, the authors show that accounting for stylistic changes and latent treatment representations reduces confounding and improves counterfactual prediction relative to baselines. By targeting style-level interventions rather than individual word edits, the approach offers a practical path toward understanding and improving human-LM collaboration, with implications for the design of interaction protocols and editing strategies.
Abstract
In this paper, we examine the collaborative dynamics between humans and language models (LMs), where interactions typically involve LMs proposing text segments and humans editing or responding to these proposals. Productive engagement with LMs in such scenarios requires that humans discern effective text-based interaction strategies, such as editing and response styles, from historical human-LM interactions. This objective is inherently causal, driven by the counterfactual "what-if" question: how would the outcome of collaboration change if humans employed a different text editing or refinement strategy? A key challenge in answering this causal inference question is formulating an appropriate causal estimand: the conventional average treatment effect (ATE) estimand is inapplicable to text-based treatments because of their high dimensionality. To address this challenge, we introduce a new causal estimand, the Incremental Stylistic Effect (ISE), which characterizes the average impact of infinitesimally shifting a text towards a specific style, such as increasing formality. We establish the conditions for the non-parametric identification of ISE. Building on this, we develop CausalCollab, an algorithm designed to estimate the ISE of various interaction strategies in dynamic human-LM collaborations. Our empirical investigations across three distinct human-LM collaboration scenarios reveal that CausalCollab effectively reduces confounding and significantly improves counterfactual estimation over a set of competitive baselines.
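To make the idea of an "infinitesimal shift towards a style" concrete, the following is a minimal toy sketch, not the authors' definition of ISE and not CausalCollab. It assumes texts are already represented as low-dimensional vectors, that a style corresponds to a fixed direction in that space, and that the outcome function is known. All names (`outcome`, `incremental_effect`, `style_direction`) and the outcome model itself are hypothetical. Because the outcome function is given rather than learned from observational data, the sketch sidesteps confounding entirely, which is the hard part the paper addresses.

```python
import random


def outcome(t):
    # Hypothetical, known outcome model (an assumption for illustration):
    # a linear gain minus a quadratic penalty on the text representation.
    w = (1.0, 2.0)
    linear = sum(wi * ti for wi, ti in zip(w, t))
    penalty = 0.5 * sum(ti * ti for ti in t)
    return linear - penalty


def incremental_effect(texts, direction, eps=1e-5):
    # Average finite-difference effect of nudging each text representation
    # a small step eps along the (normalized) style direction.
    norm = sum(d * d for d in direction) ** 0.5
    unit = [d / norm for d in direction]
    total = 0.0
    for t in texts:
        shifted = [ti + eps * ui for ti, ui in zip(t, unit)]
        total += (outcome(shifted) - outcome(t)) / eps
    return total / len(texts)


# Synthetic 2-D "text representations" (assumed, not real embeddings).
rng = random.Random(0)
texts = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(1000)]

# Hypothetical style direction: the first coordinate stands in for "formality".
style_direction = (1.0, 0.0)
estimate = incremental_effect(texts, style_direction)

# For this toy outcome the directional derivative at t is (w - t) . u,
# so the average effect along the first axis is 1 - mean(t[0]).
mean_first = sum(t[0] for t in texts) / len(texts)
analytic = 1.0 - mean_first
print(estimate, analytic)
```

The contrast with an ATE-style comparison is that nothing here requires enumerating or comparing whole texts as distinct treatment levels; the quantity is defined locally, as an average rate of change along one style direction.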
