Do LLMs Think Fast and Slow? A Causal Study on Sentiment Analysis

Zhiheng Lyu, Zhijing Jin, Fernando Gonzalez, Rada Mihalcea, Bernhard Schölkopf, Mrinmaya Sachan

TL;DR

The causal mechanisms discovered behind the samples are used to improve LLM performance on the prediction task: causal prompts give the models an inductive bias toward the underlying causal graph, yielding improvements of up to 32.13 F1 points on zero-shot five-class sentiment analysis (SA).

Abstract

Sentiment analysis (SA) aims to identify the sentiment expressed in a text, such as a product review. Given a review and the sentiment associated with it, this work formulates SA as a combination of two tasks: (1) a causal discovery task that distinguishes whether a review "primes" the sentiment (Causal Hypothesis C1), or the sentiment "primes" the review (Causal Hypothesis C2); and (2) the traditional prediction task to model the sentiment using the review as input. Using the peak-end rule from psychology, we classify a sample as C1 if its overall sentiment score approximates an average of all the sentence-level sentiments in the review, and as C2 if the overall sentiment score approximates an average of the peak and end sentiments. For the prediction task, we use the discovered causal mechanisms behind the samples to improve LLM performance by proposing causal prompts that give the models an inductive bias of the underlying causal graph, leading to substantial improvements of up to 32.13 F1 points on zero-shot five-class SA. Our code is at https://github.com/cogito233/causal-sa
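
The C1/C2 decision rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that); the function name `classify_causal_direction`, the `neutral` midpoint, the definition of "peak" as the sentence score farthest from that midpoint, and the use of absolute error as the notion of "approximates" are all assumptions made here for illustration.

```python
from statistics import mean


def classify_causal_direction(sentence_scores, overall_score, neutral=3.0):
    """Label a review as "C1", "C2", or "tie" (hypothetical helper).

    sentence_scores: per-sentence sentiment scores (the emotion arc),
        assumed to be on the same scale as overall_score.
    overall_score: the review-level sentiment score.
    neutral: assumed midpoint of the scale, used only to locate the peak.
    """
    if not sentence_scores:
        raise ValueError("sentence_scores must not be empty")

    # C1 candidate: the overall score tracks the average of the whole arc.
    arc_average = mean(sentence_scores)

    # C2 candidate: the overall score tracks the average of peak and end.
    # Assumption: the peak is the score farthest from the neutral midpoint.
    peak = max(sentence_scores, key=lambda s: abs(s - neutral))
    end = sentence_scores[-1]
    peak_end_average = (peak + end) / 2

    # Assumption: "approximates" is measured by absolute error.
    error_c1 = abs(overall_score - arc_average)
    error_c2 = abs(overall_score - peak_end_average)

    if error_c1 < error_c2:
        return "C1"
    if error_c2 < error_c1:
        return "C2"
    return "tie"


# Overall score close to the arc average.
label_a = classify_causal_direction([4, 4, 1, 2], overall_score=3)
# Overall score close to the peak-end average.
label_b = classify_causal_direction([3, 3, 5, 5], overall_score=5)
```

Under these assumptions, a sample is assigned to whichever hypothesis leaves the smaller gap to the observed overall score; a real pipeline would also need a sentence-level sentiment model to produce the arc.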

Paper Structure

This paper contains 53 sections, 4 equations, 7 figures, 12 tables.

Figures (7)

  • Figure 1: An overview of the paper structure, where we first investigate the causal discovery task, and then use it to improve LLM performance. For each document-level text review, we parse its emotion arc consisting of the sentiment of each sentence in the review, and then use the peak-end rule (Kahneman et al., 1993; Kahneman, 2011) to identify whether the overall sentiment is an average of the arc (corresponding to Slow Thinking), or an average of the peak and end sentiments (corresponding to Fast Thinking).
  • Figure 2: Causal attribution in LLaMa-7B and Alpaca-7B, showing how much each sentence contributes to the prediction probability.
  • Figure 3: The $\lambda_1$-$\lambda_2$ density plots of C1 (above) and C2 (below).
  • Figure 4: The $\lambda_1$-$\lambda_2$ plot on Yelp-5 (left), Amazon (middle), and App Review (right). We draw the $y=x$ diagonal line, and the orange dots in the upper-left triangle represent the C1-dominant subset, and green dots in the lower-right triangle are the C2-dominant subset.
  • Figure 5: Four emotion arc clusters.
  • ...and 2 more figures