A Hybrid Theory and Data-driven Approach to Persuasion Detection with Large Language Models

Gia Bao Hoang, Keith J Ransom, Rachel Stephens, Carolyn Semmler, Nicolas Fay, Lewis Mitchell

TL;DR

The paper tackles scalable prediction of belief change in online discourse by integrating psychological theory with large-language-model (LLM) derived feature ratings. It introduces a hybrid framework that uses Truth Wins-based cognitive features, transformed into LLM-rated inputs, combined with belief-update signals in a Random Forest to predict persuasion success in the CMV Winning Arguments dataset. Hybrid models outperform theory-driven and pure data-driven baselines, with epistemic-emotion-related features (notably Interesting-If-True) and Shareable emerging as strong predictors; cross-LLM ratings correlate, though absolute scales differ. The work advances interpretable, theory-grounded persuasion analytics with potential applications in influence campaign detection and misinformation mitigation, while acknowledging limitations from dataset scope and LLM biases and suggesting avenues for future refinement.
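The observation that cross-LLM ratings correlate while their absolute scales differ is what a rank-based statistic is suited to capture. The following is a minimal sketch, not the authors' code, of the Spearman rank correlation in plain Python. The two rating lists are hypothetical values invented for illustration, and the function names are assumptions.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant ratings")
    return cov / (spread_x * spread_y)

def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    if len(x) != len(y):
        raise ValueError("rating lists must have the same length")
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical ratings of the same five arguments by two LLMs.
# The second model uses a higher part of the scale but orders them identically.
ratings_model_a = [1, 2, 2, 4, 5]
ratings_model_b = [3, 5, 5, 6, 7]

rho = spearman(ratings_model_a, ratings_model_b)
print(f"Spearman correlation: {rho:.3f}")
```

Because only the ordering of the ratings enters the calculation, two models can agree strongly under this measure even when one systematically assigns higher or lower scores than the other.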

Abstract

Traditional psychological models of belief revision focus on face-to-face interactions, but with the rise of social media, more effective models are needed to capture belief revision at scale in rich, text-based online discourse. Here, we use a hybrid approach, utilizing large language models (LLMs) to develop a model that predicts successful persuasion using features derived from psychological experiments. Our approach leverages LLM-generated ratings of features previously examined in the literature to build a random forest classification model that predicts whether a message will result in belief change. Of the eight features tested, epistemic emotion and willingness to share were the top-ranking predictors of belief change in the model. Our findings provide insights into the characteristics of persuasive messages and demonstrate how LLMs can enhance models of successful persuasion based on psychological theory. Given these insights, this work has broader applications in fields such as online influence detection and misinformation mitigation, as well as measuring the effectiveness of online narratives.

Paper Structure

This paper contains 25 sections, 4 figures, 1 table.

Figures (4)

  • Figure 1: Flowchart showing the main steps of our hybrid model approach.
  • Figure 2: Heatmap of LLM-generated ratings, showing the rating assigned by each LLM to each analysed feature (x-axis) for each argument in the Winning Arguments dataset (y-axis).
  • Figure 3: Spearman rank correlation between LLM-generated ratings. Each cell represents the correlation between the ratings generated by the given pair of LLMs (x-axis) for a given feature (y-axis).
  • Figure 4: Top 5 Most Important Features in the Hybrid Model (Random Forest), with Independent Terms shown on the top row and Interaction Terms on the bottom row. Each bar represents the mean feature importance across 100 permutations, and the error line indicates the standard deviation. For the Mixtral-based models, six features are shown due to a tie in importance scores for the fifth position.
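
The Figure 4 caption reports a mean and standard deviation of feature importance over 100 permutations. One common way to obtain such scores is permutation importance: shuffle one feature column, re-score the model, and record the drop in accuracy. The sketch below assumes that procedure and is not taken from the paper. The `predict` function is a hypothetical stand-in for a trained random forest, and the rows and labels are toy data invented for illustration.

```python
import random
from math import sqrt

def accuracy(predict, rows, labels):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(rows)

def permutation_importance(predict, rows, labels, n_repeats=100, seed=0):
    """Return a (mean, std) accuracy drop per feature over n_repeats shuffles."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    results = []
    for f in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle only column f, leaving every other feature untouched.
            column = [r[f] for r in rows]
            rng.shuffle(column)
            shuffled = [r[:f] + [v] + r[f + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(predict, shuffled, labels))
        mean = sum(drops) / n_repeats
        std = sqrt(sum((d - mean) ** 2 for d in drops) / n_repeats)
        results.append((mean, std))
    return results

# Toy data (invented): two rated features per argument, 1 = belief change.
rows = [[1, 5], [2, 3], [6, 4], [7, 1], [3, 6], [5, 2], [2, 7], [6, 3]]
labels = [0, 0, 1, 1, 0, 1, 0, 1]

def predict(row):
    """Hypothetical stand-in for a trained classifier; it reads feature 0 only."""
    return int(row[0] >= 4)

importances = permutation_importance(predict, rows, labels)
for index, (mean, std) in enumerate(importances):
    print(f"feature {index}: mean drop {mean:.3f}, std {std:.3f}")
```

In this toy setup the stand-in model ignores the second feature, so shuffling it never changes a prediction and its importance is zero, while shuffling the first feature lowers accuracy.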