Persuasiveness of Generated Free-Text Rationales in Subjective Decisions: A Case Study on Pairwise Argument Ranking

Mohamed Elaraby; Diane Litman; Xiang Lorraine Li; Ahmed Magooda

TL;DR

This work analyzes generated free-text rationales in tasks with subjective answers and suggests that open-source LLMs, particularly Llama2-70B-chat, are capable of providing highly persuasive rationalizations, surpassing even GPT models.

Abstract

Generating free-text rationales is among the emergent capabilities of Large Language Models (LLMs). These rationales have been found to enhance LLM performance across various NLP tasks. Recently, there has been growing interest in using these rationales to provide insights for various important downstream tasks. In this paper, we analyze generated free-text rationales in tasks with subjective answers, emphasizing the importance of rationalization in such scenarios. We focus on pairwise argument ranking, a highly subjective task with significant potential for real-world applications, such as debate assistance. We evaluate the persuasiveness of rationales generated by nine LLMs to support their subjective choices. Our findings suggest that open-source LLMs, particularly Llama2-70B-chat, are capable of providing highly persuasive rationalizations, surpassing even GPT models. Additionally, our experiments show that rationale persuasiveness can be improved by controlling its parameters through prompting or through self-refinement.

Paper Structure (48 sections, 1 equation, 12 figures, 9 tables)

Introduction
Related Work
Argument Quality Ranking
LLMs for Argument Quality Ranking
Evaluating Free-Text Rationalization
Persuasiveness in LLMs
Experimental Settings
Datasets
Models
Considered LLMs
Prompting LLMs for Ranking Arguments and Generating Rationales
Rationale Evaluation
Basic-Form Evaluation
Content Evaluation
Persuasiveness Evaluation
...and 33 more sections

Figures (12)

Figure 1: Given two arguments with the same stance on a topic, the model selects the higher quality argument and generates a convincing rationale. We analyze the persuasiveness of these rationales.
Figure 2: For the input argument pair and rationale, we filter out invalid or repetitive rationales (Section \ref{subsec:basic_form}). The qualified rationales are then analyzed based on their content (Section \ref{subsection:content_eval}) and persuasiveness (Section \ref{subsec:persuasive_eval}).
Figure 3: Persuasion Ranking vs F-score
Figure 4: Shapley values of each feature. The higher the value, the higher the impact on average persuasiveness rank.
Figure 5: Contrast and Novelty % in different categories of rationale rating.
...and 7 more figures
