Persuasiveness of Generated Free-Text Rationales in Subjective Decisions: A Case Study on Pairwise Argument Ranking

Mohamed Elaraby, Diane Litman, Xiang Lorraine Li, Ahmed Magooda

TL;DR

This work analyzes generated free-text rationales in tasks with subjective answers and suggests that open-source LLMs, particularly Llama2-70B-chat, are capable of providing highly persuasive rationalizations, surpassing even GPT models.

Abstract

Generating free-text rationales is among the emergent capabilities of Large Language Models (LLMs). These rationales have been found to enhance LLM performance across various NLP tasks. Recently, there has been growing interest in using these rationales to provide insights for various important downstream tasks. In this paper, we analyze generated free-text rationales in tasks with subjective answers, emphasizing the importance of rationalization in such scenarios. We focus on pairwise argument ranking, a highly subjective task with significant potential for real-world applications, such as debate assistance. We evaluate the persuasiveness of rationales generated by nine LLMs to support their subjective choices. Our findings suggest that open-source LLMs, particularly Llama2-70B-chat, are capable of providing highly persuasive rationalizations, surpassing even GPT models. Additionally, our experiments show that rationale persuasiveness can be improved by controlling its parameters through prompting or through self-refinement.

Paper Structure (48 sections, 1 equation, 12 figures, 9 tables)

This paper contains 48 sections, 1 equation, 12 figures, 9 tables.

Figures (12)

  • Figure 1: Given two arguments with the same stance on a topic, the model selects the higher quality argument and generates a convincing rationale. We analyze the persuasiveness of these rationales.
  • Figure 2: For the input argument pair and rationale, we filter out invalid or repetitive rationales (Section \ref{subsec:basic_form}). The qualified rationales are then analyzed based on their content (Section \ref{subsection:content_eval}) and persuasiveness (Section \ref{subsec:persuasive_eval}).
  • Figure 3: Persuasion Ranking vs F-score
  • Figure 4: Shapley values of each feature. The higher the value, the higher the impact on average persuasiveness rank.
  • Figure 5: Contrast and Novelty % in different categories of rationale rating.
  • ...and 7 more figures