Aligning Large Language Models with Human Opinions through Persona Selection and Value--Belief--Norm Reasoning

Do Xuan Long, Kenji Kawaguchi, Min-Yen Kan, Nancy F. Chen

TL;DR

Chain-of-Opinion (COO) is a simple four-step method, inspired by the Value--Belief--Norm (VBN) theory, that models which personae to use and how to reason with them. It achieves new state-of-the-art opinion prediction via prompting with only 5 inference calls.

Abstract

Reasoning about and predicting human opinions with large language models (LLMs) is essential yet challenging. Current methods employ role-playing with personae but face two major issues: LLMs are sensitive to even a single irrelevant persona, which can skew predictions by up to 30%, and LLMs fail to reason strategically over personae. We propose Chain-of-Opinion (COO), a simple four-step solution, inspired by the Value--Belief--Norm (VBN) theory, that models which personae to use and how to reason with them. COO differentiates between explicit personae (demographics and ideology) and implicit personae (historical opinions), and involves: (1) filtering irrelevant attributes from explicit personae, (2) ranking implicit personae into a preferential list for selecting the top-k, (3) applying novel VBN reasoning to extract the user's environmental and personal value, belief, and norm variables for accurate and reliable predictions, and (4) iterating VBN reasoning with progressively larger lists of implicit personae to handle potential persona insufficiency. COO efficiently achieves new state-of-the-art opinion prediction via prompting with only 5 inference calls, improving on prior techniques by up to 4%. Notably, fine-tuning LMs with COO data results in significantly better opinion-aligned models, by up to 23%.
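
The four steps in the abstract can be sketched as a small prompting pipeline. This is only an illustrative sketch, not the authors' implementation: the `llm` callable, the prompt wording, the `k_schedule` values, and the majority-vote aggregation are all assumptions introduced here. With three values of k, the sketch happens to make five calls (one filter, one rank, three reasoning passes), which is one plausible way to arrive at the five inference calls the abstract mentions.

```python
from collections import Counter
from typing import Callable, List, Sequence

# Hypothetical LLM interface: takes a prompt string, returns the completion text.
LLM = Callable[[str], str]


def chain_of_opinion(
    llm: LLM,
    question: str,
    explicit: Sequence[str],   # explicit personae: demographic/ideology attributes
    implicit: Sequence[str],   # implicit personae: the user's historical opinions
    k_schedule: Sequence[int] = (8, 12, 16),  # assumed values, not from the source
) -> str:
    # Step 1: filter irrelevant attributes from the explicit personae (1 call).
    filtered = llm(
        "FILTER: keep only the attributes relevant to the question.\n"
        f"Question: {question}\nAttributes: {'; '.join(explicit)}"
    )

    # Step 2: rank the implicit personae by usefulness for the question (1 call).
    numbered = "\n".join(f"{i}: {op}" for i, op in enumerate(implicit))
    reply = llm(
        "RANK: order the opinions by relevance; return comma-separated indices.\n"
        f"Question: {question}\n{numbered}"
    )
    order = [int(tok) for tok in reply.split(",") if tok.strip().isdigit()]
    ranked: List[str] = [implicit[i] for i in order if 0 <= i < len(implicit)]

    # Steps 3-4: VBN reasoning, repeated with progressively larger top-k lists
    # (one call per value of k).
    answers = []
    for k in k_schedule:
        top_k = "\n".join(ranked[:k])
        answers.append(
            llm(
                "VBN: infer the user's values, beliefs, and norms, then answer.\n"
                f"Question: {question}\nAttributes: {filtered}\nOpinions:\n{top_k}"
            ).strip()
        )

    # Aggregation by majority vote is an assumption of this sketch.
    return Counter(answers).most_common(1)[0][0]
```

Any function that maps a prompt string to a completion string can be passed as `llm`, so the same sketch works with a hosted API client or a local model wrapper.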

Paper Structure (74 sections, 3 equations, 8 figures, 12 tables)

Figures (8)

  • Figure 1: COO overview with four main steps marked with the nuts. It processes explicit and implicit personae in parallel to handle scenarios in which personae are missing.
  • Figure 2: Consistency scores of the baseline DIO-top8 (ChatGPT) with COO and CoT.
  • Figure 3: ChatGPT and Mistral-7B-Instruct-v0.2 overlap coefficient values for different values of $K$. We observe that when $K$ is large enough ($K \geq 8$), the coefficient value is relatively acceptable ($\geq 0.6$).
  • Figure 4: Left / Middle / Right: Ranking agreements between ChatGPT top-$K$ / ChatGPT-it / Mistral orders and semantic similarity orders. One example that has a high disagreement score is shown in the appendix.
  • Figure 5: FEA example with ChatGPT.
  • ...and 3 more figures
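
The overlap coefficient mentioned in the Figure 3 caption can be illustrated with a short sketch. It assumes the standard Szymkiewicz-Simpson definition (intersection size divided by the size of the smaller set); the page does not state which definition the paper uses, and the function name and inputs below are hypothetical.

```python
from typing import Hashable, Iterable


def overlap_coefficient(a: Iterable[Hashable], b: Iterable[Hashable]) -> float:
    """Szymkiewicz-Simpson overlap: |A & B| / min(|A|, |B|) (assumed definition)."""
    set_a, set_b = set(a), set(b)
    if not set_a or not set_b:
        return 0.0  # convention chosen for this sketch when either set is empty
    return len(set_a & set_b) / min(len(set_a), len(set_b))


# Two hypothetical top-4 selections sharing two items.
print(overlap_coefficient([1, 2, 3, 4], [3, 4, 5, 6]))  # 0.5
```

When both models select exactly $K$ items, this reduces to the number of shared items divided by $K$, so a value of 0.6 at $K = 8$ would correspond to roughly five shared items.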