Can LLMs Speak For Diverse People? Tuning LLMs via Debate to Generate Controllable Controversial Statements

Ming Li, Jiuhai Chen, Lichang Chen, Tianyi Zhou

TL;DR

This work tackles the challenge of making LLMs controllable enough to generate statements that align with user-defined, potentially minority, stances on controversial topics. It introduces DEBATunE, a two-phase pipeline in which two LLM agents with opposing stances debate a topic to produce high-quality, stance-consistent statements, which are then used to fine-tune a controllable model. A 710-topic Debate Dataset provides ground-truth statements (7100 total) for training on both positive and negative stances. Evaluation with a GPT-4 judge on Controversy Controllability and Response Quality, together with a human study, shows that DEBATunE improves the ability to speak for diverse viewpoints and generalizes to unseen topics. Fine-tuning also improves instruction-following performance.
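
The 7100 total is consistent with the dataset layout given in the Figure 3 caption: 710 topics, two stances per topic, and five arguments per stance. The short Python sketch below checks that arithmetic, assuming each argument yields exactly one statement. The names `statement_slots`, `STANCES`, and `ARGUMENTS_PER_STANCE` are illustrative and do not come from the paper.

```python
from itertools import product
from typing import List, Tuple

# Counts taken from the summary and the Figure 3 caption.
NUM_TOPICS = 710
STANCES = ("positive", "negative")
ARGUMENTS_PER_STANCE = 5


def statement_slots(num_topics: int = NUM_TOPICS) -> List[Tuple[int, str, int]]:
    """Enumerate (topic index, stance, argument index) slots.

    Assumption: one statement per argument, so each slot is one statement.
    """
    return list(product(range(num_topics), STANCES, range(ARGUMENTS_PER_STANCE)))


slots = statement_slots()
print(len(slots))  # 710 * 2 * 5 = 7100
```

Each slot corresponds to one (argument, statement) pair, which is the unit a controllable model would be trained on.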

Abstract

Making LLMs speak for different, especially minority, groups of people and generate statements supporting their diverse or even controversial perspectives is critical to creating an inclusive environment. However, existing LLMs lack sufficient control over the stance of their generated content, which often contains inconsistent, neutral, or biased statements. In this paper, we improve the controllability of LLMs in generating statements that support an argument the user defines in the prompt. We find that multi-round debates between two LLMs with opposite stances generate higher-quality and more salient statements for each stance, which are important training data for improving the controllability of LLMs. Motivated by this, we develop a novel debate & tuning (DEBATunE) pipeline that fine-tunes LLMs to generate the statements obtained via debate. To examine DEBATunE, we curate the largest dataset of debate topics so far, which covers 710 controversial topics and corresponding arguments for each topic. Evaluations by a GPT-4 judge with a novel controversy controllability metric show that DEBATunE significantly improves LLMs' capability to generate diverse perspectives. Moreover, this controllability generalizes to unseen topics, yielding high-quality statements that support controversial arguments.

Paper Structure

This paper contains 25 sections, 5 equations, 5 figures, and 4 tables.

Figures (5)

  • Figure 1: The pipeline of DEBATunE. In the Debate phase (top), the agents are prompted to debate upon the given topic with an argument. After several rounds of debate, an agent (positive in the example) concludes the debate based on all the previous debate records. The conclusion is a more salient, detailed, and higher-quality statement for the agent. It will be used to train an LLM in the Training phase (bottom) to improve the controllability of generating statements for the given stance (positive in the example).
  • Figure 2: Comparing existing LLMs and DEBATunE-trained LLMs. (a, c) Given the controversial topic "Organ donation should be mandatory" and a user-defined stance (left: positive; right: negative), Vicuna 7B v1.5 cannot always generate consistent statements supporting the stance and lacks controllability. It exhibits a bias towards the positive stance and ignores the user's negative stance and religious concerns in (c), which may lead to an offensive statement. (b, d) In contrast, the DEBATunE-trained model generates higher-quality, stronger statements that strictly adhere to the user's stance (positive or negative).
  • Figure 3: Structure of our debate dataset. There are $710$ controversial debate topics. Each topic $t$ allows a positive stance $p_t$ and a negative stance $n_t$, where $p_t$ agrees with the topic and $n_t$ is against it. We use gpt-3.5-turbo-1106 to generate 5 one-sentence arguments supporting each stance, e.g., $a_{p_{t}, i}$ is the $i$-th argument for the positive stance on topic $t$. Given an argument $a_{p_{t}, i}$, a controllable LLM is expected to generate a supporting statement $s_{p_{t}, i}$ with detailed explanations and evidence.
  • Figure 4: The prompt to evaluate the Response Quality.
  • Figure 5: The prompt to evaluate the Controversy Controllability.
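
The Debate phase described in the Figure 1 caption (alternating turns, then a conclusion written from the full debate record) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Agent` signature, the number of rounds, the prompt layout, and the names `run_debate`, `TrainingExample`, and `echo_agent` are all hypothetical, and the stub agent stands in for a real LLM call.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical agent interface: (stance, topic, argument, transcript) -> reply.
Agent = Callable[[str, str, str, List[str]], str]


@dataclass
class TrainingExample:
    prompt: str    # topic, stance, and argument given to the model being tuned
    response: str  # concluding statement obtained from the debate


def run_debate(
    topic: str,
    pos_argument: str,
    neg_argument: str,
    agent: Agent,
    rounds: int = 2,  # assumption: the round count is not given in this section
    concluding_stance: str = "positive",
) -> Tuple[TrainingExample, List[str]]:
    """Alternate positive and negative turns, then let one side conclude."""
    transcript: List[str] = []
    for _ in range(rounds):
        transcript.append("positive: " + agent("positive", topic, pos_argument, transcript))
        transcript.append("negative: " + agent("negative", topic, neg_argument, transcript))

    # The concluding agent sees every previous turn of the debate.
    argument = pos_argument if concluding_stance == "positive" else neg_argument
    conclusion = agent(concluding_stance, topic, argument, transcript)

    # Assumed prompt layout for the Training phase.
    prompt = f"Topic: {topic}\nStance: {concluding_stance}\nArgument: {argument}"
    return TrainingExample(prompt=prompt, response=conclusion), transcript


def echo_agent(stance: str, topic: str, argument: str, transcript: List[str]) -> str:
    """Stub standing in for an LLM call; it only restates its argument."""
    return f"[{stance}] turn {len(transcript) + 1} on '{topic}': {argument}"
```

Calling `run_debate("Organ donation should be mandatory", "It saves lives.", "It overrides personal choice.", echo_agent)` returns one training example together with a four-turn transcript. In the Training phase, pairs like this would serve as the fine-tuning data.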