STaR-GATE: Teaching Language Models to Ask Clarifying Questions

Chinmaya Andukuri, Jan-Philipp Fränken, Tobias Gerstenberg, Noah D. Goodman

TL;DR

Prompted language models struggle to give personalized responses when a task is ambiguous and the user's preferences are unknown. STaR-GATE introduces a self-improving elicitation loop that trains a Questioner to elicit useful preferences from a Roleplayer, keeping the questions that raise the probability of a gold response produced by an Oracle with access to the persona. Using a synthetic dataset of 25,500 persona-task prompts and two training iterations, the finetuned Questioner asks better questions and produces responses that GPT-4 prefers over the initial model's on about 72% of tasks. Ablations show that response regularization is needed to preserve the model's ability to generate answers, while finetuning directly on gold responses leads to hallucinations in generated answers. Overall, the paper shows that teaching LMs to ask better questions yields more personalized responses, and that the gains generalize beyond the roleplayer used during training.

Abstract

When prompting language models to complete a task, users often leave important aspects unsaid. While asking questions could resolve this ambiguity (GATE; Li et al., 2023), models often struggle to ask good questions. We explore a language model's ability to self-improve (STaR; Zelikman et al., 2022) by rewarding the model for generating useful questions, a simple method we dub STaR-GATE. We generate a synthetic dataset of 25,500 unique persona-task prompts to simulate conversations between a pretrained language model (the Questioner) and a Roleplayer whose preferences are unknown to the Questioner. By asking questions, the Questioner elicits preferences from the Roleplayer. The Questioner is iteratively finetuned on questions that increase the probability of high-quality responses to the task, which are generated by an Oracle with access to the Roleplayer's latent preferences. After two iterations of self-improvement, the Questioner asks better questions, allowing it to generate responses that are preferred over responses from the initial model on 72% of tasks. Our results indicate that teaching a language model to ask better questions leads to better personalized responses.
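
The filtering step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Conversation` type, the `filter_conversations` helper, the `top_k` parameter, and the toy log-probability values are all hypothetical. In a real pipeline the score would be the log probability of the Oracle's gold response given the task and the elicited conversation; here a lookup table stands in for that language-model scoring call.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Conversation:
    """Hypothetical container for one simulated Questioner/Roleplayer dialogue."""
    task_id: str
    turns: Tuple[str, ...]  # alternating questions and answers


def filter_conversations(
    conversations: List[Conversation],
    gold_logprob: Callable[[Conversation], float],
    top_k: int = 1,
) -> List[Conversation]:
    """Keep, for each task, the top_k conversations with the highest
    gold-response log probability. top_k is an assumed hyperparameter."""
    by_task: Dict[str, List[Conversation]] = defaultdict(list)
    for conv in conversations:
        by_task[conv.task_id].append(conv)

    kept: List[Conversation] = []
    for group in by_task.values():
        # Higher (less negative) log probability means the conversation
        # made the gold response more likely.
        ranked = sorted(group, key=gold_logprob, reverse=True)
        kept.extend(ranked[:top_k])
    return kept


# Toy data: two sampled conversations for task-1, one for task-2.
convs = [
    Conversation("task-1", ("Q: Any dietary restrictions?", "A: I am vegetarian.")),
    Conversation("task-1", ("Q: What is your favorite color?", "A: Blue.")),
    Conversation("task-2", ("Q: How much time do you have?", "A: About 20 minutes.")),
]

# Made-up scores standing in for a model's log p(gold response | task, conversation).
toy_scores = {convs[0]: -12.5, convs[1]: -30.0, convs[2]: -8.0}

selected = filter_conversations(convs, toy_scores.__getitem__, top_k=1)
for conv in selected:
    print(conv.task_id, conv.turns[0])
```

The selected conversations would then serve as finetuning data for the next iteration of the Questioner. Ranking within each task, rather than across the whole dataset, is a design choice of this sketch that keeps every task represented in the training set.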

Paper Structure (23 sections, 1 equation, 17 figures, 1 algorithm)

Figures (17)

  • Figure 1: Problem Illustration. When user preferences are unknown, language models may respond ineffectively. By asking questions, models can elicit information and provide more effective responses.
  • Figure 2: Overview of STaR-GATE. A task is given to a Questioner who elicits preferences from a Roleplayer whose persona is unknown to the Questioner. The resulting conversations are then filtered based on the log probability of a gold response generated by an Oracle which has access to the Roleplayer's persona (omitted from the diagram for clarity). We then fine-tune the Questioner on the filtered questions. Moreover, to avoid distribution shift, we regularize the Questioner by additionally sampling responses conditioned on the filtered conversations. In our ablations, we contrast fine-tuning on sampled responses with fine-tuning on the gold responses.
  • Figure 3: Win Rates Against Initial Model. [a] Complete method and ablations: w/o Reg. refers to finetuning on questions only, which decreases the model's ability to generate answers. w/ Gold Resp. refers to finetuning directly on the gold responses rather than self-generated responses, which leads to hallucinations in generated answers. [b] Roleplayer generalization results. We demonstrate that STaR-GATE generalizes beyond the roleplayer it was trained against (mixtral-8x7b). All three roleplayers correspond to the instruct version of their respective models. Error bars represent the standard error of the mean ($\pm$ SEM). We include $0.5$ (chance) as a reference point for iteration $t=0$.
  • Figure 4: Log Probability of Gold Responses. Log probabilities of gold responses increase over iterations for both [a] STaR-GATE and [b] STaR-GATE w/o Regularization. Error bars correspond to $\pm$ SEM calculated across held-out persona-task prompts.
  • Figure 5: Additional Ablation Results. [a] Log probability of gold responses for STaR-GATE w/ gold response. [b] Win Rates for STaR-GATE using $Q_{BASE}$ to generate responses at each iteration. Error bars correspond to $\pm$ SEM.
  • ...and 12 more figures
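
The win rates and $\pm$ SEM error bars mentioned in the captions above can be computed along the following lines. This is a sketch under assumptions: the `win_rate_with_sem` helper and the example outcomes are hypothetical, and using the sample standard deviation (n - 1 in the denominator) is one common convention that the captions do not confirm.

```python
import math
import statistics
from typing import Sequence, Tuple


def win_rate_with_sem(outcomes: Sequence[int]) -> Tuple[float, float]:
    """Return (win rate, standard error of the mean) for binary outcomes,
    where 1 means the finetuned model's response was preferred and 0 means
    the initial model's response was preferred."""
    n = len(outcomes)
    if n < 2:
        raise ValueError("need at least two outcomes to estimate the SEM")
    mean = statistics.fmean(outcomes)
    # SEM = sample standard deviation / sqrt(n); the n - 1 convention is assumed.
    sem = statistics.stdev(outcomes) / math.sqrt(n)
    return mean, sem


# Made-up judgments for eight held-out tasks (not data from the paper).
example_outcomes = [1, 1, 1, 0, 1, 0, 1, 1]
rate, sem = win_rate_with_sem(example_outcomes)
print(f"win rate = {rate:.3f} +/- {sem:.3f} (chance = 0.5)")
```

A win rate whose error bar sits clearly above the 0.5 reference line indicates that the judge prefers the finetuned model more often than chance would predict.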