Modeling Future Conversation Turns to Teach LLMs to Ask Clarifying Questions

Michael J. Q. Zhang; W. Bradley Knox; Eunsol Choi

TL;DR

Ambiguity in user requests significantly challenges LLMs. The authors introduce double-turn preference labeling by simulating future turns to train LLMs to ask clarifying questions and tailor final answers to each interpretation. An automatic evaluation framework using simulated users on open-domain QA demonstrates consistent improvements in $F_1$ (about $5\%$) and in determining when clarification is needed (about $3\%$ accuracy). The results show that modeling future turns via double-turn preferences yields more effective and efficient clarifying interactions, with code and data released to support further research.

Abstract

Large language models (LLMs) must often respond to highly ambiguous user requests. In such cases, the LLM's best response may be to ask a clarifying question to elicit more information. Existing LLMs often respond by presupposing a single interpretation of such ambiguous requests, frustrating users who intended a different interpretation. We speculate this is caused by current preference data labeling practice, where LLM responses are evaluated only on their prior contexts. To address this, we assign preference labels to responses by simulating their expected outcomes in future turns. This allows LLMs to learn to ask clarifying questions when they can generate responses that are tailored to each user interpretation in future turns. On open-domain QA datasets with multiple annotations, we evaluate systems based on their ability to ask clarifying questions to recover each user's interpretation and expected answer. We compare systems trained using our proposed preference labeling methods against standard methods, which assign preferences based only on prior context. Our method achieves a 5% improvement in F1 measured against the answer set from different interpretations of each query, showing the value of modeling future conversation turns. We further demonstrate that our method can be used to train models to judiciously determine when to ask clarifying questions, directly answering the question when clarification is unnecessary. In our experiments, we find that our method achieves a 3% improvement in the accuracy of such judgments over existing methods.
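The idea of labeling preferences by simulated future turns can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`token_f1`, `expected_outcome`, `double_turn_preference`), the toy simulated user, the toy follow-up answerer, and the example query are all assumptions made for illustration. A candidate first-turn response is scored by the average F1 of the final answer it leads to across every interpretation of the query, and the candidate with the higher expected outcome is labeled as preferred.

```python
from collections import Counter


def token_f1(prediction: str, gold: str) -> float:
    """Token-level F1 between a predicted answer and a gold answer."""
    pred_tokens = prediction.lower().split()
    gold_tokens = gold.lower().split()
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def expected_outcome(response, interpretations, simulate_user, answer_after):
    """Average final-answer F1 over all interpretations of the query.

    A direct answer is scored as-is. A clarifying question is rolled out one
    more turn: a simulated user replies, then the model gives a final answer.
    """
    scores = []
    for interpretation, gold in interpretations.items():
        if response["is_question"]:
            reply = simulate_user(response["text"], interpretation)
            final_answer = answer_after(response["text"], reply)
        else:
            final_answer = response["text"]
        scores.append(token_f1(final_answer, gold))
    return sum(scores) / len(scores)


def double_turn_preference(resp_a, resp_b, interpretations,
                           simulate_user, answer_after):
    """Return (preferred, rejected) based on simulated future-turn outcomes."""
    score_a = expected_outcome(resp_a, interpretations, simulate_user, answer_after)
    score_b = expected_outcome(resp_b, interpretations, simulate_user, answer_after)
    return (resp_a, resp_b) if score_a >= score_b else (resp_b, resp_a)


# Toy data (hypothetical): one ambiguous query with two interpretations.
# Query: "When was Harry Potter and the Philosopher's Stone released?"
interpretations = {"the book": "1997", "the film": "2001"}


def simulate_user(question: str, interpretation: str) -> str:
    # Toy simulated user: simply states the interpretation they intended.
    return interpretation


def answer_after(question: str, user_reply: str) -> str:
    # Toy stand-in for the model's second-turn answer.
    return interpretations[user_reply]


direct = {"text": "1997", "is_question": False}
clarify = {"text": "Do you mean the book or the film?", "is_question": True}

direct_score = expected_outcome(direct, interpretations, simulate_user, answer_after)
clarify_score = expected_outcome(clarify, interpretations, simulate_user, answer_after)
preferred, rejected = double_turn_preference(
    direct, clarify, interpretations, simulate_user, answer_after
)
```

In this toy setting the direct answer satisfies only one of the two interpretations, while the clarifying question lets the final answer match both, so the clarifying question is the preferred response. A labeler that looks only at the prior context has no such signal, which is the gap the abstract describes.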
