Learning to Ask Good Questions: Ranking Clarification Questions using Neural Expected Value of Perfect Information
Sudha Rao, Hal Daumé III
TL;DR
The paper addresses how to rank clarification questions by their expected utility. It introduces a neural model based on expected value of perfect information (EVPI) that jointly learns question and answer representations along with a utility predictor. Candidate questions come from a Lucene-based generator, and the model is trained end-to-end on a StackExchange-derived dataset of roughly 77K (post, question, answer) triples. Empirical results show that the EVPI approach improves over baselines, most clearly when measured against expert judgments, and the authors release the dataset to support further research. The work points toward future directions in reinforcement learning and question generation, with practical implications for real-time clarification in user-facing systems.
Abstract
Inquiry is fundamental to communication, and machines cannot effectively collaborate with humans unless they can ask questions. In this work, we build a neural network model for the task of ranking clarification questions. Our model is inspired by the idea of expected value of perfect information: a good question is one whose expected answer will be useful. We study this problem using data from StackExchange, a plentiful online resource in which people routinely ask clarifying questions to posts so that they can better offer assistance to the original poster. We create a dataset of clarification questions consisting of ~77K posts paired with a clarification question (and answer) from three domains of StackExchange: askubuntu, unix and superuser. We evaluate our model on 500 samples of this dataset against expert human judgments and demonstrate significant improvements over controlled baselines.
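The EVPI idea in the abstract can be illustrated with a minimal sketch: a question is scored by summing, over candidate answers, the probability of each answer times the utility of the post once that answer is added. Everything below is an assumption made for illustration, not the authors' implementation: the function names `evpi` and `rank_questions`, the hand-written probability table, and the keyword-based utility all stand in for the learned neural components described in the paper.

```python
from typing import Callable, List, Tuple

# Hypothetical signatures standing in for the learned answer model and utility predictor.
AnswerProb = Callable[[str, str, str], float]  # (post, question, answer) -> probability
Utility = Callable[[str], float]               # updated post -> usefulness score


def evpi(post: str, question: str, answers: List[str],
         answer_prob: AnswerProb, utility: Utility) -> float:
    """Expected utility of asking `question`: sum over answers of P(a | p, q) * U(p + a)."""
    return sum(
        answer_prob(post, question, a) * utility(post + " " + a)
        for a in answers
    )


def rank_questions(post: str, questions: List[str], answers: List[str],
                   answer_prob: AnswerProb, utility: Utility) -> List[Tuple[str, float]]:
    """Score every candidate question and sort from highest to lowest EVPI."""
    scored = [(q, evpi(post, q, answers, answer_prob, utility)) for q in questions]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data (assumed): one post, two candidate questions, two candidate answers.
post = "How do I install this package?"
questions = [
    "Which version of Ubuntu are you using?",
    "What is your favourite colour?",
]
answers = ["I am using Ubuntu 16.04", "My favourite colour is blue"]

# Hand-written answer probabilities; the paper learns these with a neural model.
PROB_TABLE = {
    (questions[0], answers[0]): 0.9,
    (questions[0], answers[1]): 0.1,
    (questions[1], answers[0]): 0.1,
    (questions[1], answers[1]): 0.9,
}


def toy_answer_prob(post: str, question: str, answer: str) -> float:
    return PROB_TABLE[(question, answer)]


def toy_utility(updated_post: str) -> float:
    # Assumed utility: a post that states the OS version is more answerable.
    return 1.0 if "ubuntu 16.04" in updated_post.lower() else 0.2


ranked = rank_questions(post, questions, answers, toy_answer_prob, toy_utility)
for question, score in ranked:
    print(f"{score:.2f}  {question}")
```

In this toy setup the question about the Ubuntu version ranks first, because its likely answer adds information that the assumed utility function rewards, while the off-topic question mostly elicits an answer that leaves the post no more useful.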
