Active Task Disambiguation with LLMs

Katarzyna Kobalczyk, Nicolas Astorga, Tennison Liu, Mihaela van der Schaar

TL;DR

This work addresses how large language models handle ambiguously specified tasks by introducing a formal notion of task ambiguity and a Bayesian Experimental Design (BED) based framework for active task disambiguation. The method explicitly samples candidate solutions and candidate clarifying questions, then selects the question that maximizes the expected information gain while accounting for cost, thereby concentrating the solution distribution ${p}_{\phi_h}(\cdot|{\mathcal S})$ toward the true viable set ${\mathcal H}^*$. Empirical results from a 20-questions game and from open-ended code-generation tasks show that BED-based question generation (notably EIG-uniform) significantly outperforms baselines that rely on implicit reasoning about questions, with open questions generally offering higher information gains than yes/no questions. The findings suggest that shifting some reasoning to explicit evaluation over the space of candidate solutions improves task disambiguation and that this approach has broad applicability to interactive AI systems requiring user-guided clarification and more reliable outputs.
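The selection step described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the sampled candidate solutions are weighted uniformly, that each candidate determines one answer to each question, and that cost is subtracted from the information gain. The names `eig_uniform`, `select_question`, and the toy animal table are hypothetical.

```python
import math
from collections import Counter
from typing import Callable, Dict, Hashable, Optional, Sequence


def eig_uniform(solutions: Sequence, answer_fn: Callable[[object], Hashable]) -> float:
    """Entropy (in bits) of the answers a question induces over the sampled
    solutions, with every sample weighted equally (assumption)."""
    counts = Counter(answer_fn(h) for h in solutions)
    n = len(solutions)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def select_question(
    solutions: Sequence,
    questions: Dict[str, Callable[[object], Hashable]],
    cost: Optional[Dict[str, float]] = None,
) -> str:
    """Return the question text with the highest (information gain - cost)."""
    cost = cost or {}
    return max(
        questions,
        key=lambda q: eig_uniform(solutions, questions[q]) - cost.get(q, 0.0),
    )


# Hypothetical toy data for a 20-questions style game.
ANIMALS = {
    "cat": {"mammal": True, "flies": False, "legs": 4},
    "dog": {"mammal": True, "flies": False, "legs": 4},
    "eagle": {"mammal": False, "flies": True, "legs": 2},
    "shark": {"mammal": False, "flies": False, "legs": 0},
}
candidates = list(ANIMALS)

QUESTIONS = {
    "Is it a mammal?": lambda h: ANIMALS[h]["mammal"],      # yes/no question
    "Can it fly?": lambda h: ANIMALS[h]["flies"],           # yes/no question
    "How many legs does it have?": lambda h: ANIMALS[h]["legs"],  # open question
}

best = select_question(candidates, QUESTIONS)
```

In this toy pool the open question splits the candidates into three groups, so it scores higher than either yes/no question, which can split them into at most two.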

Abstract

Despite the impressive performance of large language models (LLMs) across various benchmarks, their ability to address ambiguously specified problems--frequent in real-world interactions--remains underexplored. To address this gap, we introduce a formal definition of task ambiguity and frame the problem of task disambiguation through the lens of Bayesian Experimental Design. By posing clarifying questions, LLM agents can acquire additional task specifications, progressively narrowing the space of viable solutions and reducing the risk of generating unsatisfactory outputs. Yet, generating effective clarifying questions requires LLM agents to engage in a form of meta-cognitive reasoning, an ability LLMs may presently lack. Our proposed approach of active task disambiguation enables LLM agents to generate targeted questions maximizing the information gain. Effectively, this approach shifts the load from implicit to explicit reasoning about the space of viable solutions. Empirical results demonstrate that this form of question selection leads to more effective task disambiguation in comparison to approaches relying on reasoning solely within the space of questions.

Paper Structure

This paper contains 28 sections, 1 theorem, 10 equations, 8 figures, 5 tables, 4 algorithms.

Key Result

Corollary 1

Let ${\mathcal{H}}$ be the set of solutions compatible with the requirements ${\mathcal{R}}$ within the problem statement ${\mathcal{S}}$. Suppose that $p^*(\cdot \vert {\mathcal{S}})$ is uniform on ${\mathcal{H}}$. Let ${\mathcal{Q}}_n$ be the set of all questions with exactly $n$ possible answers
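As an illustration of this setup only (the conclusion of the corollary is not reproduced above), consider two additional assumptions that are not stated in the excerpt: ${\mathcal{H}}$ is finite, and every solution $h \in {\mathcal{H}}$ determines exactly one answer to a question $q \in {\mathcal{Q}}_n$ with answers $a_1, \dots, a_n$. Then $q$ partitions ${\mathcal{H}}$ into subsets ${\mathcal{H}}_{a_i}$ (notation introduced here for illustration), and under the uniform $p^*(\cdot \vert {\mathcal{S}})$ the expected information gain reduces to the entropy of the answer:

$$
\mathrm{EIG}(q) = -\sum_{i=1}^{n} \frac{|{\mathcal{H}}_{a_i}|}{|{\mathcal{H}}|} \log \frac{|{\mathcal{H}}_{a_i}|}{|{\mathcal{H}}|} \le \log n,
$$

with the convention $0 \log 0 = 0$ and with equality exactly when the $n$ subsets have equal size. For example, with $|{\mathcal{H}}| = 4$ and a three-answer question splitting the solutions into groups of sizes $2, 1, 1$, the gain is $\tfrac{1}{2}\log 2 + 2 \cdot \tfrac{1}{4}\log 4 = 1.5$ bits (base-2 logarithm), below the bound $\log_2 3 \approx 1.58$.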

Figures (8)

  • Figure 1: An ambiguous problem statement, a sample of LLM-generated compatible solutions, and clarifying questions. $\blacktriangleright$ The goal: Generate the most informative question.
  • Figure 2: Resolving ambiguity.
  • Figure 3: Active task disambiguation. Starting from $t=0$, the problem statement ${\mathcal{S}}^t$ is presented to the problem-solving agent. The agent reasons about the problem to infer the set of solutions ${\mathcal{H}}^t$ compatible with the requirements of ${\mathcal{S}}^t$. In order to approximate ${\mathcal{H}}^t$, a set of candidate solutions $\{h_i^t\}$ is sampled. To discern between different solution variants, the agent generates candidate questions $\{q_j^t\}$. A question $q^*$ with the highest utility is selected and presented to the oracle. Based on the oracle answer, $a^*$, the problem statement is extended by the new specification defined through $(q^*, a^*)$; the process can be repeated with the updated problem statement ${\mathcal{S}}^{t+1} = {\mathcal{S}}^{t} \cup (q^*, a^*)$ resulting in a reduced space of compatible solutions, ${\mathcal{H}}^{t+1} \subset {\mathcal{H}}^t$.
  • Figure 4: Comparison of question-generating strategies on the game of 20 questions. Rankings averaged across 15 ground-truth animals, 5 run seeds, 25 evaluation seeds. See Appendix \ref{appdx:20q-results} for results with the Llama family models.
  • Figure 5: Number of valid solutions after each iteration.
  • ...and 3 more figures
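The iterative loop described in the Figure 3 caption can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's algorithm: a fixed finite pool of candidates stands in for LLM-sampled solutions (which would normally be re-sampled at every step), the candidate questions are given up front, and the utility is the answer entropy over the current pool. The names `disambiguate`, `answer_entropy`, and the sorting example are hypothetical.

```python
import math
from collections import Counter


def answer_entropy(pool, ask):
    """Utility stand-in: entropy (bits) of the answers `ask` induces on the pool."""
    counts = Counter(ask(h) for h in pool)
    return -sum(c / len(pool) * math.log2(c / len(pool)) for c in counts.values())


def disambiguate(pool, questions, oracle, max_turns=5):
    """Ask the highest-utility question, then keep only compatible candidates.

    pool:      candidate solutions approximating H^t (assumed fixed here)
    questions: dict mapping question text -> function(solution) -> answer
    oracle:    function(question text) -> the user's answer a*
    """
    spec = []                    # accumulated (q*, a*) pairs extending S^t
    remaining = dict(questions)  # questions not yet asked
    for _ in range(max_turns):
        if len(pool) <= 1 or not remaining:
            break
        q_star = max(remaining, key=lambda q: answer_entropy(pool, remaining[q]))
        a_star = oracle(q_star)
        ask = remaining.pop(q_star)
        # H^{t+1}: candidates consistent with the new specification (q*, a*)
        pool = [h for h in pool if ask(h) == a_star]
        spec.append((q_star, a_star))
    return pool, spec


# Hypothetical ambiguous task: "write a function that sorts a list".
variants = [
    {"order": "asc", "in_place": True},
    {"order": "asc", "in_place": False},
    {"order": "desc", "in_place": True},
    {"order": "desc", "in_place": False},
]
questions = {
    "Ascending or descending order?": lambda h: h["order"],
    "Should the list be modified in place?": lambda h: h["in_place"],
}
user_intent = {
    "Ascending or descending order?": "desc",
    "Should the list be modified in place?": False,
}

final_pool, spec = disambiguate(variants, questions, oracle=user_intent.get)
```

Each answered question shrinks the pool, mirroring ${\mathcal{H}}^{t+1} \subset {\mathcal{H}}^t$; here two questions suffice to isolate the single variant matching the user's intent.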

Theorems & Definitions (3)

  • Example 1: Code generation
  • Definition 1: Task ambiguity
  • Corollary 1