CLARA: Classifying and Disambiguating User Commands for Reliable Interactive Robotic Agents

Jeongeun Park, Seungwon Lim, Joonhyung Lee, Sangbeom Park, Minsuk Chang, Youngjae Yu, Sungjoon Choi

TL;DR

CLARA addresses the reliability gap in interpreting natural-language commands for interactive robots by quantifying LLM uncertainty and incorporating robotic situational awareness. It introduces context-sampling and uncertainty-aware prompting to distinguish certain from uncertain commands, followed by a zero-shot feasibility check that splits uncertain commands into ambiguous or infeasible; ambiguous commands are disambiguated through user questions. The SaGC dataset provides scene-grounded labels for evaluating situation-aware uncertainty across multiple robot types and tasks. Across SaGC, tabletop pick-and-place, and real-world handover experiments, CLARA improves uncertainty quantification and command classification accuracy, reducing malfunction risk and enhancing human-robot interaction in practical settings.

Abstract

In this paper, we focus on inferring whether a given user command is clear, ambiguous, or infeasible in the context of interactive robotic agents that use large language models (LLMs). To tackle this problem, we first present an uncertainty estimation method for LLMs that classifies whether the command is certain (i.e., clear) or not (i.e., ambiguous or infeasible). Once a command is classified as uncertain, we further classify it as ambiguous or infeasible by leveraging LLMs with situationally aware context in a zero-shot manner. For ambiguous commands, we disambiguate the command by interacting with users via LLM-based question generation. We believe that proper recognition of the given commands could reduce malfunctions and undesired actions of the robot, enhancing the reliability of interactive robotic agents. We present a dataset for robotic situational awareness consisting of high-level commands, scene descriptions, and command-type labels (i.e., clear, ambiguous, or infeasible). We validate the proposed method on the collected dataset and in a pick-and-place tabletop simulation. Finally, we demonstrate the proposed approach in real-world human-robot interaction experiments, i.e., handover scenarios.
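The two-stage classification described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the disagreement-based uncertainty score, the threshold value, and the names `predictive_uncertainty`, `classify_command`, `llm`, and `feasibility_llm` are all hypothetical, and the two LLM calls are passed in as plain callables so the sketch runs without any model.

```python
import random
from collections import Counter
from typing import Callable, Optional, Sequence, Tuple


def predictive_uncertainty(responses: Sequence[str]) -> float:
    """Assumed uncertainty score: 1 minus the share of the most common response."""
    counts = Counter(responses)
    return 1.0 - counts.most_common(1)[0][1] / len(responses)


def classify_command(
    command: str,
    scene: str,
    context_set: Sequence[str],
    llm: Callable[[str, str, str], str],          # (context, scene, command) -> response
    feasibility_llm: Callable[[str, str], bool],  # (scene, command) -> feasible?
    k: int = 3,
    epsilon: float = 0.3,                         # hypothetical threshold
    rng: Optional[random.Random] = None,
) -> Tuple[str, float]:
    """Return (label, sigma) with label in {"clear", "ambiguous", "infeasible"}."""
    rng = rng or random.Random(0)
    # Context sampling: query the model once per sampled context.
    sampled = rng.sample(list(context_set), k=min(k, len(context_set)))
    responses = [llm(context, scene, command) for context in sampled]
    sigma = predictive_uncertainty(responses)
    if sigma <= epsilon:
        # Responses agree across contexts, so treat the command as certain.
        return "clear", sigma
    # Uncertain command: a scene-aware feasibility check splits the two cases.
    label = "ambiguous" if feasibility_llm(scene, command) else "infeasible"
    return label, sigma


if __name__ == "__main__":
    scene = "objects: coke, water, sponge"
    contexts = ["few-shot prompt 1", "few-shot prompt 2", "few-shot prompt 3"]

    def stub_llm(context: str, scene: str, command: str) -> str:
        # Toy planner: consistent only when the command names the object.
        if "coke" in command:
            return "robot.pick(coke)"
        return "robot.pick(coke)" if context.endswith("1") else "robot.pick(water)"

    def stub_feasibility(scene: str, command: str) -> bool:
        # Toy check: only drink requests are achievable in this scene.
        return "drink" in command

    for cmd in ["give me the coke", "bring me something to drink", "microwave the sponge"]:
        print(cmd, "->", classify_command(cmd, scene, contexts, stub_llm, stub_feasibility))
```

In a real system, a command labeled ambiguous would then trigger a clarifying question to the user, which this sketch leaves out.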

Paper Structure

This paper contains 46 sections, 1 equation, 11 figures, 14 tables.

Figures (11)

  • Figure 1: Proposed Method. Our method estimates uncertainty with LLMs via context sampling to distinguish between certain and uncertain commands. We then leverage situational awareness to classify uncertain commands as ambiguous or infeasible, followed by a disambiguation process for ambiguous commands. The numbers (1), (2), etc., denote the index of each context in the context set ($C$). $\sigma$ denotes predictive uncertainty, and $\epsilon$ is an uncertainty threshold.
  • Figure 2: Statistics and Examples of the Dataset. Cer. denotes certain, Inf. denotes infeasible, and Amb. denotes ambiguous.
  • Figure 3: Examples of explanations and questions generated by the proposed method. F, R, and Q denote Feasibility, Reasoning, and Question, respectively.
  • Figure 4: Real-world demonstrations. F, R, and Q denote Feasibility, Reasoning, and Question, respectively.
  • Figure 5: Failure Cases due to other modules
  • ...and 6 more figures