Emergence: Overcoming Privileged Information Bias in Asymmetric Embodied Agents via Active Querying

Shaun Baek, Sam Liu, Joseph Ukpong

TL;DR

This work investigates Privileged Information Bias in asymmetric embodied AI by embedding a dual-role Leader and Follower within AI2-THOR. It contrasts Push (open-loop) and Pull (closed-loop) interaction protocols, demonstrating that active querying substantially reduces grounding failures and closes part of the performance gap. The study quantifies a significant Success Gap between Leader perception and team success and shows that Pull-based uncertainty reduction is essential for safer human-AI and robot-robot collaboration. The findings highlight the need for epistemic mechanisms in embodied systems and suggest practical directions, such as incentivizing questioning and sharing visual context, to improve real-world cooperative autonomy.

Abstract

Large Language Models (LLMs) act as powerful reasoning engines but struggle with "symbol grounding" in embodied environments, particularly when information is asymmetrically distributed. We investigate the Privileged Information Bias (or "Curse of Knowledge"), where a knowledgeable "Leader" agent fails to guide a sensor-limited "Follower" due to a lack of Theory of Mind. To quantify this phenomenon, we propose a novel Asymmetric Assistive Reasoning framework within AI2-THOR. Our experiments reveal a significant "Success Gap": while the Leader successfully perceives the target in 35.0% of episodes, the collaborative team succeeds only 17.0% of the time, implying that roughly half of feasible plans fail solely due to communicative grounding errors. We demonstrate that a "Pull-based" protocol (active querying) is significantly more robust than standard "Push-based" instruction, with successful episodes featuring 2x the frequency of clarification requests. This research isolates the mechanism of active uncertainty reduction as a prerequisite for safe human-AI and robot-robot collaboration.
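
The arithmetic behind the "Success Gap" can be checked with a short calculation. The sketch below is illustrative only: the function name `success_gap` and its return keys are hypothetical, and it assumes that every team success occurs in an episode where the Leader perceived the target, so that the team success rate can be divided by the Leader perception rate.

```python
def success_gap(leader_perception_rate: float, team_success_rate: float) -> dict:
    """Compare how often the Leader perceives the target with how often the team succeeds.

    Assumption (not stated in the source): team successes are a subset of the
    episodes in which the Leader perceived the target.
    """
    if not 0.0 < leader_perception_rate <= 1.0:
        raise ValueError("leader_perception_rate must be in (0, 1]")
    if not 0.0 <= team_success_rate <= leader_perception_rate:
        raise ValueError("team_success_rate must be in [0, leader_perception_rate]")

    # Absolute gap, in fractions of all episodes.
    absolute_gap = leader_perception_rate - team_success_rate
    # Share of feasible (Leader-perceived) episodes that the team converts into success.
    conversion_rate = team_success_rate / leader_perception_rate
    return {
        "absolute_gap": absolute_gap,
        "conversion_rate": conversion_rate,
        "grounding_failure_rate": 1.0 - conversion_rate,
    }


# Rates reported in the abstract: 35.0% Leader perception, 17.0% team success.
gap = success_gap(0.35, 0.17)
print(f"absolute gap: {gap['absolute_gap']:.1%}")
print(f"feasible plans lost to grounding: {gap['grounding_failure_rate']:.1%}")
```

Under that assumption, the gap is 18 percentage points and about 51% of feasible plans are lost, which matches the abstract's statement that roughly half fail.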

Paper Structure

This paper contains 48 sections, 3 equations, 5 figures, 6 tables, 1 algorithm.

Figures (5)

  • Figure 1: The Leader-Follower Architecture. The Leader utilizes global perception ($S_L$) to "push" instructions, while the Follower utilizes local verification ($S_F$) to "pull" clarification. The shared LLM core iteratively generates the internal monologue for both roles.
  • Figure 2: Visualizing the Information Asymmetry (Task 23). The Leader (left) perceives the full scene geometry (12 objects visible), identifying the target "Apple" relative to the room layout. The Follower (right) operates under a 2.0m visibility handicap (0 objects visible), seeing only a blank wall. This discrepancy creates the "Curse of Knowledge," where the Leader must infer the Follower's blindness to provide effective guidance.
  • Figure 3: The baseline agent's performance (16.0% success rate, SR) illustrates the "Zero-Shot Ceiling," where success is largely determined by favorable spawn locations rather than systematic search.
  • Figure 4: The handicapped agent's performance drop (to 11.0% SR) quantifies the "Sensory Tax," confirming that semantic reasoning cannot compensate for a lack of distal visual cues.
  • Figure 5: The impact of active querying: Successful episodes (Green) feature 2x the frequency of "Pull" requests compared to failed episodes (Red), validating the "Push-Pull" hypothesis.
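
The information asymmetry in Figure 2 and the Push/Pull contrast in Figure 1 can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not the authors' implementation and not the AI2-THOR API: `SceneObject`, `visible_objects`, and `follower_turn` are hypothetical names, positions are 2D floor coordinates, and visibility is modeled as a plain distance cutoff with no occlusion.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SceneObject:
    """Hypothetical stand-in for a scene object with a 2D floor position in meters."""
    name: str
    x: float
    z: float


def visible_objects(agent_x, agent_z, objects, max_distance=None):
    """Return the objects an agent can see.

    max_distance=None models the Leader's global perception; a finite value
    models the Follower's visibility handicap (2.0 m in Figure 2).
    """
    if max_distance is None:
        return list(objects)
    return [
        obj for obj in objects
        if math.hypot(obj.x - agent_x, obj.z - agent_z) <= max_distance
    ]


def follower_turn(target, local_view, protocol="pull"):
    """Decide the Follower's next message for a Leader instruction about `target`.

    push: open-loop, the Follower acts on the instruction without verifying it.
    pull: closed-loop, the Follower checks its local view and asks when the
          target is not visible.
    """
    if protocol not in ("push", "pull"):
        raise ValueError("protocol must be 'push' or 'pull'")
    target_visible = any(obj.name == target for obj in local_view)
    if protocol == "push" or target_visible:
        return f"ACT: move toward {target}"
    return f"ASK: I cannot see {target}; where is it relative to me?"


# Toy scene: the Apple is 5 m from the agents' shared position, the Mug about 0.7 m.
scene = [SceneObject("Apple", 3.0, 4.0), SceneObject("Mug", 0.5, 0.5)]
leader_view = visible_objects(0.0, 0.0, scene)
follower_view = visible_objects(0.0, 0.0, scene, max_distance=2.0)

print(follower_turn("Apple", follower_view, protocol="push"))
print(follower_turn("Apple", follower_view, protocol="pull"))
```

In this toy setup the Push Follower acts on an instruction it cannot ground, while the Pull Follower surfaces its blindness as a clarification request, which is the behavior Figure 5 associates with successful episodes.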