Active Confusion Expression in Large Language Models: Leveraging World Models toward Better Social Reasoning
Jialu Du, Guiyang Hou, Yihui Fu, Chen Wu, Wenqi Zhang, Yongliang Shen, Weiming Lu
TL;DR
This work addresses the challenge of social reasoning in large language models, where cognitive confusion and conflation of objective events with agents’ beliefs hinder reliable judgment. It proposes an adaptive world-model-enhanced reasoning framework with a cognitive-intervention trigger: the framework constructs a dynamic textual world model and intervenes when confusion signals appear, guiding the model back to coherent reasoning. The approach yields measurable improvements in accuracy and substantial token-efficiency gains across the ToMi, Hi-ToM, and ExploreToM benchmarks, with larger models benefiting more from the interventions. By explicitly separating external world states from internal beliefs through stateful world models, the method offers a practical path to deploying LLMs in real-world social reasoning tasks.
Abstract
While large language models (LLMs) excel in mathematical and code reasoning, we observe that they struggle with social reasoning tasks, exhibiting cognitive confusion, logical inconsistencies, and conflation between objective world states and subjective belief states. Through detailed analysis of DeepSeek-R1's reasoning trajectories, we find that LLMs frequently encounter reasoning impasses and tend to output contradictory terms like "tricky" and "confused" when processing scenarios with multiple participants and timelines, leading to erroneous reasoning or infinite loops. The core issue is their inability to disentangle objective reality from agents' subjective beliefs. To address this, we propose an adaptive world model-enhanced reasoning mechanism that constructs a dynamic textual world model to track entity states and temporal sequences. It dynamically monitors reasoning trajectories for confusion indicators and promptly intervenes by providing clear world state descriptions, helping models navigate through cognitive dilemmas. The mechanism mimics how humans use implicit world models to distinguish between external events and internal beliefs. Evaluations on three social benchmarks demonstrate significant improvements in accuracy (e.g., +10% in Hi-ToM) while reducing computational costs (up to 33.8% token reduction), offering a simple yet effective solution for deploying LLMs in social contexts.
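To make the mechanism described in the abstract concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a toy textual world model that records only objective events (who entered or left a room, where an object was moved) and a keyword-based confusion check. The class TextualWorldModel, the functions shows_confusion and maybe_intervene, and the story in the usage lines are all hypothetical names introduced here for illustration; the marker list contains only the two terms the abstract mentions.

```python
from dataclasses import dataclass, field

# Assumption: the abstract names only "tricky" and "confused" as confusion terms.
CONFUSION_MARKERS = ("tricky", "confused")


@dataclass
class TextualWorldModel:
    """Hypothetical tracker of objective state only: locations and event order."""
    agent_room: dict = field(default_factory=dict)
    object_place: dict = field(default_factory=dict)
    timeline: list = field(default_factory=list)

    def enter(self, agent, room):
        self.agent_room[agent] = room
        self.timeline.append(f"{agent} entered the {room}")

    def leave(self, agent):
        room = self.agent_room.pop(agent)
        self.timeline.append(f"{agent} left the {room}")

    def move(self, agent, obj, place):
        self.object_place[obj] = place
        self.timeline.append(f"{agent} moved the {obj} to the {place}")

    def describe(self):
        """Render the ordered events and the current object locations as text."""
        steps = "; ".join(f"step {i}: {e}" for i, e in enumerate(self.timeline, 1))
        places = "; ".join(
            f"the {obj} is in the {place}" for obj, place in self.object_place.items()
        )
        return f"Events so far: {steps}. Current world state: {places}."


def shows_confusion(reasoning_chunk):
    """Return True if the chunk contains any confusion marker (case-insensitive)."""
    text = reasoning_chunk.lower()
    return any(marker in text for marker in CONFUSION_MARKERS)


def maybe_intervene(reasoning_chunk, world):
    """Return a world-state reminder if the chunk signals confusion, else None."""
    if shows_confusion(reasoning_chunk):
        return world.describe()
    return None


# Usage on a made-up false-belief story.
world = TextualWorldModel()
world.enter("Sally", "kitchen")
world.enter("Anne", "kitchen")
world.move("Sally", "ball", "basket")
world.leave("Sally")
world.move("Anne", "ball", "box")
hint = maybe_intervene("Hmm, this is tricky: does Sally know about the box?", world)
```

In this sketch the reminder states only what objectively happened and in what order, leaving the belief inference (what Sally thinks, given that she left before the ball was moved) to the reasoning model. A real system would presumably build the world model from the story text and inject the reminder into the model's ongoing reasoning; both steps are outside the scope of this example.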
