SituatedThinker: Grounding LLM Reasoning with Real-World through Situated Thinking

Junnan Liu, Linhao Luo, Thuy-Trang Vu, Gholamreza Haffari

TL;DR

SituatedThinker addresses a gap in LLM reasoning: it remains confined to internal parameters and often fails to incorporate real-time external information. The framework introduces situated thinking, which grounds reasoning in external contexts via standardized interfaces, and uses reinforcement learning to incentivize deliberate reasoning with feedback from the external world. Empirical results show substantial improvements on multi-hop QA and mathematical reasoning benchmarks, along with strong generalization to unseen tasks and new interfaces without additional training. The work highlights the potential of real-world grounding for robust, adaptable LLM reasoning and points to future directions in multimodal input, multilingual support, and open-ended decision-making tasks.

Abstract

Recent advances in large language models (LLMs) demonstrate their impressive reasoning capabilities. However, reasoning confined to the internal parametric space limits LLMs' access to real-time information and their understanding of the physical world. To overcome this constraint, we introduce SituatedThinker, a novel framework that enables LLMs to ground their reasoning in real-world contexts through situated thinking, which adaptively combines internal knowledge and external information via predefined interfaces. Using reinforcement learning, SituatedThinker incentivizes deliberate reasoning with the real world to acquire information and feedback, allowing LLMs to surpass their knowledge boundaries and enhance their reasoning. Experimental results demonstrate significant performance improvements on multi-hop question-answering and mathematical reasoning benchmarks. Furthermore, SituatedThinker performs strongly on unseen tasks, such as KBQA, TableQA, and text-based games, showcasing a generalizable capability for real-world grounded reasoning. Our code is available at https://github.com/jnanliu/SituatedThinker.

Paper Structure

This paper contains 63 sections, 12 equations, 2 figures, 4 tables.

Figures (2)

  • Figure 1: The framework of SituatedThinker, where LLMs take questions and predefined interfaces as inputs. They then conduct situated thinking, adaptively combining basic reasoning with internal action and external reasoning while performing situated actions through the interfaces. The final conclusion is obtained through a deliberate reasoning process and verified to optimize the model with reinforcement learning. The external world can be represented as knowledge graphs, databases, or a physical environment (such as a room for robot control).
  • Figure 2: Illustration of the training dynamics of SituatedThinker. The $x$-axis indicates the training steps and the $y$-axis shows the observed metrics.
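
The loop described in the Figure 1 caption (reason, call an interface, read the result, conclude, then verify the conclusion for a reward) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration rather than the authors' implementation: the tag syntax (`<retrieve>…</retrieve>`, `<result>…</result>`, `<answer>…</answer>`), the names `situated_rollout`, `toy_policy`, and `toy_interfaces`, and the exact-match reward are all hypothetical, and a scripted function stands in for the LLM.

```python
import re
from typing import Callable, Dict

# Hypothetical interface type: takes a query string, returns text from the "external world".
Interface = Callable[[str], str]

# Assumed tag format for interface calls and final answers (not taken from the paper).
CALL_PATTERN = re.compile(r"<(\w+)>(.*?)</\1>", re.DOTALL)


def situated_rollout(question: str,
                     policy: Callable[[str], str],
                     interfaces: Dict[str, Interface],
                     max_calls: int = 4) -> str:
    """Alternate between model reasoning steps and interface calls.

    Returns the text inside <answer>...</answer>, or "" if no answer is
    produced within the call budget.
    """
    context = question
    for _ in range(max_calls + 1):
        step = policy(context)            # one reasoning step from the model
        context += "\n" + step
        match = CALL_PATTERN.search(step)
        if match is None:                 # no call and no answer: stop
            break
        name, query = match.group(1), match.group(2).strip()
        if name == "answer":
            return query
        if name in interfaces:
            result = interfaces[name](query)
        else:
            result = f"unknown interface: {name}"
        # Feed the external result back into the reasoning context.
        context += f"\n<result>{result}</result>"
    return ""


def exact_match_reward(prediction: str, gold: str) -> float:
    """Assumed verifiable reward: 1.0 on a case-insensitive exact match."""
    return 1.0 if prediction.strip().lower() == gold.strip().lower() else 0.0


# Scripted stand-in for an LLM policy: query first, then answer.
def toy_policy(context: str) -> str:
    if "<result>" not in context:
        return "I need the capital. <retrieve>capital of France</retrieve>"
    return "The result says Paris. <answer>Paris</answer>"


# Hypothetical interface backed by a canned string instead of a real retriever.
toy_interfaces: Dict[str, Interface] = {
    "retrieve": lambda q: "Paris is the capital of France.",
}

prediction = situated_rollout("What is the capital of France?", toy_policy, toy_interfaces)
reward = exact_match_reward(prediction, "paris")
```

In a reinforcement learning setup, a scalar such as `reward` would be the signal used to update the policy; the optimization step itself is omitted here because this section does not specify it.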