Grounding Agent Reasoning in Image Schemas: A Neurosymbolic Approach to Embodied Cognition

François Olivier, Zied Bouraoui

TL;DR

This work addresses the lack of grounded, embodied concepts in AI language and reasoning by proposing a neurosymbolic framework built on image schemas. It formalizes image schemas in a non-monotonic first-order logic with evaluable functions and temporal operators, enabling qualitative spatio-temporal and force-dynamic reasoning. Neural components parse natural language into these formal representations, and symbolic solvers (e.g., Clingo) perform embodied reasoning and analogical mapping over them. By grounding language in bodily experience and spatial primitives, the framework aims to improve interpretability, human-agent interaction, and cross-domain reasoning across spatial, temporal, and dynamic relations.

Abstract

Despite advances in embodied AI, agent reasoning systems still struggle to capture the fundamental conceptual structures that humans naturally use to understand and interact with their environment. To address this, we propose a novel framework that bridges embodied cognition theory and agent systems by leveraging a formal characterization of image schemas, defined as recurring patterns of sensorimotor experience that structure human cognition. By customizing LLMs to translate natural language descriptions into formal representations based on these sensorimotor patterns, we aim to create a neurosymbolic system that grounds the agent's understanding in fundamental conceptual structures. We argue that such an approach enhances both efficiency and interpretability while enabling more intuitive human-agent interactions through shared embodied understanding.

Paper Structure

This paper contains 7 sections, 1 figure, and 1 table.

Figures (1)

  • Figure 1: In the DSR framework, the parameters of objects are used to define spatial relations between these objects.