SENT Map -- Semantically Enhanced Topological Maps with Foundation Models
Raj Surya Rajendran Kathirvel, Zach A Chavis, Stephen J. Guy, Karthik Desingh
TL;DR
SENT-Map addresses the challenge of making foundation-model planning reliable for autonomous indoor navigation by grounding it in a topological map. It introduces a JSON-based SENT-Map representation and a two-stage workflow: first, operator-guided mapping with a Vision-FM creates a human-editable map ${\mathcal{M}}=G(V,E)$ with a semantic subset $V_{SE}$; second, a Planning-FM converts the map and a natural-language query into a grounded task plan constrained by the robot's skills. The key contributions are the SENT-Map representation, a framework for human-guided map construction, and an end-to-end planning approach that remains robust even for small locally-deployable FMs, demonstrated on tasks that require semantic reasoning and reasoning about object ownership. This work enables reliable planning in open-world indoor environments and provides a transparent, editable semantic representation that humans can verify and refine.
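To make the representation concrete, the following is a minimal sketch of how a map ${\mathcal{M}}=G(V,E)$ with a semantic subset $V_{SE}$ might be held as JSON-serializable data. The schema is an assumption made for illustration: the field names (`nodes`, `edges`, `semantics`, `objects`, `owner`), the node ids, and the helper functions are hypothetical and are not taken from the paper.

```python
import json

# Hypothetical schema: all field names and node ids below are illustrative
# assumptions, not the format used by the SENT-Map authors.
sent_map = {
    "nodes": {  # V: every place the robot can be grounded to
        "hallway_1": {"position": [0.0, 0.0]},
        "kitchen_door": {"position": [4.0, 0.0]},
        "kitchen_counter": {
            "position": [6.0, 1.5],
            # Nodes carrying a "semantics" entry form the subset V_SE.
            "semantics": {
                "description": "counter with a coffee machine",
                "objects": [{"name": "red mug", "owner": "Alice"}],
            },
        },
    },
    "edges": [  # E: traversable connections between nodes
        ["hallway_1", "kitchen_door"],
        ["kitchen_door", "kitchen_counter"],
    ],
}


def semantic_nodes(m):
    """Return V_SE: the sorted ids of nodes that carry semantic annotations."""
    return sorted(n for n, attrs in m["nodes"].items() if "semantics" in attrs)


def validate_edges(m):
    """Check that every edge in E connects two nodes that exist in V."""
    return all(u in m["nodes"] and v in m["nodes"] for u, v in m["edges"])


# The same text can be read by an FM, edited by a person, and loaded back.
map_text = json.dumps(sent_map, indent=2)
restored = json.loads(map_text)
```

Because the map round-trips through plain text, an operator could correct an object's owner or description in any editor and reload the file without touching robot code.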
Abstract
We introduce SENT-Map, a semantically enhanced topological map for representing indoor environments, designed to support autonomous navigation and manipulation by leveraging advancements in foundation models (FMs). By representing the environment in a JSON text format, we enable semantic information to be added and edited in a format that both humans and FMs understand, while grounding the robot to existing nodes during planning to avoid infeasible states during deployment. Our proposed framework employs a two-stage approach: first mapping the environment alongside an operator with a Vision-FM, then using the SENT-Map representation alongside a natural-language query within an FM for planning. Our experimental results show that semantic enhancement enables even small locally-deployable FMs to successfully plan over indoor environments.
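One way to picture grounding a plan to existing nodes is a check that runs over an FM-produced plan before execution. The sketch below is an assumption-laden illustration, not the authors' method: the toy map, the skill names, the plan format (a list of `{"skill", "node"}` steps), and the functions `reachable` and `check_plan` are all hypothetical.

```python
from collections import deque

# Hypothetical toy map and skill set, invented for this example.
NODES = {"dock", "hallway", "office_desk", "kitchen_counter"}
EDGES = [("dock", "hallway"), ("hallway", "office_desk"),
         ("hallway", "kitchen_counter")]
SKILLS = {"navigate", "pick", "place"}


def reachable(start, goal):
    """Breadth-first search over the undirected edges of the map."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for u, v in EDGES:
            nxt = v if u == node else u if v == node else None
            if nxt is not None and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def check_plan(plan, start):
    """Return a list of grounding problems; an empty list means the plan
    only uses known skills, known nodes, and reachable destinations."""
    problems, current = [], start
    for i, step in enumerate(plan):
        skill, node = step["skill"], step["node"]
        if skill not in SKILLS:
            problems.append(f"step {i}: unknown skill '{skill}'")
            continue
        if node not in NODES:
            problems.append(f"step {i}: unknown node '{node}'")
            continue
        if skill == "navigate":
            if reachable(current, node):
                current = node
            else:
                problems.append(f"step {i}: '{node}' is not reachable")
        elif node != current:
            problems.append(f"step {i}: robot is at '{current}', not '{node}'")
    return problems


good_plan = [
    {"skill": "navigate", "node": "kitchen_counter"},
    {"skill": "pick", "node": "kitchen_counter"},
    {"skill": "navigate", "node": "office_desk"},
    {"skill": "place", "node": "office_desk"},
]
bad_plan = [{"skill": "navigate", "node": "garage"}]
```

Here `check_plan(good_plan, "dock")` returns an empty list, while `check_plan(bad_plan, "dock")` reports the node that does not exist in the map, which is the kind of infeasible state that restricting plans to existing nodes is meant to rule out.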
