ATLASv2: LLM-Guided Adaptive Landmark Acquisition and Navigation on the Edge

Mikolaj Walczak; Uttej Kallakuri; Tinoosh Mohsenin

TL;DR

ATLASv2 targets complex, multi-task autonomous navigation and manipulation running entirely on resource-constrained edge hardware. It combines a compact fine-tuned TinyLLaMA LLM with real-time object detection (YOLOv5n with TensorRT) and a ROS-based path planner on the Jetson Nano, and it dynamically expands a knowledge base (KB) of landmarks from detected objects. Key contributions are a fully onboard architecture, dynamic KB building through environmental perception, and a low-latency, resource-conscious scheduling strategy, demonstrated in real-world home-like and office-like settings, with the onboard LLM benchmarked against a cloud LLM baseline. The results show that the onboard system can decompose high-level natural language tasks into low-level actions and execute them with competitive fidelity while preserving privacy and independence from network access, albeit with higher latency and memory pressure than cloud-based solutions. The work advances practical edge-enabled embodied AI by moving hierarchical navigation and manipulation from simulation to real-world deployment.

Abstract

Autonomous systems deployed on edge devices face significant challenges, including resource constraints, real-time processing demands, and the need to adapt to dynamic environments. This work introduces ATLASv2, a novel system that integrates a fine-tuned TinyLLM, real-time object detection, and efficient path planning to enable hierarchical, multi-task navigation and manipulation entirely on an edge device, the Jetson Nano. ATLASv2 dynamically expands its set of navigable landmarks by detecting and localizing objects in the environment and saving them to its internal knowledge base for use in future task execution. We evaluate ATLASv2 in real-world environments, including handcrafted home and office settings constructed with diverse objects and landmarks. Results show that ATLASv2 effectively interprets natural language instructions, decomposes them into low-level actions, and executes tasks with high success rates. By leveraging generative AI in a fully on-board framework, ATLASv2 achieves optimized resource utilization with minimal prompting latency and power consumption, bridging the gap between simulated environments and real-world applications.
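
The landmark acquisition step described in the abstract (detect an object, localize it, save it for later tasks) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Landmark` and `LandmarkKB` names, the merge radius, the confidence threshold, and the 2D map coordinates are all assumptions made for illustration only.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Landmark:
    label: str         # object class reported by the detector
    x: float           # assumed map-frame position in metres
    y: float
    confidence: float  # detector confidence in [0, 1]


class LandmarkKB:
    """Hypothetical in-memory knowledge base of navigable landmarks."""

    def __init__(self, merge_radius_m=0.5, min_confidence=0.5):
        # Both thresholds are illustrative values, not taken from the paper.
        self.merge_radius_m = merge_radius_m
        self.min_confidence = min_confidence
        self.landmarks = []

    def add_detection(self, label, x, y, confidence):
        """Store a detection; return True only if a new landmark was created."""
        if confidence < self.min_confidence:
            return False  # ignore weak detections
        for lm in self.landmarks:
            same_place = math.hypot(lm.x - x, lm.y - y) <= self.merge_radius_m
            if lm.label == label and same_place:
                # Treat as a re-observation; keep the more confident estimate.
                if confidence > lm.confidence:
                    lm.x, lm.y, lm.confidence = x, y, confidence
                return False
        self.landmarks.append(Landmark(label, x, y, confidence))
        return True

    def lookup(self, label) -> Optional[Landmark]:
        """Return the most confident landmark with this label, if any."""
        matches = [lm for lm in self.landmarks if lm.label == label]
        return max(matches, key=lambda lm: lm.confidence) if matches else None


kb = LandmarkKB()
kb.add_detection("chair", 1.0, 2.0, 0.8)  # new landmark
kb.add_detection("chair", 1.2, 2.1, 0.9)  # merged with the first chair
kb.add_detection("cup", 3.0, 0.5, 0.3)    # rejected, below the threshold
goal = kb.lookup("chair")                 # candidate goal for a path planner
```

In a setup like this, a planner would be handed `goal.x` and `goal.y` as the navigation target once a language instruction is resolved to the label "chair". Merging nearby detections of the same class is one simple way to keep repeated observations of a single object from growing the knowledge base.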
