SCoTT: Strategic Chain-of-Thought Tasking for Wireless-Aware Robot Navigation in Digital Twins

Aladin Djuhera, Amin Seffo, Vlad C. Andrei, Holger Boche, Walid Saad

TL;DR

SCoTT addresses robot trajectory planning under wireless performance constraints with Strategic Chain-of-Thought Tasking, a framework that uses multi-modal vision-language models (VLMs) and retrieval-augmented generation to process wireless heatmaps and path gains from a digital twin. It decomposes the planning task into strategy-guided subtasks, which grounds the model's reasoning and helps prevent hallucinations, and its output can seed DP-WA*, an optimal dynamic-programming solver, to accelerate the search. Empirically, SCoTT achieves path gains within 2% of the optimal DP-WA* while producing shorter trajectories, and reduces DP-WA* execution time by up to 62% when used as a seed. The approach is validated in ROS/Gazebo simulations and is compatible with both large and compact VLMs, including models suited to on-device deployment. The paper also discusses practical data pipelines and deployment considerations for 6G-enabled digital twins.

Abstract

Path planning under wireless performance constraints is a complex challenge in robot navigation. However, naively incorporating such constraints into classical planning algorithms often incurs prohibitive search costs. In this paper, we propose SCoTT, a wireless-aware path planning framework that leverages vision-language models (VLMs) to co-optimize average path gains and trajectory length using wireless heatmap images and ray-tracing data from a digital twin (DT). At the core of our framework is Strategic Chain-of-Thought Tasking (SCoTT), a novel prompting paradigm that decomposes the exhaustive search problem into structured subtasks, each solved via chain-of-thought prompting. To establish strong baselines, we compare classical A* and wireless-aware extensions of it, and derive DP-WA*, an optimal, iterative dynamic programming algorithm that incorporates all path gains and distance metrics from the DT, but at significant computational cost. In extensive experiments, we show that SCoTT achieves path gains within 2% of DP-WA* while consistently generating shorter trajectories. Moreover, SCoTT's intermediate outputs can be used to accelerate DP-WA* by reducing its search space, saving up to 62% in execution time. We validate our framework using four VLMs, demonstrating effectiveness across both large and small models, thus making it applicable to a wide range of compact models at low inference cost. We also show the practical viability of our approach by deploying SCoTT as a ROS node within Gazebo simulations. Finally, we discuss data acquisition pipelines, compute requirements, and deployment considerations for VLMs in 6G-enabled DTs, underscoring the potential of natural language interfaces for wireless-aware navigation in real-world applications.
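
To make the trade-off between path gain and trajectory length concrete, the following is a minimal sketch of a wireless-aware A* variant on a grid. It is not the paper's N-WA* or DP-WA*: the blended step cost, the `weight` parameter, the normalized `gain` grid, and all function names are assumptions chosen for illustration only.

```python
import heapq


def wireless_aware_astar(gain, start, goal, weight=0.5):
    """A* on a 4-connected grid (hypothetical cost model, not the paper's).

    gain[r][c] is an assumed normalized path gain in [0, 1].
    Entering a cell costs (1 - weight) for distance plus
    weight * (1 - gain) as a penalty for poor coverage.
    """
    rows, cols = len(gain), len(gain[0])

    def heuristic(cell):
        # Manhattan distance scaled by the cheapest possible step (1 - weight),
        # so the estimate never exceeds the true remaining cost.
        return (1.0 - weight) * (abs(cell[0] - goal[0]) + abs(cell[1] - goal[1]))

    best = {start: 0.0}
    parent = {start: None}
    frontier = [(heuristic(start), start)]
    while frontier:
        _, cell = heapq.heappop(frontier)
        if cell == goal:
            # Walk the parent links back to the start.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if not (0 <= nxt[0] < rows and 0 <= nxt[1] < cols):
                continue
            step = (1.0 - weight) + weight * (1.0 - gain[nxt[0]][nxt[1]])
            cost = best[cell] + step
            if cost < best.get(nxt, float("inf")):
                best[nxt] = cost
                parent[nxt] = cell
                heapq.heappush(frontier, (cost + heuristic(nxt), nxt))
    return None  # goal unreachable


def average_gain(path, gain):
    """Mean gain over the cells visited by a path."""
    return sum(gain[r][c] for r, c in path) / len(path)


# Toy heatmap (made-up values): weak coverage along the top row,
# strong coverage along the bottom row.
gain = [
    [0.5, 0.0, 0.0, 0.0, 0.5],
    [1.0, 1.0, 1.0, 1.0, 1.0],
]
shortest = wireless_aware_astar(gain, (0, 0), (0, 4), weight=0.0)
aware = wireless_aware_astar(gain, (0, 0), (0, 4), weight=0.9)
print(len(shortest), average_gain(shortest, gain))
print(len(aware), average_gain(aware, gain))
```

With `weight=0.0` the search reduces to a plain shortest-path A*, while a larger `weight` accepts a detour through the well-covered row. Folding both objectives into one scalar cost is only one possible design; the abstract indicates that DP-WA* instead incorporates all path gains and distance metrics from the DT, at a higher computational cost.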

Paper Structure

This paper contains 16 sections, 8 equations, 7 figures, 2 tables, and 1 algorithm.

Figures (7)

  • Figure 1: Wireless-aware path planning objective: the green path is longer but offers better wireless coverage than the shorter red path.
  • Figure 2: SCoTT prompt template that divides the complex wireless-aware path planning problem into three subtasks, each solved using strategic CoT prompting.
  • Figure 3: Results for path planning with $G = 0.7$: SCoTT immediately moves toward the high-gain blue area and overlaps with the optimal DP-WA* therein.
  • Figure 4: Results for path planning with $G = 0.4$: All wireless-aware approaches, including N-WA*, avoid the shortest path and instead take a detour.
  • Figure 5: Results for path planning with $G = 0.6$: N-WA* biases toward the shortest path while SCoTT and DP flavors effectively optimize for $G$.
  • ...and 2 more figures