ManeuverGPT Agentic Control for Safe Autonomous Stunt Maneuvers
Shawn Azdam, Pranav Doma, Aliasghar Moj Arab
TL;DR
The paper tackles safe execution of high-dynamic evasive maneuvers in autonomous driving by enabling J-turn-like reorientation via a prompt-based, LLM-driven control framework called ManeuverGPT. It deploys a three-agent architecture (Query Enricher, Driver, Validator) in a closed loop within the CARLA simulator to generate and validate maneuver parameters without modifying model weights, leveraging a cost function that balances heading accuracy, safety, and smoothness. Key findings show that multi-agent prompting improves success rates and precision across vehicle types, with sedan dynamics generally more forgiving than sports coupes, and that iterative prompt refinement yields convergence toward feasible, safe maneuvers. The work highlights the potential of LLM-driven planning for rapid prototyping of novel autonomous maneuvers while noting the need for formal safety guarantees and hybrid control strategies for real-world deployment.
Abstract
The next generation of active safety features in autonomous vehicles should be capable of safely executing evasive hazard-avoidance maneuvers akin to those performed by professional stunt drivers to achieve high-agility motion at the limits of vehicle handling. This paper presents a novel framework, ManeuverGPT, for generating and executing high-dynamic stunt maneuvers in autonomous vehicles using large language model (LLM)-based agents as controllers. We target aggressive maneuvers, such as J-turns, within the CARLA simulation environment and demonstrate an iterative, prompt-based approach to refine vehicle control parameters, starting tabula rasa without retraining model weights. We propose an agentic architecture comprising three specialized agents: (1) a Query Enricher Agent for contextualizing user commands, (2) a Driver Agent for generating maneuver parameters, and (3) a Parameter Validator Agent that enforces physics-based and safety constraints. Experimental results demonstrate successful J-turn execution across multiple vehicle models through textual prompts that adapt to differing vehicle dynamics. We evaluate performance via established success criteria and discuss limitations regarding numeric precision and scenario complexity. Our findings underscore the potential of LLM-driven control for flexible, high-dynamic maneuvers, while highlighting the importance of hybrid approaches that combine language-based reasoning with algorithmic validation.
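As a rough illustration of the closed loop described above, the following minimal Python sketch wires three stand-in agents together. It is not the authors' implementation: the parameter fields, the clamping bounds, the 10-degree tolerance, and every function name are assumptions made for this example, the Driver Agent is a deterministic stub in place of an LLM call, and a toy formula replaces the CARLA rollout. The only domain fact relied on is that a J-turn targets a 180-degree heading change.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass
class ManeuverParams:
    # Hypothetical parameter set; the source does not list the actual fields.
    reverse_throttle: float   # assumed range 0..1
    steer: float              # assumed range -1..1
    steer_duration_s: float   # assumed range 0.1..3.0 seconds


def enrich_query(command: str, vehicle: str) -> str:
    """Stand-in for the Query Enricher Agent: adds context to the command."""
    return f"{command} (vehicle: {vehicle}; target heading change: 180 deg)"


def propose_params(prompt: str, feedback: list[str]) -> ManeuverParams:
    """Stand-in for the Driver Agent. A real system would query an LLM with
    the prompt and feedback; this stub lengthens the steering phase by
    0.25 s for each feedback entry received so far."""
    return ManeuverParams(0.8, 1.0, 0.5 + 0.25 * len(feedback))


def _clamp(x: float, lo: float, hi: float) -> float:
    return max(lo, min(hi, x))


def validate(p: ManeuverParams) -> ManeuverParams:
    """Stand-in for the Parameter Validator Agent: clamp to assumed bounds."""
    return ManeuverParams(
        _clamp(p.reverse_throttle, 0.0, 1.0),
        _clamp(p.steer, -1.0, 1.0),
        _clamp(p.steer_duration_s, 0.1, 3.0),
    )


def simulate(p: ManeuverParams) -> float:
    """Toy replacement for a simulator rollout: heading change in degrees."""
    return min(180.0, 120.0 * p.steer_duration_s * abs(p.steer))


def run_loop(command: str, vehicle: str, max_iters: int = 5,
             tol_deg: float = 10.0) -> tuple[ManeuverParams | None, int]:
    """Iterate propose -> validate -> simulate until the heading error is
    within tolerance, feeding each failure back to the Driver Agent."""
    prompt = enrich_query(command, vehicle)
    feedback: list[str] = []
    for i in range(max_iters):
        params = validate(propose_params(prompt, feedback))
        error = abs(180.0 - simulate(params))
        if error <= tol_deg:
            return params, i + 1
        feedback.append(f"iteration {i + 1}: heading error {error:.1f} deg")
    return None, max_iters


if __name__ == "__main__":
    result, iterations = run_loop("Perform a J-turn", "sedan")
    print(result, iterations)
```

The point of the sketch is the division of labor: the proposer never talks to the simulator directly, so every candidate passes through an algorithmic check first, and the only thing that changes between iterations is the textual feedback, not any model weights.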
