Diffusion-Reinforcement Learning Hierarchical Motion Planning in Multi-agent Adversarial Games
Zixuan Wu, Sean Ye, Manisha Natarajan, Matthew C. Gombolay
TL;DR
Problem: evasive motion planning in large, partially observable, multi-agent pursuit-evasion settings. Approach: a hierarchical diffusion-RL framework in which a diffusion-based global planner guides a low-level SAC evasion policy, combined with cost-map-based path selection. Contributions: improved detection and goal-reaching metrics, interpretability and flexibility via the cost map, efficiency gains, and generalizability, including a real-robot demonstration. Significance: enables robust, scalable evasive navigation in realistic adversarial environments.
Abstract
Reinforcement Learning (RL)-based motion planning has recently shown the potential to outperform traditional approaches in tasks ranging from autonomous navigation to robot manipulation. In this work, we focus on a motion planning task for an evasive target in a partially observable multi-agent adversarial pursuit-evasion game (PEG). Pursuit-evasion problems are relevant to various applications, such as search-and-rescue operations and surveillance robots, where robots must effectively plan their actions to gather intelligence or accomplish mission tasks while avoiding detection or capture. We propose a hierarchical architecture that integrates a high-level diffusion model, which plans global paths responsive to environment data, with a low-level RL policy that reasons about evasive versus global path-following behavior. Benchmark results across different domains and different levels of observability show that our approach outperforms baselines by 77.18% and 47.38% on detection and goal-reaching rate, respectively, which leads to a 51.4% increase in the performance score on average. Additionally, our method improves the interpretability, flexibility, and efficiency of the learned policy.
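To make the cost-map-based path selection mentioned in the TL;DR concrete, the following is a minimal sketch and not the authors' implementation. It assumes that the high-level planner returns several candidate global paths as arrays of (row, col) waypoints, that the cost map is a 2D grid in which higher values mark cells more exposed to detection, and that the path with the lowest summed cost is the one handed to the low-level policy. The names `path_cost` and `select_path`, the toy map, and the two candidate paths are all hypothetical.

```python
import numpy as np


def path_cost(path, cost_map):
    """Sum the cost-map values at the grid cells visited by a path.

    path: array of shape (T, 2) holding (row, col) waypoints (assumed format).
    cost_map: 2D array; higher values are assumed to mean higher detection risk.
    """
    # Round waypoints to the nearest cell and clamp them to the map bounds.
    rows = np.clip(np.rint(path[:, 0]).astype(int), 0, cost_map.shape[0] - 1)
    cols = np.clip(np.rint(path[:, 1]).astype(int), 0, cost_map.shape[1] - 1)
    return float(cost_map[rows, cols].sum())


def select_path(candidates, cost_map):
    """Return the index of the lowest-cost candidate and all candidate costs."""
    costs = [path_cost(p, cost_map) for p in candidates]
    return int(np.argmin(costs)), costs


# Toy 5x5 map with a risky strip in column 2 that spans rows 0 to 3.
cost_map = np.zeros((5, 5))
cost_map[0:4, 2] = 10.0

# Hypothetical candidates standing in for samples from a global planner.
direct = np.array([[0, 0], [0, 1], [0, 2], [0, 3], [0, 4]], dtype=float)
detour = np.array(
    [[0, 0], [2, 0], [4, 1], [4, 2], [4, 3], [2, 4], [0, 4]], dtype=float
)

best_index, costs = select_path([direct, detour], cost_map)
print(best_index, costs)
```

In this toy case the detour is selected because it crosses column 2 in row 4, outside the risky strip. Because the choice depends only on an explicit map, changing the map changes the selected path without retraining anything, which is one plausible reading of the interpretability and flexibility claims.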
