HAVEN: Hierarchical Adversary-aware Visibility-Enabled Navigation with Cover Utilization using Deep Transformer Q-Networks
Mihir Chauhan, Damon Conover, Aniket Bera
TL;DR
This work addresses safe navigation under partial observability and adversarial visibility. It introduces HAVEN, a hierarchical framework that pairs a high-level Deep Transformer Q-Network (DTQN) for subgoal selection with a low-level potential-field controller for reactive execution. Visibility-aware candidate generation and a compact 16-D feature encoding let the agent reason about occlusion, cover, and adversarial fields of view (FoVs), while short observation histories give it memory for decision making. HAVEN transfers from 2D simulation to 3D Unity-ROS environments without architectural changes by projecting perception onto a common feature schema. Empirical results show higher success rates, larger safety margins, and lower adversarial exposure than classical planners and RL baselines, and ablations confirm the value of temporal memory and visibility-aware design for navigation under uncertainty.
Abstract
Autonomous navigation in partially observable environments requires agents to reason beyond immediate sensor input, exploit occlusion, and ensure safety while progressing toward a goal. These challenges arise in many robotics domains, from urban driving and warehouse automation to defense and surveillance. Classical path planning approaches and memoryless reinforcement learning often fail under limited fields of view (FoVs) and occlusions, committing to unsafe or inefficient maneuvers. We propose a hierarchical navigation framework that integrates a Deep Transformer Q-Network (DTQN) as a high-level subgoal selector with a modular low-level controller for waypoint execution. The DTQN consumes short histories of task-aware features that encode odometry, goal direction, obstacle proximity, and visibility cues, and it outputs Q-values that rank candidate subgoals. Visibility-aware candidate generation introduces masking and exposure penalties, rewarding the use of cover and anticipatory safety. A low-level potential-field controller then tracks the selected subgoal, ensuring smooth short-horizon obstacle avoidance. We validate our approach in 2D simulation and extend it directly to a 3D Unity-ROS environment by projecting point-cloud perception into the same feature schema, enabling transfer without architectural changes. Results show consistent improvements over classical planners and RL baselines in success rate, safety margins, and time to goal, with ablations confirming the value of temporal memory and visibility-aware candidate design. These findings highlight a generalizable framework for safe navigation under uncertainty, with broad relevance across robotic platforms.
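To make the two-level structure concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the high level reduces to a masked argmax over per-candidate Q-values and the low level to a textbook attractive/repulsive potential field. The names `select_subgoal` and `potential_field_step`, the gains `k_att`, `k_rep`, `d0`, and `step`, and the choice to subtract an exposure estimate at selection time are all hypothetical; the Q-values stand in for DTQN outputs, and in the actual method the exposure penalty may instead enter through the reward.

```python
import math


def select_subgoal(q_values, feasible_mask, exposure, penalty=1.0):
    """Return the index of the best feasible candidate subgoal, or None.

    q_values:      per-candidate scores (stand-in for DTQN outputs).
    feasible_mask: False for candidates removed by masking.
    exposure:      per-candidate exposure estimate (hypothetical, in [0, 1]).
    penalty:       weight on exposure (assumed; applied at selection time
                   here purely for illustration).
    """
    best_idx, best_score = None, -math.inf
    for i, (q, ok, e) in enumerate(zip(q_values, feasible_mask, exposure)):
        if not ok:
            continue  # masked candidates are never selected
        score = q - penalty * e
        if score > best_score:
            best_idx, best_score = i, score
    return best_idx


def potential_field_step(pos, subgoal, obstacles,
                         k_att=1.0, k_rep=0.5, d0=1.0, step=0.1):
    """Take one fixed-length step along a basic potential-field force.

    Attraction pulls toward the subgoal; each obstacle closer than the
    influence radius d0 pushes the agent directly away from it.
    """
    fx = k_att * (subgoal[0] - pos[0])
    fy = k_att * (subgoal[1] - pos[1])
    for ox, oy in obstacles:
        dx, dy = pos[0] - ox, pos[1] - oy
        d = math.hypot(dx, dy)
        if 0.0 < d < d0:
            mag = k_rep * (1.0 / d - 1.0 / d0) / (d * d)
            fx += mag * dx / d
            fy += mag * dy / d
    norm = math.hypot(fx, fy)
    if norm == 0.0:
        return pos  # zero net force: stay in place
    return (pos[0] + step * fx / norm, pos[1] + step * fy / norm)


if __name__ == "__main__":
    candidates = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0)]
    idx = select_subgoal([1.0, 2.0, 3.0], [True, True, False], [0.0, 0.5, 0.0])
    print(idx, potential_field_step((0.0, 0.0), candidates[idx], [(0.2, 0.5)]))
```

In this sketch the highest raw Q-value belongs to a masked candidate, so the selector falls back to the best feasible one. Raising `penalty` shifts the choice toward less exposed candidates, which is one simple way to express a preference for cover.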
