WROOM: An Autonomous Driving Approach for Off-Road Navigation

Dvij Kalaria, Shreya Sharma, Sarthak Bhagat, Haoru Xue, John M. Dolan

TL;DR

The paper tackles robust off-road navigation for wheeled robots, where classical plan-then-track pipelines struggle on uncertain terrain. It proposes WROOM, an end-to-end RL framework trained with PPO in the OffTerSim simulator. The agent is warm-started by imitating a rule-based controller, its reward incorporates Control Barrier Functions (CBFs) to account for safety, and the resulting policy is distilled for real-world deployment. Key contributions are the OffTerSim simulator, a safety-aware learning pipeline, and sim-to-real transfer demonstrated on a 1/10-scale RC car, aided by domain randomization and distillation. Together, these pieces target smoother, safer navigation on challenging terrain.

Abstract

Off-road navigation is a challenging problem both at the planning level to get a smooth trajectory and at the control level to avoid flipping over, hitting obstacles, or getting stuck at a rough patch. There have been several recent works using classical approaches involving depth map prediction followed by smooth trajectory planning and using a controller to track it. We design an end-to-end reinforcement learning (RL) system for an autonomous vehicle in off-road environments using a custom-designed simulator in the Unity game engine. We warm-start the agent by imitating a rule-based controller and utilize Proximal Policy Optimization (PPO) to improve the policy based on a reward that incorporates Control Barrier Functions (CBF), facilitating the agent's ability to generalize effectively to real-world scenarios. The training involves agents concurrently undergoing domain-randomized trials in various environments. We also propose a novel simulation environment to replicate off-road driving scenarios and deploy our proposed approach on a real buggy RC car. Videos and additional results: https://sites.google.com/view/wroom-utd/home

Paper Structure (14 sections, 6 equations, 9 figures, 1 table)


Figures (9)

  • Figure 1: Overview of the proposed approach, WROOM: The end-to-end RL agent collects depth camera and IMU measurements from the environment and outputs steering, throttle, and braking commands. The reward function evaluates not only the agent's progress in the environment but also the smoothness and safety of its maneuvers, as reported by the control barrier function (CBF). Imitation learning is utilized to kick-start the agent, followed by Proximal Policy Optimization (PPO) to refine its policy, and finally, policy distillation is employed for real-world deployment.
  • Figure 2: Scandots (in purple) as privileged information to the expert controller.
  • Figure 3: Real-world deployment using an RC car.
  • Figure 4: Stills from the proposed simulation environment, OffTerSim.
  • Figure 5: Real-world deployment of our proposed approach, WROOM.
  • ...and 4 more figures
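
The Figure 1 caption describes a reward that scores progress, smoothness, and safety as reported by a CBF. The sketch below shows one way such a reward could be assembled; it is not the authors' implementation. It assumes a discrete-time CBF condition of the form h(x_{t+1}) >= (1 - alpha) * h(x_t), where h >= 0 marks the safe set. The barrier function h (for example, a roll-angle limit minus the current absolute roll), the weights, alpha, and every name in the code are hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class StepInfo:
    """Quantities for one simulator step (all fields are hypothetical)."""
    progress: float                          # distance advanced toward the goal
    action: Tuple[float, float, float]       # (steering, throttle, brake)
    prev_action: Tuple[float, float, float]  # action from the previous step
    h_now: float                             # barrier value h(x_t); h >= 0 is safe
    h_next: float                            # barrier value h(x_{t+1})


def cbf_violation(h_now: float, h_next: float, alpha: float = 0.5) -> float:
    """Amount by which h(x_{t+1}) >= (1 - alpha) * h(x_t) is violated.

    Returns 0.0 when the assumed discrete-time CBF condition holds.
    """
    return max(0.0, (1.0 - alpha) * h_now - h_next)


def shaped_reward(info: StepInfo,
                  w_progress: float = 1.0,
                  w_smooth: float = 0.1,
                  w_safe: float = 5.0,
                  alpha: float = 0.5) -> float:
    """Progress reward minus smoothness and CBF-violation penalties."""
    # Smoothness: penalize large changes between consecutive commands.
    smooth_cost = sum((a - b) ** 2
                      for a, b in zip(info.action, info.prev_action))
    # Safety: penalize only when the barrier value decays too quickly.
    safe_cost = cbf_violation(info.h_now, info.h_next, alpha)
    return (w_progress * info.progress
            - w_smooth * smooth_cost
            - w_safe * safe_cost)


if __name__ == "__main__":
    safe_step = StepInfo(0.2, (0.1, 0.5, 0.0), (0.0, 0.5, 0.0), 1.0, 0.8)
    risky_step = StepInfo(0.2, (0.1, 0.5, 0.0), (0.0, 0.5, 0.0), 1.0, 0.2)
    print(shaped_reward(safe_step), shaped_reward(risky_step))
```

With these placeholder weights, two steps that make the same progress receive different rewards when one of them lets the barrier value drop faster than the assumed condition allows, which is the kind of signal a policy-gradient method such as PPO can then optimize against.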