Sim-to-Real Transfer for Mobile Robots with Reinforcement Learning: from NVIDIA Isaac Sim to Gazebo and Real ROS 2 Robots

Sahar Salimpour, Jorge Peña-Queralta, Diego Paez-Granados, Jukka Heikkonen, Tomi Westerlund

TL;DR

This work tackles sim-to-real transfer for end-to-end RL-based local navigation, using NVIDIA Isaac Sim for training and Gazebo with ROS 2 for testing and real deployment. It details a full workflow, from robot model import and task definition to ONNX-based inference in ROS 2 nodes, alongside curriculum learning and network variants (MLP and LSTM) to improve dynamic obstacle handling. The study benchmarks against Nav2 and demonstrates zero-shot transfer to real robots (e.g., TurtleBot4 Lite), illustrating that end-to-end RL can achieve competitive performance and rapid deployment on custom platforms. The results point to low-code, RL-driven mobile robot navigation as a practical option, while acknowledging current limitations in generalizing to dynamic real-world settings and the need for reward tuning and further training strategies.

Abstract

Unprecedented agility and dexterous manipulation have been demonstrated with controllers based on deep reinforcement learning (RL), with a significant impact on legged and humanoid robots. Modern tooling and simulation platforms, such as NVIDIA Isaac Sim, have been enabling such advances. This article focuses on demonstrating the applications of Isaac in local planning and obstacle avoidance, one of the most fundamental ways in which a mobile robot interacts with its environment. Although there is extensive research on proprioception-based RL policies, the article highlights that approaches to exteroception are less standardized and reproducible. At the same time, the article aims to provide a base framework for end-to-end local navigation policies and to show how a custom robot can be trained in such a simulation environment. We benchmark end-to-end policies against Nav2, the state-of-the-art navigation stack in the Robot Operating System (ROS). We also cover the sim-to-real transfer process by demonstrating zero-shot transferability of policies trained in the Isaac simulator to real-world robots. This is further evidenced by tests with different simulated robots, which show the generalization of the learned policy. Finally, the benchmarks demonstrate performance comparable to Nav2, opening the door to quick deployment of state-of-the-art end-to-end local planners for custom robot platforms and, importantly, to extending them by expanding the state and action spaces or task definitions for more complex missions. Overall, this article introduces the most important steps, and aspects to consider, in deploying RL policies for local path planning and obstacle avoidance, with Isaac Sim for training, Gazebo for testing, and ROS 2 for real-time inference on real robots. The code is available at https://github.com/sahars93/RL-Navigation.

Paper Structure (20 sections, 2 equations, 15 figures, 1 table)

Figures (15)

  • Figure 1: Conceptual illustration of the sim-to-real workflow described in this article. In the first step, we utilize different existing robot models, while also describing the Isaac model importer functionality in Section 3. In the second step, we describe key considerations for RL policy training in Section 4, along with the setup of different static and dynamic environments. In the third step, we provide template Robot Operating System (ROS 2) nodes and guidance on Gazebo testing in Sections 4 and 5. Additionally, we benchmark the performance against the state-of-the-art Nav2 navigation and planning algorithms. Finally, in the fourth step, we demonstrate the zero-shot sim-to-real transfer capabilities in Section 5.
  • Figure 2: Robots and Environments
  • Figure 3: General structure of a new RL task definition in OmniIsaacGym.
  • Figure 4: Core components of task.yaml.
  • Figure 5: Core components of train.yaml.
  • ...and 10 more figures