CycleRL: Sim-to-Real Deep Reinforcement Learning for Robust Autonomous Bicycle Control

Gelu Liu, Teng Wang, Zhijie Wu, Junliang Wu, Songyuan Li, Xiangwei Zhu

Abstract

Autonomous bicycles offer a promising, agile solution for urban mobility and last-mile logistics. However, conventional control strategies often struggle with the underactuated, nonlinear dynamics of bicycles: they are sensitive to model mismatches and adapt poorly to real-world uncertainties. To address this, this paper presents CycleRL, the first sim-to-real deep reinforcement learning (DRL) framework designed for robust autonomous bicycle control. Our approach trains an end-to-end neural control policy within the high-fidelity NVIDIA Isaac Sim environment, leveraging Proximal Policy Optimization (PPO) to circumvent the need for an explicit dynamics model. The framework features a composite reward function tailored for concurrent balance maintenance, velocity tracking, and steering control. Crucially, systematic domain randomization is employed to bridge the simulation-to-reality gap and facilitate direct transfer. In simulation, CycleRL achieves a 99.90% balance success rate, a steering tracking error of 1.15°, and a velocity tracking error of 0.18 m/s. These quantitative results, coupled with successful hardware transfer, validate DRL as an effective paradigm for autonomous bicycle control, offering superior adaptability over traditional methods. Video demonstrations are available at https://anony6f05.github.io/CycleRL/.

Paper Structure

This paper contains 35 sections, 9 equations, 7 figures, 9 tables.

Figures (7)

  • Figure 1: Real-world deployment of the proposed autonomous bicycle control framework across different conditions.
  • Figure 2: Sketch of the two-wheeled vehicle [persson2021trajectory].
  • Figure 3: Illustration of the reward function design for balancing and steering control. The total reward aggregates performance incentives (green arrows) and control effort penalties (red arrows) into a unified scalar to guide the policy learning.
  • Figure 4: Bicycle and terrain modeling in Isaac Sim.
  • Figure 5: Training curves and convergence analysis.
  • ...and 2 more figures
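
The Figure 3 caption describes a total reward that combines performance incentives with control effort penalties into a single scalar. A minimal sketch of that general pattern is given below. It is not the paper's reward: the function name, the exponential kernels, the quadratic effort term, and every weight are assumptions chosen for illustration only, since this overview does not state the actual terms or coefficients.

```python
import math


def composite_reward(roll, speed, target_speed, steer, target_steer,
                     steer_cmd, drive_cmd,
                     w_balance=1.0, w_speed=0.5, w_steer=0.5, w_effort=0.01):
    """Hypothetical composite reward; all kernels and weights are assumed.

    Angles are in radians, speeds in m/s, commands are unitless actions.
    """
    # Performance incentives: each term lies in (0, 1] and peaks at zero error.
    r_balance = math.exp(-roll ** 2)                     # stay upright
    r_speed = math.exp(-(speed - target_speed) ** 2)     # track commanded speed
    r_steer = math.exp(-(steer - target_steer) ** 2)     # track commanded steering

    # Control effort penalty: grows quadratically with action magnitude.
    p_effort = steer_cmd ** 2 + drive_cmd ** 2

    # Aggregate incentives and penalties into one scalar for the policy.
    return (w_balance * r_balance
            + w_speed * r_speed
            + w_steer * r_steer
            - w_effort * p_effort)


# An upright bicycle that tracks both targets with zero control effort
# collects the sum of the incentive weights.
best = composite_reward(0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0)
# The same state with a 0.3 rad roll angle earns strictly less.
tilted = composite_reward(0.3, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0)
```

With the assumed weights, the reward is largest when all tracking errors and commands are zero, and any roll error or added control effort lowers it. This is the trade-off the caption attributes to the green and red arrows.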