
Flip Stunts on Bicycle Robots using Iterative Motion Imitation

Jeonghwan Kim, Shamel Fahmi, Seungeun Rho, Sehoon Ha, Gabriel Nelson

Abstract

This work demonstrates a front-flip on bicycle robots via reinforcement learning, particularly by imitating reference motions that are infeasible and imperfect. To address this, we propose Iterative Motion Imitation (IMI), a method that iteratively imitates trajectories generated by prior policy rollouts. Starting from an initial reference that is kinematically or dynamically infeasible, IMI helps train policies that lead to feasible and agile behaviors. We demonstrate our method on the Ultra-Mobility Vehicle (UMV), a bicycle robot designed to enable agile behaviors. From a self-colliding table-to-ground flip reference generated by a model-based controller, we are able to train policies that enable ground-to-ground and ground-to-table front-flips. We show that, compared to single-shot motion imitation, IMI results in policies with higher success rates that transfer robustly to the real world. To our knowledge, this is the first unassisted acrobatic flip behavior on such a platform.
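The core loop of IMI, as described in the abstract (and in Figure 4), can be sketched in a few lines: train a policy against the current reference, roll it out, and adopt the rollout as the next, more feasible reference. The sketch below is a toy illustration under stated assumptions, not the authors' implementation: a 1D plant with a per-step actuation limit stands in for the bicycle-robot simulator, and `rollout` stands in for "train a policy to imitate the reference, then roll it out". The function names and the `MAX_STEP` constant are hypothetical.

```python
MAX_STEP = 0.3  # toy actuation limit: the plant cannot move farther than this per step

def rollout(reference):
    """Track the reference as closely as the toy dynamics allow.

    Stands in for 'train policy pi_n to imitate reference xi_{n-1}, then
    roll it out in simulation'. The returned trajectory is feasible by
    construction, since it was actually executed by the (toy) plant.
    """
    state = reference[0]
    traj = [state]
    for target in reference[1:]:
        # Clip the demanded motion to what the dynamics permit.
        delta = max(-MAX_STEP, min(MAX_STEP, target - state))
        state += delta
        traj.append(state)
    return traj

def iterative_motion_imitation(xi_0, n_iters=5):
    """IMI loop: each iteration imitates the previous rollout, not xi_0."""
    xi = xi_0
    for _ in range(n_iters):
        xi = rollout(xi)  # policy pi_n's rollout becomes reference xi_n
    return xi

# A dynamically infeasible reference: it demands a jump of 1.0 in one step,
# well beyond the 0.3 actuation limit.
xi_0 = [0.0, 1.0, 1.0, 1.0]
xi_final = iterative_motion_imitation(xi_0)
```

The key property the toy preserves is that the iteration converges to a fixed point: once the reference is feasible, imitating it reproduces it, so the refined reference `xi_final` respects the actuation limit even though `xi_0` did not.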


Paper Structure

This paper contains 22 sections, 5 equations, 9 figures, and 2 tables.

Figures (9)

  • Figure 2: UMV executing a front-flip learned via IMI.
  • Figure 3: A 2D overview of the model used in this work.
  • Figure 4: An overview of IMI. The first iteration starts with an initial reference $\xi_0$ used to train policy $\pi_1$. Then, reference $\xi_{n-1}$ generated by policy $\pi_{n-1}$ becomes the new, more feasible reference used to train a new policy $\pi_n$. This loop continues, progressively refining the motion until a high-performance, hardware-deployable policy is achieved.
  • Figure 5: Overlaid screenshots of the UMV performing a front flip.
  • Figure 6: Comparison of training from scratch ($\xi_0$) versus using refined reference ($\xi_1$). Each run uses 3 seeds. The solid line is the mean and the shaded area is the standard deviation.
  • ...and 4 more figures