Low-cost Real-world Implementation of the Swing-up Pendulum for Deep Reinforcement Learning Experiments

Peter Böhm, Pauline Pounds, Archie C. Chapman

TL;DR

This work confronts the sim-to-real gap in DRL by delivering a low-cost, real-world inverted pendulum platform paired with a MuJoCo-based digital twin to study real-world delays in sensing, computation, and actuation. The authors present a full hardware stack (arm, pendulum, encoder, servo, couplings, power, MCU) and a Gym-compatible PendulumR environment that bridges hardware and simulation via asynchronous MQTT communication. A key contribution is the R-DQN and R-TD3 demonstrations, which pair a GRU-based sequence encoder with off-policy DRL in both real and simulated settings, illustrating the pronounced gap between real-world and simulated training and the value of non-blocking, delay-aware training. The platform emphasizes accessibility and educational use, enabling students and researchers to design, implement, and evaluate DRL methods in realistic, imperfect robotics settings, with logging and modular interfaces to facilitate further research on sim-to-real transfer and system identification.

Abstract

Deep reinforcement learning (DRL) has had success in virtual and simulated domains, but due to key differences between simulated and real-world environments, DRL-trained policies have had limited success in real-world applications. To help researchers bridge the sim-to-real gap, we describe a low-cost physical inverted pendulum apparatus and software environment for exploring sim-to-real DRL methods. In particular, the design of our apparatus enables detailed examination of the delays that arise in physical systems when sensing, communicating, learning, inferring and actuating. Moreover, we wish to improve access to educational systems, so our apparatus uses readily available materials and parts to reduce cost and logistical barriers. Our design shows how commercial, off-the-shelf electronics, electromechanical and sensor systems, combined with common metal extrusions, dowel and 3D-printed couplings, provide a pathway for affordable physical DRL apparatus. The physical apparatus is complemented with a simulated environment implemented using a high-fidelity physics engine and OpenAI Gym interface.
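To make the notion of actuation delay concrete, the following is a minimal illustrative sketch, not the authors' PendulumR environment. It assumes a toy torque-driven pendulum in which each commanded action takes effect a fixed number of steps late. The class name `DelayedPendulum`, all parameter values, and the fixed-step delay model are hypothetical; the delays in the physical apparatus are asynchronous and need not be constant.

```python
import math
from collections import deque


class DelayedPendulum:
    """Toy pendulum whose commanded torque is applied `delay_steps` steps late.

    Hypothetical illustration only: names and parameter values are assumptions,
    not taken from the paper.
    """

    def __init__(self, delay_steps=2, dt=0.02, g=9.81,
                 length=0.5, mass=0.1, max_torque=1.0):
        self.dt, self.g, self.length, self.mass = dt, g, length, mass
        self.max_torque = max_torque
        # Queue of commands still "in flight", pre-filled with zero torque.
        self.pending = deque([0.0] * delay_steps)
        self.theta = math.pi  # angle measured from upright; pi = hanging down
        self.omega = 0.0      # angular velocity

    def step(self, torque):
        # Clip the command to the actuator limits, then enqueue it.
        torque = max(-self.max_torque, min(self.max_torque, torque))
        self.pending.append(torque)
        # The torque acting now is the one commanded `delay_steps` steps ago.
        applied = self.pending.popleft()
        inertia = self.mass * self.length ** 2
        alpha = (self.g / self.length) * math.sin(self.theta) + applied / inertia
        # Semi-implicit Euler integration of the pendulum dynamics.
        self.omega += alpha * self.dt
        self.theta += self.omega * self.dt
        obs = (math.cos(self.theta), math.sin(self.theta), self.omega)
        return obs, applied


env = DelayedPendulum(delay_steps=2)
applied = [env.step(1.0)[1] for _ in range(4)]
print(applied)
```

In this sketch the agent's first two commands have no effect on the state it observes, so a policy that conditions only on the latest observation cannot tell which of its actions are still in flight. Feeding the agent a short history of observations and actions is one common way to recover that information.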

Paper Structure

This paper contains 27 sections, 1 equation, 6 figures, 1 table.

Figures (6)

  • Figure 1: Pendulum apparatus technical details.
  • Figure 2: 3D printed pendulum coupler.
  • Figure 3: Servo block with a long shaft hub.
  • Figure 4: Simulated environment using MuJoCo physics engine.
  • Figure 5: Learning curves for real-world and simulated environments. The simulated environment is easily solved by all algorithms; however, the plain versions of both DQN and TD3 fail in the real world. TD3 training led to servo failure in 2 out of 3 experiments.
  • ...and 1 more figures