Sim-to-real transfer of active suspension control using deep reinforcement learning

Viktor Wiberg, Erik Wallin, Arvid Fälldin, Tobias Semberg, Morgan Rossander, Eddie Wadbro, Martin Servin

TL;DR

This work tackles the sim-to-real transfer gap for deep reinforcement learning controllers governing active suspensions on a heavy hydraulic forestry vehicle. It combines multibody dynamics simulation with system identification, domain randomization, and action-delay-aware training to produce policies that transfer to a real Xt28 forwarder. Key findings show that incorporating actuator delays and a penalty for rapid action changes yields smoother, more transferable control, with policies C1 and C2 achieving strong real-world performance across driving, vibration, and ramp scenarios. The results imply that accurate actuator modelling and training-time exposure to delays are crucial for effective sim-to-real transfer in hydraulic, slow-actuation platforms, while perceptual components may still be largely refined within simulation.

Abstract

We explore sim-to-real transfer of deep reinforcement learning controllers for a heavy vehicle with active suspensions designed for traversing rough terrain. While related research primarily focuses on lightweight robots with electric motors and fast actuation, this study uses a forestry vehicle with a complex hydraulic driveline and slow actuation. We simulate the vehicle using multibody dynamics and apply system identification to find an appropriate set of simulation parameters. We then train policies in simulation using various techniques to mitigate the sim-to-real gap, including domain randomization, action delays, and a reward penalty to encourage smooth control. In reality, the policies trained with action delays and a penalty for erratic actions perform at nearly the same level as in simulation. In experiments on level ground, the simulated and real motion trajectories closely overlap when turning to either side, as well as in a route tracking scenario. When faced with a ramp that requires active use of the suspensions, the simulated and real motions are in close alignment. This shows that the actuator model together with system identification yields a sufficiently accurate model of the actuators. We observe that policies trained without the additional action penalty exhibit fast switching or bang-bang control. Such policies produce smooth motions and high performance in simulation but transfer poorly to reality. We find that policies make marginal use of the local height map for perception, showing no indications of predictive planning. However, the strong transfer capabilities imply that further development concerning perception and performance can be largely confined to simulation.
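
To make two of the techniques named in the abstract concrete, the following is a minimal sketch of an action delay and a penalty on rapid action changes. It is not the authors' implementation: the class and function names, the per-episode delay range, the quadratic form of the penalty, and the weight value are all assumptions chosen for illustration.

```python
from collections import deque
import random


class DelayedActionBuffer:
    """FIFO queue that releases each action `delay_steps` control steps late.

    Hypothetical helper: models actuation delay by holding actions in a queue
    pre-filled with a neutral action.
    """

    def __init__(self, delay_steps, neutral_action):
        self.queue = deque(list(neutral_action) for _ in range(delay_steps))

    def push(self, action):
        # Store the newest action and return the one that is now due.
        self.queue.append(list(action))
        return self.queue.popleft()


def sample_delay_steps(rng, low, high):
    """Draw a delay (in control steps) for one episode; the range is assumed."""
    return rng.randint(low, high)


def action_rate_penalty(action, previous_action, weight=0.1):
    """Negative reward term that grows with the squared change in action.

    Assumed form: -weight * sum((a_t - a_{t-1})^2). The paper's exact
    penalty is not given in this section.
    """
    return -weight * sum((a - b) ** 2 for a, b in zip(action, previous_action))


if __name__ == "__main__":
    rng = random.Random(0)
    delay = sample_delay_steps(rng, 1, 3)
    buffer = DelayedActionBuffer(delay, neutral_action=[0.0, 0.0])

    previous = [0.0, 0.0]
    for step in range(5):
        commanded = [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
        applied = buffer.push(commanded)  # what the actuators would receive
        penalty = action_rate_penalty(commanded, previous)
        previous = commanded
        print(step, applied, round(penalty, 4))
```

In a training loop, the delayed action would be passed to the simulator while the penalty is added to the task reward, so the policy experiences the lag and is discouraged from bang-bang switching.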

Paper Structure (33 sections, 2 equations, 18 figures, 1 table)

Figures (18)

  • Figure 1: Xt28 forwarder on the vibration course.
  • Figure 2: Circuits of the hydrostatic transmission.
  • Figure 3: Simplified hydraulic circuit for the suspension of one pendulum arm. The direction control valve above the diesel engine shaft controls the fluid flow to the actuator. Each of the three boxes represents a different valve position, where the centre box indicates the neutral position with both ports closed. The pump and tank are common.
  • Figure 4: Step responses in arm extension and vertical load per arm in simulation compared to those of the real machine.
  • Figure 5: Total load from simulation compared with measurements from the real-world experiments. The load for each curve corresponds to the sum of vertical forces over all six wheels. The test scenario is the same as in Fig. \ref{fig:cal_run_014_arms}.
  • ...and 13 more figures