Learning the Approach During the Short-loading Cycle Using Reinforcement Learning

Carl Borngrund; Ulf Bodin; Henrik Andreasson; Fredrik Sandin

TL;DR

The study addresses automating the short-loading cycle in wheel-loader/dump-truck workflows using reinforcement learning. A PPO-based actor-critic agent is trained in a PyBullet-based simulation to approach a dump truck and lift the bucket, with a task-specific reward driving progress toward the target while maintaining lift. The agent is then deployed on a real Volvo L180H without additional training, demonstrating qualitative transfer despite a ~3 s sensor delay and simplified real-vehicle dynamics. The findings show potential for simulation-to-reality automation in this domain, while highlighting the need for expanded control inputs and higher-fidelity sensing for robust, multi-signal automation in varied environments.
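For readers unfamiliar with PPO, the standard clipped surrogate loss can be sketched in a few lines of NumPy. This is the generic textbook form of the objective, not the authors' training code: the function name `ppo_clip_loss`, its argument names, and the default `clip_eps=0.2` are illustrative assumptions that do not come from the paper.

```python
import numpy as np


def ppo_clip_loss(new_logp, old_logp, advantages, clip_eps=0.2):
    """Generic PPO clipped surrogate loss (hypothetical helper, to be minimised).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy. clip_eps is an assumed default.
    """
    new_logp = np.asarray(new_logp, dtype=float)
    old_logp = np.asarray(old_logp, dtype=float)
    advantages = np.asarray(advantages, dtype=float)

    # Probability ratio between the current and the old policy.
    ratio = np.exp(new_logp - old_logp)

    # Unclipped and clipped surrogate terms.
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages

    # PPO maximises the element-wise minimum; negate to obtain a loss.
    return -np.mean(np.minimum(unclipped, clipped))
```

When the two policies agree, the ratio is 1 and the loss reduces to the negative mean advantage. When the new policy makes an advantageous action much more likely, the clipping caps how much credit the update receives.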

Abstract

The short-loading cycle is a repetitive task performed in high volume, making it a strong candidate for automation. In the short-loading cycle, an expert operator navigates towards a pile, fills the bucket with material, navigates to a dump truck, and dumps the material into the tipping body. The operator has to balance productivity against fuel usage to maximise the overall efficiency of the cycle. In addition, difficult interactions, such as the tyre-to-surface interaction, further complicate the cycle. These hard-to-model interactions, which can be difficult to address with rule-based systems, together with the efficiency requirements, motivate us to examine the potential of data-driven approaches. This paper examines whether an agent can be taught, through reinforcement learning, to approach a dump truck's tipping body and get into position to dump material into it. The agent is trained in a 3D simulated environment to perform a simplified navigation task. The trained agent is then transferred directly to a real vehicle to perform the same task with no additional training. The results indicate that the agent can learn to navigate towards the dump truck with a limited number of control signals in simulation and, when transferred to a real vehicle, exhibits the correct behaviour.

Paper Structure (13 sections, 2 equations, 5 figures, 1 table, 1 algorithm)

This paper contains 13 sections, 2 equations, 5 figures, 1 table, 1 algorithm.

Introduction
Related Work & Problem Analysis
Preliminaries & theory
The short-loading cycle
Reinforcement learning
Proximal policy optimization
Experimental Setup
Wheel loader modelling
Simulated environment setup
Real environment setup
Results
Discussion
Conclusion

Figures (5)

Figure 1: This is an overview of the short-loading cycle, where the objective is for the operator of the wheel loader to transfer material from a pile to the tipping body of a dump truck. The short-loading cycle consists of three tasks required for its completion: a) scooping, b) navigation, and c) dumping.
Figure 2: The simulated setup where the agent should learn how to drive forward while lifting the bucket.
Figure 3: Visualisation of joints of the L150H wheel loader.
Figure 4: Reward curve for the agent trained in simulation. The agent was trained for $3\cdot10^6$ timesteps.
Figure 5: Results from the simulation (orange) and real vehicle (blue). The brake and lift are the predicted actions from the agent, while the observation consists of velocity, distance to the stopping point in x, distance to the stopping point in y, and lift angle. All values have been normalised due to the use of a proprietary interface. For example, the top speed of the wheel loader in the simulation was 2 m/s, which is normalised to 1 m/s in the figure.
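The normalisation mentioned in the Figure 5 caption can be illustrated with a small sketch. It assumes simple division by a per-signal maximum. Only the 2 m/s simulation top speed comes from the caption; the distance and lift-angle maxima, the class `ObservationScale`, and the function `normalise_observation` are hypothetical placeholders, since the actual scaling sits behind a proprietary interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObservationScale:
    """Per-signal maxima used for scaling (hypothetical container)."""
    max_speed: float = 2.0        # m/s, simulation top speed quoted in the caption
    max_distance: float = 10.0    # m, assumed value for illustration only
    max_lift_angle: float = 1.0   # rad, assumed value for illustration only


def normalise_observation(velocity, dist_x, dist_y, lift_angle,
                          scale=ObservationScale()):
    """Scale the four observed signals by their assumed maxima."""
    return [
        velocity / scale.max_speed,
        dist_x / scale.max_distance,
        dist_y / scale.max_distance,
        lift_angle / scale.max_lift_angle,
    ]


if __name__ == "__main__":
    # Top speed in simulation maps to 1.0 after normalisation.
    print(normalise_observation(2.0, 5.0, 0.0, 0.5))
```

Under these assumed maxima, a loader moving at the simulated top speed, 5 m short of the stopping point in x, aligned in y, and with the bucket at half the assumed lift range yields the vector `[1.0, 0.5, 0.0, 0.5]`.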