Creating a Dynamic Quadrupedal Robotic Goalkeeper with Reinforcement Learning

Xiaoyu Huang; Zhongyu Li; Yanzhen Xiang; Yiming Ni; Yufeng Chi; Yunhao Li; Lizhi Yang; Xue Bin Peng; Koushil Sreenath

TL;DR

This work tackles the problem of enabling a quadrupedal robot to function as a soccer goalkeeper by integrating multiple dynamic locomotion skills with a high-level motion planner through a hierarchical reinforcement learning framework. The authors develop three skill-specific controllers (sidestep, dive, jump) parameterized by Bézier curves and a planning policy that selects the most suitable skill and trajectory based on ball and robot states, all validated with zero-shot sim-to-real transfer on a Mini Cheetah. Results show substantial improvements in interception performance over single-skill baselines, achieving up to 87.5% real-world saves and demonstrating the system's ability to handle a range of ball trajectories and speeds. The approach advances the feasibility of real-time, multi-skill, dynamic legged robotics in unstructured tasks like goalkeeping, with implications for broader autonomous sports robotics.

Abstract

We present a reinforcement learning (RL) framework that enables quadrupedal robots to perform soccer goalkeeping tasks in the real world. Soccer goalkeeping with quadrupeds is a challenging problem that combines highly dynamic locomotion with precise and fast non-prehensile object (ball) manipulation. The robot needs to react to and intercept a potentially flying ball using dynamic locomotion maneuvers in a very short amount of time, usually less than one second. In this paper, we propose to address this problem using a hierarchical model-free RL framework. The first component of the framework contains multiple control policies for distinct locomotion skills, which can be used to cover different regions of the goal. Each control policy enables the robot to track random parametric end-effector trajectories while performing one specific locomotion skill, such as jump, dive, or sidestep. These skills are then used by the second component of the framework, a high-level planner that determines a desired skill and end-effector trajectory in order to intercept a ball flying to different regions of the goal. We deploy the proposed framework on a Mini Cheetah quadrupedal robot and demonstrate its effectiveness for various agile interceptions of a fast-moving ball in the real world.
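The parametric end-effector trajectories mentioned in the abstract are described in the TL;DR as Bézier curves. As a rough illustration of what such a parameterization involves, the sketch below evaluates a Bézier curve with de Casteljau's algorithm and samples it into waypoints. The function names, the control points, and the number of samples are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def bezier_point(control_points: Sequence[Point], t: float) -> Point:
    """Evaluate a Bezier curve at t in [0, 1] using de Casteljau's algorithm."""
    if not control_points:
        raise ValueError("need at least one control point")
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    pts = [tuple(float(c) for c in p) for p in control_points]
    # Repeatedly interpolate between neighbouring points until one remains.
    while len(pts) > 1:
        pts = [
            tuple((1.0 - t) * a + t * b for a, b in zip(p, q))
            for p, q in zip(pts, pts[1:])
        ]
    return pts[0]


def sample_trajectory(control_points: Sequence[Point], n_samples: int) -> List[Point]:
    """Sample n_samples evenly spaced waypoints (in t) along the curve."""
    if n_samples < 2:
        raise ValueError("need at least two samples")
    return [
        bezier_point(control_points, i / (n_samples - 1))
        for i in range(n_samples)
    ]


# Hypothetical cubic toe path (x, y, z) in metres; values are illustrative only.
toe_path = [(0.0, 0.0, 0.0), (0.0, 0.1, 0.2), (0.0, 0.3, 0.2), (0.0, 0.4, 0.0)]
trajectory = sample_trajectory(toe_path, 5)
```

Under this reading, a planner would only need to choose a skill and a handful of control points, and a skill-specific controller would track the resulting waypoints; how the paper actually encodes and samples its curves may differ.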

Paper Structure (37 sections, 5 equations, 8 figures)

This paper contains 37 sections, 5 equations, 8 figures.

Introduction
Related Work
Robotic Catching/Hitting of Fast Moving Objects
Dynamic Locomotion Control for Quadrupeds
Legged Robot Soccer
Contributions
Hierarchical RL Framework for Goalkeeping Task with Multi-Skills
Hierarchical RL Framework for Goalkeeping Task with Multi-Skills
The Mini Cheetah Quadrupedal Robot
Locomotion Skills for Goalkeeping
Sidestep
Dive
Jump
Parameterize Multiple Skills using Bézier Curves
Hierarchical Reinforcement Learning Framework
...and 22 more sections

Figures (8)

Figure 1: A quadrupedal robot goalkeeper, Mini Cheetah, saves a soccer ball flying towards the goal using the proposed hierarchical RL framework with multiple locomotion control policies and a motion planning policy. The ball's flight time is only around $0.5$ seconds. Video is at https://youtu.be/iX6OgG67-ZQ.
Figure 2: Proposed hierarchical reinforcement learning framework for creating a quadrupedal robotic goalkeeper. We first develop a set of locomotion control policies for different skills, such as sidestep, dive, and jump. The locomotion control policies are designed to follow random parametric Bézier curves using the robot's end-effectors (swing front toes). The controller outputs desired motor positions at $30$ Hz; after passing through a Low Pass Filter (LPF) (peng2020learning; li2021reinforcement), these are used by the joint-level PD controller to generate motor torques. A motion planner running at $10$ Hz is developed on top of the multiple skill-specific controllers to select the skill to perform as well as the desired end-effector trajectory for the controller to track. The goal of the planner is to enable the robot to intercept the ball with its body before the ball reaches the goal. The controllers and the planner are trained by RL, and the ball position is detected by a deep neural network using an RGB-Depth camera ($30$ Hz).
Figure 3: Three different locomotion skills for goalkeeping. The robot can use different skills to cover different regions of the goal.
Figure 4: Snapshots and shot interception map in simulation with more skills added from left to right. The map represents the goal region. Green marks a save, while red marks a goal (a missed save). Darker colors indicate faster ball speeds. The snapshots visualize how the planner leverages the new skills, and the shot interception map quantitatively illustrates the benefit of adding each skill. Note that the failing corner cases are noticeably reduced by adding the second skill (sidestep) in \ref{fig:2skillsim}, and further reduced by the third skill (dive) in \ref{fig:3skillsim}. The goal saving rates are 65.09%, 72.46%, and 78.11%, respectively.
Figure 5: Experiments with control policies for different skills. The policies transfer directly to the hardware. As designed in Sec. \ref{subsec:skills}, we can observe that the dive skill (\ref{fig:divemocap}) is able to reach a significantly larger range horizontally than sidestep (\ref{fig:swmocap}), while the jump skill (\ref{fig:jumpmocap}) can produce a notable period of flight time, swing the front legs to cover more of the upper area of the goal, and land safely.
...and 3 more figures
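The Figure 2 caption describes a control pipeline in which a 30 Hz controller outputs desired motor positions, a low-pass filter smooths them, a joint-level PD controller turns them into torques, and a planner runs on top at 10 Hz. A minimal sketch of that structure is given below. Only the two rates come from the caption; the first-order filter form, the smoothing factor, the PD gains, and all names are assumptions made for illustration and are not the authors' implementation.

```python
from dataclasses import dataclass
from typing import List, Optional

CONTROL_HZ = 30  # controller rate stated in the Figure 2 caption
PLANNER_HZ = 10  # planner rate stated in the Figure 2 caption
STEPS_PER_PLAN = CONTROL_HZ // PLANNER_HZ  # planner acts every 3rd control step


@dataclass
class LowPassFilter:
    """First-order exponential filter (assumed form; the caption only says 'LPF')."""

    alpha: float  # hypothetical smoothing factor in (0, 1]
    state: Optional[float] = None

    def update(self, x: float) -> float:
        # The first sample initialises the filter; later samples are blended in.
        if self.state is None:
            self.state = x
        else:
            self.state = self.alpha * x + (1.0 - self.alpha) * self.state
        return self.state


def pd_torque(q_des: float, q: float, qd: float, kp: float, kd: float) -> float:
    """Joint-level PD law: torque from position error and joint velocity."""
    return kp * (q_des - q) - kd * qd


def planner_steps(n_control_steps: int) -> List[int]:
    """Indices of the control steps at which the 10 Hz planner would update."""
    return [k for k in range(n_control_steps) if k % STEPS_PER_PLAN == 0]


# Illustrative use with made-up values for a single joint.
lpf = LowPassFilter(alpha=0.5)
q_des_filtered = lpf.update(1.0)  # smoothed target from the 30 Hz controller
tau = pd_torque(q_des_filtered, q=0.5, qd=0.0, kp=10.0, kd=1.0)
```

The point of the sketch is the rate separation: the planner changes the skill and target trajectory only every third control step, while the filter keeps the position targets passed to the PD loop from changing abruptly.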
