Designing a skilled soccer team for RoboCup: exploring skill-set-primitives through reinforcement learning

Miguel Abreu; Luis Paulo Reis; Nuno Lau

TL;DR

This paper presents FC Portugal's skill-set-primitives framework and a symmetry-enhanced PPO (PPO+PSL) for training autonomous humanoid soccer agents in RoboCup 3DSSL. By leveraging MDP symmetries and a modular skill architecture, the approach achieves fast, stable learning of locomotion, dribbling, and high-level Push strategies, culminating in back-to-back RoboCup titles in 2022 and 2023. The authors document extensive training environments, multi-agent learning strategies, and a staged progression from Sprint-Kick to a comprehensive locomotion set, and they release the codebase to the community to enable replication and further innovation. The work demonstrates that symmetry-aware learning and primitive-based skill composition can deliver strong performance while improving sample efficiency and transferability to real-world robotic teams.

Abstract

The RoboCup 3D Soccer Simulation League serves as a competitive platform for showcasing innovation in autonomous humanoid robot agents through simulated soccer matches. Our team, FC Portugal, developed a new codebase from scratch in Python after RoboCup 2021. The team's performance relies on a set of skills centered around novel unifying primitives and a custom, symmetry-extended version of the Proximal Policy Optimization algorithm. Our methods have been thoroughly tested in official RoboCup matches, where FC Portugal has won the last two main competitions, in 2022 and 2023. This paper presents our training framework, as well as a timeline of skills developed using our skill-set-primitives, which considerably improve the sample efficiency and stability of skills and encourage seamless transitions between them. We start with a very fast sprint-kick developed in 2021 and progress to the most recent skill set, which includes a multi-purpose omnidirectional walk, a dribble with unprecedented ball control, a solid kick, and a push skill. The push addresses low-level collision scenarios and high-level strategies to increase ball possession. We address the resource-intensive nature of this task through an innovative multi-agent learning approach. Finally, we release the team's codebase to the RoboCup community, providing other teams with a robust and modern foundation upon which they can build new features.
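To make the notion of a left-right MDP symmetry more concrete, the following is a minimal sketch of a reflection applied to observations and actions, together with a check of how far a deterministic policy is from being symmetric. The 4-element observation and action layouts, the index permutations, and the sign vectors are hypothetical and chosen only for illustration. The `symmetrize` helper uses plain averaging of a policy with its mirrored counterpart, which is a simpler technique than the symmetry-extended PPO the abstract refers to and is not taken from the paper.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Policy = Callable[[Sequence[float]], Vector]

# Hypothetical layouts, for illustration only (not the robot's real ones):
# observation = [left_knee, right_knee, torso_roll, torso_pitch]
# action      = [left_knee, right_knee, left_hip_roll, right_hip_roll]
OBS_PERM = [1, 0, 2, 3]            # swap left and right knee readings
OBS_SIGN = [1.0, 1.0, -1.0, 1.0]   # roll flips sign under a left-right reflection
ACT_PERM = [1, 0, 3, 2]            # swap left and right joint targets
ACT_SIGN = [1.0, 1.0, -1.0, -1.0]  # assumed: hip roll targets flip sign


def mirror(x: Sequence[float], perm: Sequence[int], sign: Sequence[float]) -> Vector:
    """Reflect a vector by permuting its entries and flipping selected signs."""
    return [sign[i] * x[perm[i]] for i in range(len(x))]


def symmetry_error(policy: Policy, obs: Sequence[float]) -> float:
    """Largest gap between acting on the mirrored state and mirroring the action."""
    act_on_mirrored = policy(mirror(obs, OBS_PERM, OBS_SIGN))
    mirrored_action = mirror(policy(obs), ACT_PERM, ACT_SIGN)
    return max(abs(a - b) for a, b in zip(act_on_mirrored, mirrored_action))


def symmetrize(policy: Policy) -> Policy:
    """Average a policy with its mirrored counterpart (valid because the
    reflection above is its own inverse)."""
    def wrapped(obs: Sequence[float]) -> Vector:
        direct = policy(obs)
        via_mirror = mirror(policy(mirror(obs, OBS_PERM, OBS_SIGN)), ACT_PERM, ACT_SIGN)
        return [0.5 * (a + b) for a, b in zip(direct, via_mirror)]
    return wrapped


def toy_policy(obs: Sequence[float]) -> Vector:
    """A deliberately asymmetric toy policy used to exercise the helpers."""
    return [obs[0], 0.5 * obs[1], obs[2], 0.0]
```

Calling `symmetry_error(toy_policy, obs)` on a sample observation gives a nonzero gap, while `symmetry_error(symmetrize(toy_policy), obs)` is zero up to rounding. A loss-based approach, as the abstract suggests the authors use, would instead penalize this kind of gap during training rather than enforce it by construction.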

Paper Structure (29 sections, 4 equations, 14 figures, 35 tables)

Introduction
Background
MDP symmetries
Learning algorithm
3D Simulation League
Related Work
Learning paradigms
Practical applications
Skill-set-primitives
Skill Sets
Sprint-Kick
Locomotion set
Training Environment
Sprint-Kick
Locomotion set
...and 14 more sections

Figures (14)

Figure 1: Timeline of skills (based on skill-set-primitives) that were used in official competitions. Each rhombus indicates the tally of goals directly attributed to each skill in the respective competition.
Figure 2: Overview of the robot's primitive-based motion framework and its hierarchical structure. The lower layer comprises skill-set-primitives (blue boxes) for two skill sets: the current locomotion set used by FC Portugal and the Sprint-Kick used in 2021. These primitives have no feedback from the environment and serve as a foundation to develop complex skills from the second layer (white boxes). The high-level Push (yellow box) is the only behavior trained in a multi-agent environment.
Figure 3: Execution of the Sprint-Kick skill set. The robot sprints forward for 1.4 seconds, then changes direction to pursue the ball, and, when closer than 1 meter ($\alpha$), it starts the kick stage.
Figure 4: Example of a locomotion set execution sequence. The Step Baseline operates in the background as the agent walks to the ball, pushes it past the opponent, dribbles to an empty space, gradually reverts to the Step Baseline to transition to walking (a), and positions itself for the kick. It then performs a long kick and falls (b), gets up, and resumes walking.
Figure 5: Sprint-Kick training environment. Symmetry operations abstract the RL algorithm from the left and right sides of the robot, forcing the policy to learn a symmetric behavior. The cycle capture primitive is extracted from Sprint Forward and added to Curved Sprint and Kick.
...and 9 more figures
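The stage sequence described in the Figure 3 caption can be sketched as a small state selector. The 1.4 second sprint duration and the 1 meter kick threshold ($\alpha$) come from the caption; the stage names, the `next_stage` helper, and the assumptions that the forward sprint is never cut short and that a started kick is never interrupted are illustrative and not taken from the team's codebase.

```python
from enum import Enum


class Stage(Enum):
    SPRINT_FORWARD = "sprint_forward"
    CURVED_SPRINT = "curved_sprint"
    KICK = "kick"


SPRINT_FORWARD_DURATION_S = 1.4  # from the Figure 3 caption
KICK_DISTANCE_M = 1.0            # alpha in the Figure 3 caption


def next_stage(current: Stage, elapsed_s: float, ball_distance_m: float) -> Stage:
    """Choose the Sprint-Kick stage for the next control step (hypothetical helper)."""
    if current is Stage.KICK:
        # Assumption: once the kick stage starts, it runs to completion.
        return Stage.KICK
    if elapsed_s < SPRINT_FORWARD_DURATION_S:
        # Assumption: the initial forward sprint always lasts the full duration.
        return Stage.SPRINT_FORWARD
    if ball_distance_m < KICK_DISTANCE_M:
        return Stage.KICK
    # Otherwise keep steering toward the ball.
    return Stage.CURVED_SPRINT
```

In this sketch each stage would be served by its own learned policy, and the selector only decides which one is active at a given time.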
