Discovery and Deployment of Emergent Robot Swarm Behaviors via Representation Learning and Real2Sim2Real Transfer

Connor Mattson, Varun Raveendra, Ricardo Vega, Cameron Nowzari, Daniel S. Drew, Daniel S. Brown

TL;DR

This work presents Real2Sim2Real Behavior Discovery via Self-Supervised Representation Learning, a pipeline that automatically discovers emergent swarm behaviors in simulation and deploys them directly on real robots. The gap between simulation and reality is addressed with a Real2Sim2Real (RSRS) step, in which measurements from the real robots inform the simulator. The pipeline replaces hand-crafted behavioral metrics with a self-supervised learned embedding (trained with a SimCLR-style framework), uses novelty search to explore the swarm behavior space, and then clusters the results to select deployable candidates. The authors demonstrate deployable emergent behaviors on a low-cost HeRo+ swarm, reporting high one-shot and multi-attempt deployment success, and show that RSRS is critical for avoiding the non-deployable artifacts that arise in naive simulators. The approach advances swarm robotics by enabling scalable, automatic discovery of collective behaviors that can be deployed in the real world.

Abstract

Given a swarm of limited-capability robots, we seek to automatically discover the set of possible emergent behaviors. Prior approaches to behavior discovery rely on human feedback or hand-crafted behavior metrics to represent and evolve behaviors and only discover behaviors in simulation, without testing or considering the deployment of these new behaviors on real robot swarms. In this work, we present Real2Sim2Real Behavior Discovery via Self-Supervised Representation Learning, which combines representation learning and novelty search to discover possible emergent behaviors automatically in simulation and enable direct controller transfer to real robots. First, we evaluate our method in simulation and show that our proposed self-supervised representation learning approach outperforms previous hand-crafted metrics by more accurately representing the space of possible emergent behaviors. Then, we address the reality gap by incorporating recent work in sim2real transfer for swarms into our lightweight simulator design, enabling direct robot deployment of all behaviors discovered in simulation on an open-source and low-cost robot platform.
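
As an illustration of the novelty search component named in the abstract, the following minimal sketch scores a candidate behavior by its mean distance to its k nearest neighbors in an archive of previously seen behavior embeddings. This is the common k-nearest-neighbor formulation of novelty, assumed here for illustration; the excerpt does not state the exact formulation used in the paper, and the names `novelty_score`, `archive`, and `k` are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def novelty_score(candidate, archive, k=3):
    """Mean distance from `candidate` to its k nearest archive entries.

    Assumption: behaviors are compared in an embedding space (in the paper,
    a learned one; here, any fixed-length vector of floats).
    """
    if not archive:
        # With nothing to compare against, every behavior is maximally novel.
        return float("inf")
    nearest = sorted(euclidean(candidate, other) for other in archive)[:k]
    return sum(nearest) / len(nearest)


# Toy 2-D embeddings standing in for learned behavior representations.
archive = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
familiar = novelty_score((0.0, 0.0), archive, k=2)
unfamiliar = novelty_score((10.0, 10.0), archive, k=2)
print(familiar, unfamiliar)
```

A candidate far from everything in the archive receives a higher score, so a search that favors high scores is pushed toward behaviors unlike those already found.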

Paper Structure

This paper contains 26 sections, 4 equations, 5 figures, 2 tables.

Figures (5)

  • Figure 1: Real2Sim2Real Behavior Discovery via Self-Supervised Representation Learning discovers new collective behaviors for robot swarms while addressing the Sim2Real gap. 1) Real robot measurements are incorporated into a software physics model. 2) The model is used to generate thousands of randomly sampled behavior videos, which are used to train a representation encoder using self-supervised learning. 3) The trained encoder is used to discover novel emergent behaviors. 4) Interesting behaviors found in simulation can be deployed on real robots without the need for controller adjustment.
  • Figure 2: HeRo+ Robots: We deploy newly discovered behaviors on a fleet of HeRo+ robots. (a) A single HeRo+ robot uses unicycle commands to locomote and time-of-flight sensing to detect other robots. Our open-source robot design is mostly 3D-printed and costs approximately $80 USD, making it an extremely low-cost option for swarm research. (b) Our swarm of 11 HeRo+ robots, 8 of which are used in this study.
  • Figure 3: Discovered Behaviors. Behaviors a-c were automatically discovered and deployed on HeRo+ robots using Real2Sim2Real (RSRS) Representation Learning for Behavior Discovery. Behaviors d-e were discovered in a simulator without RSRS improvements, but could not be deployed to the real world and were not discovered when RSRS was added. (a) Dispersal: Agents cover the environment by moving away from other robots. (b) Cyclic Pursuit: Agents form a circular chain and revolve around the center. (c) Aggregation: Agents cluster together in the middle of the environment. (d) Milling: All agents revolve around the centroid of the swarm, but they do not form a perfect circle as in cyclic pursuit. (e) Wall-Following: Agents slide along the walls and trace the outer edge of the environment.
  • Figure 4: Representation Confusion Matrices for (left) the baseline hand-crafted representations and (right) the self-supervised representations on a held-out set of 500 labeled test videos. Classes along the horizontal axis indicate the labeled class of an anchor behavior and the values along the vertical axis display the frequency with which a behavior from the anchor's class (on-diagonal) was embedded closer to the anchor when compared with a behavior from one of the other classes (off-diagonal). Larger diagonal values indicate stronger within-class correlation in the embedded representation space.
  • Figure 5: t-SNE Embeddings for the (a) baseline hand-crafted representations and the (b) self-supervised representations on a held-out set of 500 labeled test videos. Qualitatively, the hand-crafted baseline is able to densely associate cyclic pursuit behaviors, but fails to differentiate dispersal from random behaviors. In our approach, behavior features have more variance, but are more closely associated with behaviors from the same class, notably dispersal, which is embedded more distinctly from random behaviors than in the baseline case.
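
The comparison described in the Figure 4 caption can be sketched as a triplet test: for an anchor behavior, check whether another behavior of the same class is embedded closer than a behavior of a different class. The code below is one reading of that caption under stated assumptions (Euclidean distance, exhaustive triplets, toy 2-D embeddings); it is not the authors' evaluation code, and `within_class_closer_rate` is a hypothetical name.

```python
import math
from itertools import permutations


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def within_class_closer_rate(embeddings, labels):
    """For each ordered class pair (A, B), return the fraction of triplets
    (anchor in A, positive in A, negative in B) in which the positive is
    closer to the anchor than the negative is.
    """
    by_class = {}
    for vec, lab in zip(embeddings, labels):
        by_class.setdefault(lab, []).append(vec)

    rates = {}
    for a_cls, b_cls in permutations(by_class, 2):
        wins = total = 0
        same = by_class[a_cls]
        for i, anchor in enumerate(same):
            for j, positive in enumerate(same):
                if i == j:
                    continue  # an anchor cannot be its own positive
                for negative in by_class[b_cls]:
                    total += 1
                    if euclidean(anchor, positive) < euclidean(anchor, negative):
                        wins += 1
        if total:  # skip classes with fewer than two members
            rates[(a_cls, b_cls)] = wins / total
    return rates


# Toy embeddings: two well-separated classes.
embeddings = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
labels = ["aggregation", "aggregation", "dispersal", "dispersal"]
rates = within_class_closer_rate(embeddings, labels)
print(rates)
```

Rates near 1.0 for every class pair correspond to the strong within-class association that the caption attributes to large diagonal values; a rate near 0.5 would mean the embedding does not separate the two classes.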