Solving Multi-Entity Robotic Problems Using Permutation Invariant Neural Networks

Tianxu An, Joonho Lee, Marko Bjelonic, Flavio De Vincenti, Marco Hutter

TL;DR

The paper tackles the challenge of coordinating multiple robots in real-world environments where the number of entities (robots, goals, objects) varies and is not known in advance. It introduces a decentralized multi-agent reinforcement learning (MARL) framework that uses permutation-invariant global entity encoders (GEEs) to process heterogeneous entity sets, enabling zero-shot generalization to new numbers of robots, goals, and objects. The approach is validated on three tasks: multi-robot multi-goal (MRMG) navigation, box packing, and soccer. The results show scalable collaboration, dynamic entity prioritization, and competitive performance against a model predictive control (MPC) baseline while maintaining constant inference time. Real-world MRMG experiments demonstrate collision avoidance and adaptive task distribution, underscoring the practical impact of permutation-invariant encoders for scalable, heuristics-free multi-robot control.

Abstract

Challenges in real-world robotic applications often stem from managing multiple, dynamically varying entities such as neighboring robots, manipulable objects, and navigation goals. Existing multi-agent control strategies face scalability limitations, struggling to handle arbitrary numbers of entities. Additionally, they often rely on engineered heuristics for assigning entities among agents. We propose a data-driven approach to address these limitations by introducing a decentralized control system using neural network policies trained in simulation. Leveraging permutation-invariant neural network architectures and model-free reinforcement learning, our approach allows control agents to autonomously determine the relative importance of different entities without being biased by ordering or limited by a fixed capacity. We validate our approach through both simulations and real-world experiments involving multiple wheeled-legged quadrupedal robots, demonstrating their collaborative control capabilities. We demonstrate the effectiveness of our architectural choice through experiments with three exemplary multi-entity problems. Our analysis underscores the pivotal role of the end-to-end trained permutation-invariant encoders in achieving scalability and improving task performance in multi-object manipulation and multi-goal navigation problems. The adaptability of our policy is further evidenced by its ability to manage varying numbers of entities in a zero-shot manner, showcasing near-optimal autonomous task distribution and collision avoidance behaviors.

Paper Structure (34 sections, 6 equations, 9 figures, 4 tables)

Figures (9)

  • Figure 1: Multi-entity problems studied in this work. (A) Robot 1 is given multiple goals to visit while interacting with other robots. (B) Multiple robots packing multiple boxes into the goal region. (C) Soccer.
  • Figure 2: Pipeline overview for a decentralized ego robot. (A) The ego robot and its environment: the robot can observe entities such as neighbors, goals, and objects. (A-1) MRMG navigation environment. (A-2) Box packing environment. (A-3) Soccer environment. (B) All entities belonging to one category are passed to one GEE. Each entity state is first passed through a weight-sharing encoder to obtain a local entity feature. The local entity features are then max-pooled to obtain the global entity feature for that entity category. (C) The global entity features of all entity categories are concatenated to form the universal entity feature, which is then concatenated with the ego robot's proprioceptive and exteroceptive observations to form the input to the high-level policy. (D) The high-level policy is an RL policy network that outputs high-level actions for the ego robot. Depending on the task, the high-level action is either a target body position or a target body velocity. (E) The low-level policy takes the high-level actions and outputs joint torques to control the ego robot. The low-level policy is pretrained based on the work by lee2022control and is kept fixed while the high-level policies are trained.
  • Figure 3: Robot experiments. (A) MRMG navigation with two robots and three goals. (B, C) Dead-end experiment. (D) Single-robot Multi-goal navigation.
  • Figure 4: Saliency maps of two robots during the box packing task. The four robots have to move the boxes into the goal point. (A) At the beginning, robot G does not focus on any box but attends to the states of robot R. (B) Then robot G joins robot R to transport the box R together. (C) Once all boxes are at the goal except for box R, all robots shift their focus to box R.
  • Figure 5: Impact of Robot Numbers on Packing tasks. (A) Success Rate (bar columns) and Completion Time (dotted lines) of 1-10 robots packing 1-10 boxes. (B) 1 robot transports a box by walking sideways. (C) 2 robots are faster by squeezing the box in between them. (D) 2 robots can transport more than 1 box at the same time.
  • ...and 4 more figures
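
The encoder pipeline described in the Figure 2 caption (panels B and C) can be illustrated with a minimal sketch. This is not the authors' implementation: the single linear+ReLU layer with random weights stands in for a trained encoder, and the function names, feature sizes, example states, and the zero vector used for an empty category are all assumptions made for illustration. The sketch only shows why max-pooling over shared-encoder outputs yields a fixed-size policy input that does not depend on the order or the number of entities.

```python
import random


def make_encoder(in_dim, out_dim, seed):
    """Hypothetical shared encoder: one linear layer + ReLU with fixed
    random weights (a stand-in for a trained network)."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)]
               for _ in range(out_dim)]
    biases = [rng.uniform(-1.0, 1.0) for _ in range(out_dim)]

    def encode(state):
        # The same weights are applied to every entity of the category.
        return [max(0.0, sum(w * x for w, x in zip(row, state)) + b)
                for row, b in zip(weights, biases)]

    return encode


def global_entity_feature(encode, entity_states, out_dim):
    """Encode each entity, then max-pool element-wise over all entities."""
    if not entity_states:
        # Assumption: an empty category is represented by zeros.
        return [0.0] * out_dim
    local_features = [encode(state) for state in entity_states]
    return [max(column) for column in zip(*local_features)]


def policy_input(ego_obs, categories):
    """Concatenate the global feature of every category with the ego
    robot's own observations to form the high-level policy input."""
    features = []
    for encode, states, out_dim in categories:
        features.extend(global_entity_feature(encode, states, out_dim))
    return features + list(ego_obs)


FEATURE_DIM = 4
neighbor_enc = make_encoder(3, FEATURE_DIM, seed=0)  # one encoder per category
goal_enc = make_encoder(2, FEATURE_DIM, seed=1)

neighbors = [[0.5, -1.0, 0.1], [2.0, 0.3, -0.4], [-1.5, 0.8, 0.0]]
goals = [[1.0, 1.0], [-2.0, 0.5]]
ego_obs = [0.1, 0.2]

x = policy_input(ego_obs, [(neighbor_enc, neighbors, FEATURE_DIM),
                           (goal_enc, goals, FEATURE_DIM)])
# Same entities in a different order.
x_shuffled = policy_input(ego_obs, [(neighbor_enc, neighbors[::-1], FEATURE_DIM),
                                    (goal_enc, goals[::-1], FEATURE_DIM)])
# Fewer neighbors: the input size stays the same.
x_fewer = policy_input(ego_obs, [(neighbor_enc, neighbors[:1], FEATURE_DIM),
                                 (goal_enc, goals, FEATURE_DIM)])

print(x == x_shuffled, len(x), len(x_fewer))  # prints: True 10 10
```

Because the pooled feature has a fixed length per category, the downstream policy network sees the same input size whether one neighbor or many are observed, which is consistent with the constant inference time and zero-shot generalization to new entity counts reported in the summary above.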