Distributed NeRF Learning for Collaborative Multi-Robot Perception

Hongrui Zhao, Boris Ivanovic, Negar Mehr

TL;DR

This paper proposes a collaborative multi-agent perception system in which agents collectively learn a neural radiance field from posed RGB images to represent a scene. It achieves performance comparable to centralized mapping, in which all data is sent to a central server for processing.

Abstract

Effective environment perception is crucial for enabling downstream robotic applications. Individual robotic agents often face occlusion and limited visibility issues, whereas multi-agent systems can offer a more comprehensive mapping of the environment, quicker coverage, and increased fault tolerance. In this paper, we propose a collaborative multi-agent perception system where agents collectively learn a neural radiance field (NeRF) from posed RGB images to represent a scene. Each agent processes its local sensory data and shares only its learned NeRF model with other agents, reducing communication overhead. Given NeRF's low memory footprint, this approach is well-suited for robotic systems with limited bandwidth, where transmitting all raw data is impractical. Our distributed learning framework ensures consistency across agents' local NeRF models, enabling convergence to a unified scene representation. We show the effectiveness of our method through an extensive set of experiments on datasets containing challenging real-world scenes, achieving performance comparable to centralized mapping of the environment where data is sent to a central server for processing. Additionally, we find that multi-agent learning provides regularization benefits, improving geometric consistency in scenarios with sparse input views. We show that in such scenarios, multi-agent mapping can even outperform centralized training.
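
The consistency requirement described above can be written as a consensus-constrained optimization problem. The following is a generic sketch of that form, not an equation taken from the paper. The symbols $N$ (number of agents) and $\mathcal{E}$ (the set of agent pairs that can communicate) are assumptions introduced here, while $\Theta_i$, $L^{\text{img}}_i$, and $D_i$ denote agent $i$'s network weights, image reconstruction loss, and local data.

```latex
\min_{\Theta_1, \dots, \Theta_N} \; \sum_{i=1}^{N} L^{\text{img}}_i(\Theta_i; D_i)
\quad \text{subject to} \quad
\Theta_i = \Theta_j \;\; \forall (i, j) \in \mathcal{E}
```

If the assumed communication graph is connected, the constraints force every agent to hold the same weights, so any feasible solution is a single shared scene representation fitted to the union of all local data.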

Paper Structure (11 sections, 6 equations, 7 figures, 3 tables, 1 algorithm)

Figures (7)

  • Figure 1: (a) Three robotic agents explore a large-scale scene, with each agent's trajectory shown in a different color. The mesh, generated from the NeRF model of agent 1 (red), is displayed. Our distributed learning drives the model of agent 1 to reach consensus with the models of agents 2 and 3. The model represents the entire scene, and a complete mesh can be generated from it including regions agent 1 did not visit (the two rooms on the right). (b) In our distributed learning approach, only network weights $\Theta_i$ are shared among the agents. Each agent minimizes the NeRF image reconstruction loss $L^{\text{img}}_i$ with mini-batch $R_i$ sampled from its local data $D_i$ while maintaining consensus with other agents. This enables the distributed learning of a comprehensive scene representation without transferring raw data, making it suitable for multi-agent systems with limited communication bandwidth.
  • Figure 2: Our multi-agent approach reconstructs the complex geometry of indoor scenes with comparable quality to the centralized method. In certain instances, the multi-agent approach captures high-frequency details more effectively than the centralized method, highlighted with black rectangles.
  • Figure 3: Comparison of scene reconstruction quality between centralized training and three-agent training. The multi-agent approach achieves performance comparable to the centralized method, with fewer rendering artifacts ("floaters"). These results highlight that our multi-agent learning method can handle larger scenes effectively and achieve a high level of reconstruction completeness.
  • Figure 4: Trajectories of the three agents in the experiment, each shown in a different color, simulating a scenario where a group of drones collaboratively explore an indoor environment.
  • Figure 5: Comparison of scene reconstruction quality between multi-agent training under varying communication frequencies and the centralized method. When agents communicate at every iteration (CoM Freq: 100%), multi-agent training achieves performance comparable to the centralized baseline. As communication frequency decreases, the quality of detailed geometry degrades, but the overall structure of the scene remains intact.
  • ...and 2 more figures
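
The weight-sharing scheme in Figure 1(b) and the communication-frequency comparison in Figure 5 can be illustrated with a toy sketch. This is not the paper's algorithm: plain local gradient steps followed by periodic weight averaging stand in for the distributed learning method, and a one-coordinate quadratic loss stands in for the NeRF image reconstruction loss. The function `train`, its parameters, and the three-number "scene" are all hypothetical names chosen for illustration.

```python
def train(targets, num_iters=400, lr=0.1, comm_every=1):
    """Toy consensus learning: agent i can only observe coordinate i of `targets`.

    Assumption: the "scene" is a short vector and each agent's local loss is
    (w[i] - targets[i]) ** 2. Real NeRF training is far more involved.
    """
    n = len(targets)
    # Every agent keeps a full copy of the scene parameters, initialised to zero.
    weights = [[0.0] * n for _ in range(n)]
    for it in range(1, num_iters + 1):
        # Local step: agent i descends its own loss, which only involves
        # the single coordinate it has data for.
        for i in range(n):
            grad = 2.0 * (weights[i][i] - targets[i])
            weights[i][i] -= lr * grad
        # Communication step: agents exchange weights (never raw data) and
        # replace their copy with the average over all agents.
        if it % comm_every == 0:
            mean = [sum(w[c] for w in weights) / n for c in range(n)]
            weights = [list(mean) for _ in range(n)]
    return weights


scene = [1.0, -2.0, 0.5]

# Communicating at every iteration: agent 0 recovers the whole scene,
# including the coordinates it never observed.
models = train(scene)
agent0_error = max(abs(a - b) for a, b in zip(models[0], scene))

# Communicating every 10th iteration: still converges, but more slowly.
sparse_models = train(scene, comm_every=10)
sparse_error = max(abs(a - b) for a, b in zip(sparse_models[0], scene))

# No communication at all: agent 0 never learns coordinates 1 and 2.
isolated_models = train(scene, comm_every=10**9)
```

In this toy setting, lowering the communication frequency slows convergence for a fixed iteration budget, while removing communication entirely leaves each agent with only the part of the scene it observed itself.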