Flying in Clutter on Monocular RGB by Learning in 3D Radiance Fields with Domain Adaptation

Xijie Huang; Jinhan Li; Tianyue Wu; Xin Zhou; Zhichao Han; Fei Gao

TL;DR

This work tackles autonomous UAV navigation in clutter using only monocular RGB input. Policies are learned in photorealistic 3D Gaussian Splatting (3DGS) environments, and the sim-to-real gap is bridged with adversarial domain adaptation (DA) and domain randomization (DR). The framework is an end-to-end RGB-based RL pipeline with an actor-critic architecture and a depth-privileged critic, paired with 3DGS rendering accelerated via pruning. The method demonstrates zero-shot transfer to real-world flights under varying obstacle layouts and illumination, supported by ablations and latent-space analyses that clarify the roles of DA and DR in reducing domain shift. The results indicate a practical pathway for monocular RGB navigation on lightweight UAVs and point toward scaling 3DGS-based training to diverse, large-scale datasets and ecosystem-level deployment.

Abstract

Modern autonomous navigation systems predominantly rely on lidar and depth cameras. However, a fundamental question remains: Can flying robots navigate in clutter using solely monocular RGB images? Given the prohibitive costs of real-world data collection, learning policies in simulation offers a promising path. Yet, deploying such policies directly in the physical world is hindered by the significant sim-to-real perception gap. Thus, we propose a framework that couples the photorealism of 3D Gaussian Splatting (3DGS) environments with Adversarial Domain Adaptation. By training in high-fidelity simulation while explicitly minimizing feature discrepancy, our method ensures the policy relies on domain-invariant cues. Experimental results demonstrate that our policy achieves robust zero-shot transfer to the physical world, enabling safe and agile flight in unstructured environments with varying illumination.
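
The abstract does not say which adversarial scheme is used to minimize feature discrepancy. As an illustration only, the sketch below assumes a gradient-reversal setup (in the style of domain-adversarial training): a domain classifier learns to tell simulated features from real ones, while the encoder receives the negated gradient so that its features become harder to tell apart. All names here (`domain_loss_and_feature_grads`, `reverse_gradient`, the scalar classifier parameters `v` and `c`, the weight `lam`) are hypothetical and are not taken from the paper.

```python
import math


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def domain_loss_and_feature_grads(z_sim, z_real, v, c):
    """Binary cross-entropy of a 1-D logistic domain classifier.

    Assumption: label 0 = simulated feature, label 1 = real feature.
    Returns (mean loss, d loss / d z for every feature, sim features first).
    """
    feats = [(z, 0.0) for z in z_sim] + [(z, 1.0) for z in z_real]
    n = len(feats)
    loss, grads = 0.0, []
    for z, label in feats:
        p = sigmoid(v * z + c)  # predicted probability that z is "real"
        loss += -(label * math.log(p) + (1.0 - label) * math.log(1.0 - p))
        # Chain rule: d loss / d logit = p - label, d logit / d z = v.
        grads.append((p - label) * v / n)
    return loss / n, grads


def reverse_gradient(grads, lam=1.0):
    """Gradient reversal: the encoder sees the negated, scaled gradient."""
    return [-lam * g for g in grads]


if __name__ == "__main__":
    loss, grads = domain_loss_and_feature_grads([0.2, 0.4], [1.1, 0.9], v=1.0, c=0.0)
    print("domain loss:", loss)
    print("gradient passed to encoder:", reverse_gradient(grads, lam=0.5))
```

In this sketch the classifier would descend on the domain loss, while an encoder stepping along the reversed gradient ascends it; a classifier that cannot separate the two domains sits at a loss of ln 2.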
