VisFly: An Efficient and Versatile Simulator for Training Vision-based Flight

Fanxing Li; Fangyu Sun; Tianbao Zhang; Danping Zou

TL;DR

VisFly is a quadrotor simulator designed to efficiently train vision-based flight policies with reinforcement learning algorithms; it supports training on diverse real-world environments simultaneously.

Abstract

We present VisFly, a quadrotor simulator designed to efficiently train vision-based flight policies using reinforcement learning algorithms. VisFly offers a user-friendly framework and interfaces, leveraging Habitat-Sim's rendering engines to achieve frame rates exceeding 10,000 frames per second for rendering motion and sensor data. The simulator incorporates differentiable physics and is wrapped as a Gym environment, which makes it straightforward to implement various learning algorithms. It supports directly importing open-source scene datasets compatible with Habitat-Sim, enabling training on diverse real-world environments simultaneously. To validate our simulator, we also provide three reinforcement learning examples for typical flight tasks that rely on visual observations. The simulator is now available at https://github.com/SJTU-ViSYS-team/VisFly.

Paper Structure (14 sections, 4 equations, 7 figures, 2 tables)

Introduction
Related Work
Full-stack simulators
Learning-specialized simulators
Methodology
Differentiable dynamics based on PyTorch
Efficient rendering and open-source scene management
Modular design for Gym integration
Domain randomization for sensor data and flight states
Training examples
Learning to navigate in a cluttered environment
Learning to cross a narrow gap cooperatively
Learning to land
Conclusion And Discussion

Figures (7)

Figure 1: Overall diagram of VisFly. Our system uses differentiable physics to drive single or multiple quadrotor agents through four different types of controllers and employs Habitat-Sim’s rendering engine for fast rendering and access to open-source datasets. All of these components are integrated into Gym environments, providing standard interfaces for various learning algorithms.
Figure 2: Response curves of position, orientation, linear velocity, and angular velocity to step signals. Left: position commands. Right: linear velocity commands. Different colors denote different initial states.
Figure 3: RGB, depth, and semantic views of an agent in open-source and customized scenes. The right sub-image illustrates a swarm of 100 quadrotors in a clear garage.
Figure 4: Frame rate performance of VisFly. Left: Tested with 100 agents in the Replica dataset, achieving up to 10,000 FPS. At 256$\times$256 resolution, the frame rate still reaches up to 6,000 FPS. Right: Variation in frame rate with the number of scenes running simultaneously at 64$\times$64 resolution.
Figure 5: Neural network architecture used across different tasks. The image feature extractor has been modified to be more customizable, supporting backbones such as ResNet [he2016deep], MobileNet [sandler2018mobilenetv2], and EfficientNet [tan2019efficientnet]. The policies for the three tasks follow this architecture with slight differences. VisFly additionally incorporates a recurrent network interface [chung2014empirical]. Detailed policy settings are described on the VisFly homepage.
...and 2 more figures
