Whole-Body Control Through Narrow Gaps From Pixels To Action

Tianyue Wu; Yeke Chen; Tianyang Chen; Guangyu Zhao; Fei Gao

TL;DR

A purely data-driven method that masters flight through body-size narrow gaps in simulation: a neural network directly maps pixels and proprioception to continuous low-level control commands, enabling whole-body control through gaps with different geometries that demand sharp attitude changes.

Abstract

Flying through body-size narrow gaps in the environment is one of the most challenging moments for an underactuated multirotor. We explore a purely data-driven method to master this flight skill in simulation, where a neural network directly maps pixels and proprioception to continuous low-level control commands. The learned policy enables whole-body control through gaps with different geometries that demand sharp attitude changes (e.g., a near-vertical roll angle). The policy is obtained through successive model-free reinforcement learning (RL) and online observation space distillation. The RL policy receives (virtual) point clouds of the gaps' edges for scalable simulation and is then distilled into the high-dimensional pixel space. However, this flight skill is fundamentally expensive to learn through exploration because the feasible solution space is restricted. To alleviate this problem, we propose resetting the agent to states on trajectories generated by a model-based trajectory optimizer. The presented training pipeline is compared with baseline methods, and ablation studies are conducted to identify the key ingredients of our method. The immediate next step is to scale up the variation of gap sizes and geometries in anticipation of emergent policies and to demonstrate sim-to-real transfer.

Paper Structure (21 sections, 1 equation, 9 figures, 2 tables)

Introduction
Related Work
End-to-end Policy Learning from Pixels for Mobile Robots
Flight Through Gaps with Underactuated Multirotors
Problem Statement: Whole-Body Control Through a Gap
Method: Learning Whole-Body Control From Pixels to Action
Online Reinforcement Learning from Gap Points
Observation and Action Space
Reward Function
Termination Condition
Policy Representation
Informed Reset
Online Distillation via Supervised Learning
Evaluation
Implementation and Setup
...and 6 more sections

Figures (9)

Figure 1: Snapshots of the quadrotor driven by the pixel-based policy in simulation. We choose a wall with a hole (i.e., the example in Fig. \ref{fig:method}) and two trees' trunks as masks (Sec. \ref{sec:distillation}), respectively, for the above contexts.
Figure 1: Geometry and sizes of different gaps expressed in 2D vertices coordinates.
Figure 2: Illustration of the problem of whole-body control through a gap.
Figure 3: The policy learning architecture employed in this paper.
Figure 4: Examples of gap points.
...and 4 more figures
