Seeing Through Pixel Motion: Learning Obstacle Avoidance from Optical Flow with One Camera

Yu Hu, Yuang Zhang, Yunlong Song, Yang Deng, Feng Yu, Linzuo Zhang, Weiyao Lin, Danping Zou, Wenxian Yu

TL;DR

This paper tackles obstacle avoidance for quadrotors using monocular optical flow, addressing the limitations of depth-based sensing and flow ambiguity. It introduces an end-to-end framework that maps optical flow to control via a differentiable simulator, augmented by central flow attention and action-guided active sensing, enabling agile flight at up to 6 m/s. A GPU-based differentiable simulator supports Backpropagation Through Time, allowing end-to-end policy optimization and zero-shot sim-to-real transfer, demonstrated on real FPV hardware. Results show robust high-speed navigation in unknown cluttered environments, while identifying remaining gaps due to optical-flow noise and rotational effects near the Focus of Expansion (FoE) that limit performance relative to depth-based approaches.

Abstract

Optical flow captures the motion of pixels in an image sequence over time, providing information about movement, depth, and environmental structure. Flying insects utilize this information to navigate and avoid obstacles, allowing them to execute highly agile maneuvers even in complex environments. Despite its potential, autonomous flying robots have yet to fully leverage this motion information to achieve comparable levels of agility and robustness. Challenges of control from optical flow include extracting accurate optical flow at high speeds, handling noisy estimation, and ensuring robust performance in complex environments. To address these challenges, we propose a novel end-to-end system for quadrotor obstacle avoidance using monocular optical flow. We develop an efficient differentiable simulator coupled with a simplified quadrotor model, allowing our policy to be trained directly through first-order gradient optimization. Additionally, we introduce a central flow attention mechanism and an action-guided active sensing strategy that enhance the policy's focus on task-relevant optical flow observations to enable more responsive decision-making during flight. Our system is validated both in simulation and the real world using an FPV racing drone. Despite being trained in a simple environment in simulation, our system demonstrates agile and robust flight in various unknown, cluttered environments in the real world at speeds of up to 6 m/s.

Paper Structure

This paper contains 18 sections, 6 equations, 10 figures.

Figures (10)

  • Figure 1: Our drone autonomously navigates in the cluttered environment using optical flow estimated from a single camera. (a) Overview of the testing environment and trajectory. (b) Our real-world offboard control system is equipped with an FPV camera, a wireless video transmitter, and a wireless data transmitter. (c) FPV images and optical flow estimations recorded during the flight.
  • Figure 2: System overview. We train our neural network policy using a differentiable simulator, which enables simulating quadrotor physics, rendering ground-truth optical flow, and calculating analytic policy gradients. We deploy our policy using a real FPV-style quadrotor in the real world. The reference acceleration output by the flight policy is sent into the inner-loop controller.
  • Figure 3: Our training environment vs Real-world testing environments. (a) Our training environment features objects with simple geometric shapes. (b) Sampled ground-truth optical flow for training. (c-d) Real-world testing environments. (e) Estimated optical flow in the real world. Flow-based representation captures essential information about motion and removes redundant information that might be irrelevant for obstacle avoidance.
  • Figure 4: Challenges in detecting obstacles from optical flow. Left: During rotation (b), the flow of obstacles can vanish, making it appear similar to the background flow in the pure translation scenario (a). Right: Flow values near the Focus of Expansion (FoE) are minimal, making it difficult to detect looming effects using local flow divergence with noisy estimation.
  • Figure 5: We combine a central flow attention mechanism (left) with active sensing (right) to extract task-relevant flow information while maintaining efficient computation.
  • ...and 5 more figures
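
The Figure 4 caption notes that flow values near the Focus of Expansion (FoE) are minimal, which makes looming hard to detect from noisy estimates. The sketch below illustrates that geometry with the textbook pinhole model for pure forward translation toward a fronto-parallel surface. It is not taken from the paper: the function names, the principal point at the image origin, and the numbers in the usage note are all assumptions chosen for illustration.

```python
import math


def translational_flow(x, y, depth, forward_speed):
    """Flow (u, v) at image offset (x, y) from the FoE.

    Assumes a pinhole camera moving straight along its optical axis,
    so the FoE coincides with the principal point at (0, 0).
    Units: x, y in pixels, depth in metres, speed in m/s, flow in px/s.
    """
    scale = forward_speed / depth  # inverse time-to-contact, in 1/s
    return x * scale, y * scale


def flow_magnitude(x, y, depth, forward_speed):
    """Length of the flow vector at (x, y), in px/s."""
    u, v = translational_flow(x, y, depth, forward_speed)
    return math.hypot(u, v)


def divergence(depth, forward_speed):
    """du/dx + dv/dy of the flow field above: 2 * speed / depth."""
    return 2.0 * forward_speed / depth


def time_to_contact(div):
    """Invert the divergence relation to recover depth / speed in seconds."""
    return 2.0 / div


if __name__ == "__main__":
    # Hypothetical scenario: 6 m/s toward a surface 3 m away.
    for offset_px in (0, 2, 20, 200):
        mag = flow_magnitude(offset_px, 0, depth=3.0, forward_speed=6.0)
        print(f"offset {offset_px:>3} px -> flow {mag:.1f} px/s")
    print("time to contact:", time_to_contact(divergence(3.0, 6.0)), "s")
```

In this hypothetical scenario the flow 2 px from the FoE is 4 px/s, versus 400 px/s at 200 px from the FoE. At an assumed 30 frames per second that is roughly 0.13 px per frame near the centre, so an estimator with sub-pixel noise can easily mask the looming signal exactly where the drone is heading, even though the divergence itself (2 × speed / depth) is the same everywhere on the surface.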