Deep Learning-based Robust Autonomous Navigation of Aerial Robots in Dense Forests

Guglielmo Del Col, Väinö Karjalainen, Teemu Hakala, Yibo Zhang, Eija Honkavaara

TL;DR

The paper tackles autonomous UAV navigation under GNSS-denied, dense forest conditions where perception is degraded and obstacles are thin or irregular. It introduces DeFoP, a vision-based system that fuses a semantically enhanced depth encoder, a neural Collision Prediction Network, a geometric safety supervisor, and depth refinement, all running onboard with TensorRT-accelerated inference. Real-world tests across three forest densities show DeFoP achieving 100% success in Difficult and Medium forests and 80% in Very Difficult forests, outperforming prior SEVAE-based methods and competing baselines. The work highlights the importance of safety-aware planning and depth-quality improvements for practical under-canopy autonomy and suggests avenues for multimodal perception and depth completion to further enhance robustness.

Abstract

Autonomous aerial navigation in dense natural environments remains challenging due to limited visibility, thin and irregular obstacles, GNSS-denied operation, and frequent perceptual degradation. This work presents an improved deep learning-based navigation framework that integrates semantically enhanced depth encoding with neural motion-primitive evaluation for robust flight in cluttered forests. Several modules are incorporated on top of the original sevae-ORACLE algorithm to address limitations observed during real-world deployment, including lateral control for sharper maneuvering, a temporal consistency mechanism to suppress oscillatory planning decisions, a stereo-based visual-inertial odometry solution for drift-resilient state estimation, and a supervisory safety layer that filters unsafe actions in real time. A depth refinement stage is included to improve the representation of thin branches and reduce stereo noise, while GPU optimization increases onboard inference throughput from 4 Hz to 10 Hz. The proposed approach is evaluated against several existing learning-based navigation methods under identical environmental conditions and hardware constraints. It demonstrates higher success rates, more stable trajectories, and improved collision avoidance, particularly in highly cluttered forest settings. The system is deployed on a custom quadrotor in three boreal forest environments, achieving fully autonomous completion in all flights in moderate and dense clutter, and 12 out of 15 flights in highly dense underbrush. These results demonstrate improved reliability and safety over existing navigation methods in complex natural environments.
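The abstract describes three cooperating pieces: a neural evaluation of candidate motion primitives, a supervisory safety layer that filters unsafe actions, and a temporal consistency mechanism that suppresses oscillating decisions. The sketch below shows one way such pieces could fit together. It is not the authors' implementation: the function name `select_primitive`, the additive cost, the thresholds `max_collision_prob` and `switch_margin`, and the hysteresis rule are all assumptions made for illustration.

```python
import numpy as np

def select_primitive(collision_prob, goal_cost, safe_mask, prev_idx=None,
                     max_collision_prob=0.3, switch_margin=0.1):
    """Pick a motion-primitive index, or None if no candidate is acceptable.

    collision_prob: per-primitive risk from a learned predictor (assumed input)
    goal_cost:      per-primitive cost of deviating from the goal (assumed input)
    safe_mask:      per-primitive result of a geometric safety check (assumed input)
    """
    collision_prob = np.asarray(collision_prob, dtype=float)
    goal_cost = np.asarray(goal_cost, dtype=float)
    safe_mask = np.asarray(safe_mask, dtype=bool)

    # Supervisory filter: a candidate must pass the geometric check
    # and stay under the (hypothetical) learned-risk threshold.
    allowed = safe_mask & (collision_prob <= max_collision_prob)
    if not allowed.any():
        return None  # the caller would brake or hover here

    # Combined cost; filtered-out candidates get infinite cost.
    cost = np.where(allowed, collision_prob + goal_cost, np.inf)
    best = int(np.argmin(cost))

    # Temporal consistency as simple hysteresis: keep the previous choice
    # unless the new best is better by at least switch_margin.
    if prev_idx is not None and allowed[prev_idx]:
        if cost[prev_idx] - cost[best] < switch_margin:
            return prev_idx
    return best
```

With risks `[0.05, 0.10, 0.60]` and goal costs `[0.30, 0.20, 0.00]`, the third primitive is filtered out despite its low goal cost, and a previously selected first primitive is retained because the second is better by only about 0.05. The hysteresis margin trades responsiveness for smoother trajectories, which is the kind of trade-off a temporal consistency mechanism has to make.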

Paper Structure

This paper contains 31 sections, 5 equations, 7 figures, 8 tables.

Figures (7)

  • Figure 1: Navigation and planning pipeline.
  • Figure 2: Example illustrating the effect of depth-map refinement. (a) Raw depth image acquired from the Intel RealSense D435i. Colors encode the distance to obstacles using a brown-scale colormap, where darker tones indicate closer objects. Black pixels correspond to undefined depth measurements, which occur predominantly along object boundaries and thin structures, impairing accurate estimation of trunk thickness and branch presence. (b) Refined depth map after applying the proposed depth improver. Undefined pixels in the vicinity of obstacles are reconstructed, while regions far from obstacles remain undefined. This selective refinement improves obstacle representation while avoiding the introduction of spurious structures, thereby supporting safer trajectory selection.
  • Figure 3: Example of autoencoder reconstruction. (a) Raw depth image. (b) Reconstructed depth image obtained after processing through the encoder–decoder convolutional layers. Notably, the semantically enhanced autoencoder is able to reconstruct the thin branch on the right side of the image almost completely, despite it being represented by only a small number of pixels in the input. This behavior is not typically observed in conventional autoencoders and highlights the effectiveness of the proposed semantic reconstruction strategy (Kulkarni et al., 2023).
  • Figure 4: Quadrotor Hardware Components.
  • Figure 5: Examples of vegetation in the three real-world environments.
  • ...and 2 more figures
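
The selective refinement described in the Figure 2 caption (undefined pixels near obstacles are reconstructed, those far from obstacles stay undefined) can be illustrated with a small NumPy sketch. This is an assumed stand-in, not the paper's depth improver: it encodes undefined depth as 0, treats any valid pixel closer than a hypothetical `near_limit` as an obstacle, and fills each undefined neighbour with the smallest nearby obstacle depth, which is a deliberately conservative choice.

```python
import numpy as np

def refine_depth(depth, radius=1, near_limit=3.0):
    """Fill undefined pixels (value 0) that lie next to a nearby obstacle.

    depth:      2-D array of depths in metres, 0 meaning undefined (assumption)
    radius:     half-width of the square search window in pixels (assumption)
    near_limit: depths at or below this count as obstacles (assumption)
    """
    depth = np.asarray(depth, dtype=float)
    h, w = depth.shape

    # Only valid, near pixels may act as a fill source; everything else is +inf.
    source = np.where((depth > 0) & (depth <= near_limit), depth, np.inf)
    padded = np.pad(source, radius, constant_values=np.inf)

    # Minimum over the (2*radius+1)^2 window, built from shifted views.
    nearest = np.full((h, w), np.inf)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            nearest = np.minimum(nearest, padded[dy:dy + h, dx:dx + w])

    # Fill only undefined pixels that have an obstacle inside the window.
    fill = (depth == 0) & np.isfinite(nearest)
    return np.where(fill, nearest, depth)
```

On a 3×5 test image with a single 1.5 m obstacle pixel and a distant 8.0 m pixel, only the undefined pixels adjacent to the 1.5 m pixel are filled; those next to the 8.0 m pixel remain 0. Taking the window minimum thickens thin structures slightly, which errs toward caution at the cost of some free space.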