UAS Visual Navigation in Large and Unseen Environments via a Meta Agent

Yuci Han, Charles Toth, Alper Yilmaz

TL;DR

This work addresses long-range, monocular-vision UAS navigation in large urban environments and the challenge of transferring learned policies to unseen areas. It introduces a two-stage meta-curriculum framework that first meta-trains a master policy over multiple tasks and then fine-tunes it through a hierarchical coarse-to-fine curriculum. The framework is complemented by Incremental Self-Adaptive Reinforcement Learning (ISAR), which speeds up learning for long-horizon tasks. ISAR combines an inner-episode interaction loss with an adaptive loss to perform incremental policy updates across short trajectory windows, and it is compatible with base RL algorithms such as A3C or PPO. Empirical results in the AirSim simulator show faster convergence and robust transfer to unseen environments, indicating reduced training cost and improved adaptability for real-world urban navigation tasks.

Abstract

The aim of this work is to develop an approach that enables Unmanned Aerial Systems (UAS) to efficiently learn to navigate in large-scale urban environments and to transfer their acquired expertise to novel environments. To achieve this, we propose a meta-curriculum training scheme. First, meta-training allows the agent to learn a master policy that generalizes across tasks. The resulting model is then fine-tuned on the downstream tasks. We organize the training curriculum hierarchically so that the agent is guided from coarse to fine towards the target task. In addition, we introduce Incremental Self-Adaptive Reinforcement Learning (ISAR), an algorithm that combines the ideas of incremental learning and meta-reinforcement learning (MRL). In contrast to traditional reinforcement learning (RL), which focuses on acquiring a policy for a specific task, MRL aims to learn a policy that transfers quickly to novel tasks. However, MRL training is time-consuming, whereas our proposed ISAR algorithm converges faster than the conventional MRL algorithm. We evaluate the proposed methodologies in simulated environments and demonstrate that using this training philosophy in conjunction with the ISAR algorithm significantly improves both the convergence speed for navigation in large-scale cities and the adaptation proficiency in novel environments.
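
The meta-training step described above can be illustrated with a toy sketch. According to the caption of Figure 3, the meta-loss is the sum of the per-task losses and is used to update the meta-agent $\psi$. Everything else below is an assumption made for illustration only: the scalar parameter, the quadratic per-task loss, the plain gradient-descent update, the learning rate, and the names `task_loss`, `task_grad`, and `meta_step` are hypothetical and are not taken from the paper.

```python
# Toy illustration: meta-loss as the sum of per-task losses (Figure 3).
# ASSUMPTIONS (not from the paper): psi is a single float, each task loss is
# a quadratic, and the update is plain gradient descent with a fixed step.

def task_loss(psi, target):
    """Hypothetical stand-in for L_T^i: squared distance to a task optimum."""
    return (psi - target) ** 2


def task_grad(psi, target):
    """Analytic gradient of the toy task loss with respect to psi."""
    return 2.0 * (psi - target)


def meta_step(psi, targets, lr=0.1):
    """Sum the task losses into a meta-loss and take one gradient step on psi.

    Returns the updated psi and the meta-loss evaluated before the update.
    """
    meta_loss = sum(task_loss(psi, t) for t in targets)
    meta_grad = sum(task_grad(psi, t) for t in targets)
    return psi - lr * meta_grad, meta_loss


psi = 0.0
targets = [1.0, 2.0, 3.0]  # three hypothetical meta-tasks
for _ in range(50):
    psi, loss = meta_step(psi, targets)
```

In this toy setting the parameter settles at the value that minimizes the summed loss (the mean of the three task optima), a compromise across tasks rather than the optimum of any single one. In the paper the policy is a neural network trained with RL, so this sketch shows only the aggregation of task losses, not the actual training procedure.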

Paper Structure

This paper contains 13 sections, 4 equations, 10 figures, 1 algorithm.

Figures (10)

  • Figure 1: Overview of the learning framework. The navigation task consists of two phases: meta-training and curriculum fine-tuning. Meta-training allows the agent to learn a master navigation policy. The hierarchically structured curriculum adapts the meta-policy to the target task. This meta-policy can further be transferred to novel environments.
  • Figure 2: Illustration of the AirSim simulated urban environment.
  • Figure 3: Illustration of the meta-training process. $\mathcal{L}_{T}^i$ is the loss for the $i^{th}$ meta-task. The meta-loss $\mathcal{L}_{meta}$ is the sum of the $\mathcal{L}_{T}^i$ and is used to update the meta-agent $\psi$.
  • Figure 4: Illustration of the obstacle distribution and of the observation at the same location at altitudes of 75 m, 45 m, and 15 m. The agent starts with the meta-policy and learns hierarchical policies from coarse to fine.
  • Figure 5: Illustration of the ISAR framework with adaptive-update step size = $2$. During exploration, two types of losses are computed: the interaction loss $\mathcal{L}_{int}$, which updates the interaction policy $f_\theta$, and the adaptive loss $\mathcal{L}_{adapt}$, which is computed on a trajectory segment of length $N$ and updates the adaptation policy $f_\phi$.
  • ...and 5 more figures
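
The segment-based update mentioned in the caption of Figure 5 can be sketched as follows. The caption states only that the adaptive loss is based on a trajectory segment of length $N$; the sketch assumes, purely for illustration, that segments are consecutive and non-overlapping and that one update is applied per segment. The function names `split_into_windows` and `update_per_window` and the callback interface are hypothetical and are not the authors' implementation.

```python
# Toy illustration: one update per short trajectory segment of length n.
# ASSUMPTIONS (not from the paper): segments are consecutive and
# non-overlapping, and a shorter final segment is kept rather than dropped.

def split_into_windows(trajectory, n):
    """Split a list of transitions into consecutive segments of length n."""
    if n <= 0:
        raise ValueError("segment length n must be positive")
    return [trajectory[i:i + n] for i in range(0, len(trajectory), n)]


def update_per_window(trajectory, n, update_fn):
    """Call update_fn once per segment and return how many updates were made."""
    updates = 0
    for window in split_into_windows(trajectory, n):
        update_fn(window)  # placeholder for a loss computation and policy update
        updates += 1
    return updates


# Usage: a 5-step episode with n = 2; integers stand in for transitions.
trajectory = list(range(5))
seen = []
n_updates = update_per_window(trajectory, 2, seen.append)
```

Updating after each short segment, rather than once at the end of a long episode, is what allows incremental updates within a single long-horizon trajectory. How the interaction loss and the adaptive loss are actually computed and combined is not specified in this excerpt.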