A Survey of Offline and Online Learning-Based Algorithms for Multirotor UAVs

Serhat Sönmez, Matthew J. Rutherford, Kimon P. Valavanis

TL;DR

This survey addresses robust multirotor UAV navigation and control under uncertainty by reviewing both offline and online learning-based algorithms across machine learning, deep learning, and reinforcement learning. It emphasizes online learning as a pathway to real-time adaptation, detailing a taxonomy of methods (value-function-based, policy-search-based, and actor-critic) and cataloging notable examples (e.g., Q-learning/DQN, PILCO, PPO, TD3, SAC, LSPI, DRL variants) along with vision-based DL controllers such as DroNet, ADRNet, and TrailNet. The paper provides four tables summarizing publications, tasks, and learning targets, and discusses practical considerations for hard real-time implementability. By consolidating state-of-the-art approaches and their real-time feasibility, it offers a practical roadmap for researchers and practitioners selecting learning-based UAV controllers for specific missions and hardware constraints. The findings underscore the potential of online learning to enable adaptation during flight and highlight gaps and opportunities for future work in real-time, data-driven UAV control systems.

Abstract

Multirotor UAVs are used for a wide spectrum of civilian and public domain applications. Navigation controllers endowed with different attributes, together with onboard sensor suites, enable autonomous or semi-autonomous multirotor flight that remains safe and functional under nominal and detrimental conditions and external disturbances, even in uncertain and dynamically changing environments. During the last decade, given the faster-than-exponential increase of available computational power, different learning-based algorithms have been derived, implemented, and tested to navigate and control multirotor UAVs, among other systems. Learning algorithms have been, and continue to be, used to derive data-driven models, to identify parameters, to track objects, to develop navigation controllers, and to learn the environment in which multirotors operate. Learning algorithms combined with model-based control techniques have proven beneficial when applied to multirotors. This survey summarizes research published since 2015, dividing algorithms, techniques, and methodologies into offline and online learning categories and then further classifying them into machine learning, deep learning, and reinforcement learning sub-categories. A central focus of the survey is online learning algorithms as applied to multirotors, with the aim of registering which learning techniques are hard or almost hard real-time implementable, and of understanding what information is learned, why, how, and how fast. The outcome of the survey offers a clear understanding of the recent state of the art and of the kinds of learning-based algorithms that may be implemented, tested, and executed in real time.
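The offline/online split described above can be illustrated with a small sketch. It is not taken from the survey: the toy 1-D environment, the function names (`step`, `q_update`, `learn_online`, `learn_offline`, `greedy`), and the hyperparameters are all assumptions chosen for illustration. The same tabular Q-learning update is applied either to transitions as they arrive from interaction (online) or to a fixed, pre-collected dataset (offline).

```python
import random

N_ACTIONS = 2  # hypothetical toy problem: 0 = move left, 1 = move right


def step(s, a):
    """Hypothetical 1-D environment with states 0..3; reaching state 3 pays reward 1."""
    s_next = min(3, s + 1) if a == 1 else max(0, s - 1)
    r = 1.0 if s_next == 3 else 0.0
    return r, s_next


def q_update(q, s, a, r, s_next, alpha=0.5, gamma=0.9):
    """One tabular Q-learning step: move Q(s, a) toward r + gamma * max_b Q(s', b)."""
    best_next = max(q.get((s_next, b), 0.0) for b in range(N_ACTIONS))
    old = q.get((s, a), 0.0)
    q[(s, a)] = old + alpha * (r + gamma * best_next - old)


def learn_online(episodes=200, max_steps=20, seed=0):
    """Online: update from each transition as it is generated by interaction."""
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        s = 0
        for _ in range(max_steps):
            a = rng.randrange(N_ACTIONS)  # purely random exploration (assumption)
            r, s_next = step(s, a)
            q_update(q, s, a, r, s_next)
            if s_next == 3:  # state 3 is terminal
                break
            s = s_next
    return q


def learn_offline(dataset, sweeps=50):
    """Offline: no interaction; repeatedly sweep a fixed set of logged transitions."""
    q = {}
    for _ in range(sweeps):
        for s, a, r, s_next in dataset:
            q_update(q, s, a, r, s_next)
    return q


def greedy(q, s):
    """Action with the highest learned value in state s."""
    return max(range(N_ACTIONS), key=lambda a: q.get((s, a), 0.0))


# A fixed log covering every non-terminal state-action pair once.
dataset = [(s, a, *step(s, a)) for s in range(3) for a in range(N_ACTIONS)]
q_online = learn_online()
q_offline = learn_offline(dataset)
```

In this toy setting both variants end up preferring the same action in every state. The difference is only in where the data comes from, which is the distinction the survey uses to organize the literature.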

Paper Structure (12 sections, 2 equations, 6 figures, 4 tables)

Figures (6)

  • Figure 1: Offline reinforcement learning block diagram [levine2020offline].
  • Figure 2: Online reinforcement learning block diagram [levine2020offline].
  • Figure 3: Off-policy RL algorithm configuration [levine2020offline].
  • Figure 4: Publications of online and offline learning algorithms for control of multirotor UAVs since 2015 based on Google Scholar search.
  • Figure 5: Block diagram and interaction between agent and environment [sutton2018reinforcement].
  • ...and 1 more figure