Variational Autoencoders for exteroceptive perception in reinforcement learning-based collision avoidance

Thomas Nakken Larsen; Eirik Runde Barlaug; Adil Rasheed

TL;DR

This paper addresses autonomous maritime navigation by integrating exteroceptive perception with deep reinforcement learning. It proposes using variational autoencoders to compress high-fidelity LiDAR-style range data into a low-dimensional latent feature that serves as input to a PPO-based DRL agent for simultaneous path following and collision avoidance. The study demonstrates that a shallow, pre-trained VAE encoder, coupled with circular padding in the decoder and β-regularization to avoid posterior collapse, improves path adherence and trajectory efficiency relative to a non-VAE baseline, while maintaining safe collision rates. These findings advance the practical deployment of VAE-augmented DRL in marine control systems by reducing input dimensionality and stabilizing latent representations, enabling robust exteroceptive perception in dynamic environments.
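
As a rough illustration of two ingredients named above, the following sketch shows circular padding of a 360-degree range scan and a β-weighted VAE objective in plain Python. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`circular_pad`, `gaussian_kl`, `beta_vae_loss`), the squared-error reconstruction term, and the diagonal-Gaussian posterior with a standard-normal prior are all assumptions made for illustration. The paper applies circular padding inside the decoder's transposed convolutions, whereas this sketch only shows the wrap-around padding step on a raw scan vector.

```python
import math


def circular_pad(scan, pad):
    """Wrap `pad` samples from each end of a 360-degree scan onto the other end.

    Because the first and last rays of a full-circle range sensor are
    neighbours, wrapping keeps obstacles that straddle the seam contiguous,
    which zero-padding would not.
    """
    if pad < 0 or pad > len(scan):
        raise ValueError("pad must be between 0 and len(scan)")
    if pad == 0:
        return list(scan)
    return list(scan[-pad:]) + list(scan) + list(scan[:pad])


def gaussian_kl(mu, log_var):
    """KL divergence from N(mu, diag(exp(log_var))) to N(0, I), summed over dims."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var))


def beta_vae_loss(x, x_hat, mu, log_var, beta):
    """Squared-error reconstruction term plus a beta-weighted KL term.

    The reconstruction term and the value of `beta` are illustrative
    assumptions; the source does not state them in this section.
    """
    recon = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    return recon + beta * gaussian_kl(mu, log_var)


if __name__ == "__main__":
    scan = [1.0, 2.0, 3.0, 4.0]          # hypothetical 4-ray range scan
    print(circular_pad(scan, 1))         # seam samples wrap to the other side
    print(beta_vae_loss([1.0], [0.0], [1.0], [0.0], beta=2.0))
```

The weight `beta` scales how strongly the latent distribution is pulled toward the prior, so it trades reconstruction fidelity against latent regularity.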

Abstract

Modern control systems are increasingly turning to machine learning algorithms to augment their performance and adaptability. Within this context, Deep Reinforcement Learning (DRL) has emerged as a promising control framework, particularly in the domain of marine transportation. Its potential for autonomous marine applications lies in its ability to combine path-following and collision avoidance with an arbitrary number of obstacles. However, when the searchable parameter space becomes large, current DRL algorithms need computational resources that are disproportionately large relative to the posed control problem in order to find near-optimal policies. To combat this, our work investigates the use of Variational Autoencoders (VAEs) to acquire a generalized, low-dimensional latent encoding of a high-fidelity range-finding sensor, which serves as the exteroceptive input to a DRL agent. The agent's path-following and collision-avoidance performance is systematically tested and evaluated within a stochastic simulation environment.

Paper Structure (27 sections, 11 equations, 10 figures, 4 tables)

Introduction
Preliminaries
Vessel dynamics
Marine guidance
Deep reinforcement learning
Variational autoencoders
Methodology
DRL for path following and collision avoidance
State and action spaces
Reward function
VAE-based range sensor encoding
Data generation and augmentation
VAE architecture
Circularly padded transposed convolution
Experimental setup and performance evaluation
...and 12 more sections

Figures (10)

Figure 1: Illustration of the inertial and body-fixed reference frames relevant for vessel dynamics.
Figure 2: Graphical representation of the guidance features relevant for path-following.
Figure 3: High-level illustration of the proposed VAE+DRL system architecture and training protocols.
Figure 4: Range-finding sensor suite and its mapping from circular sensor data to a perception vector.
Figure 5: VAE reconstruction comparison between zero-padding and our adapted padding approach.
...and 5 more figures
