Variational Autoencoders for exteroceptive perception in reinforcement learning-based collision avoidance
Thomas Nakken Larsen, Eirik Runde Barlaug, Adil Rasheed
TL;DR
This paper addresses autonomous maritime navigation by integrating exteroceptive perception with deep reinforcement learning. It proposes using variational autoencoders to compress high-fidelity LiDAR-style range data into a low-dimensional latent feature that serves as input to a PPO-based DRL agent for simultaneous path following and collision avoidance. The study demonstrates that a shallow, pre-trained VAE encoder, coupled with circular padding in the decoder and β-regularization to avoid posterior collapse, improves path adherence and trajectory efficiency relative to a non-VAE baseline, while maintaining safe collision rates. These findings advance the practical deployment of VAE-augmented DRL in marine control systems by reducing input dimensionality and stabilizing latent representations, enabling robust exteroceptive perception in dynamic environments.
Abstract
Modern control systems are increasingly turning to machine learning algorithms to augment their performance and adaptability. Within this context, Deep Reinforcement Learning (DRL) has emerged as a promising control framework, particularly in the domain of marine transportation. Its potential for autonomous marine applications lies in its ability to seamlessly combine path-following and collision avoidance with an arbitrary number of obstacles. However, when the searchable parameter space becomes large, current DRL algorithms need computational resources that are disproportionately large relative to the posed control problem in order to find near-optimal policies. To combat this, our work investigates the application of Variational AutoEncoders (VAEs) to acquire a generalized, low-dimensional latent encoding of a high-fidelity range-finding sensor, which serves as the exteroceptive input to a DRL agent. The agent's performance in path-following and collision avoidance is systematically tested and evaluated within a stochastic simulation environment.
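
As a rough illustration of two VAE ingredients named in the TL;DR, circular padding and β-regularization, the sketch below wraps a one-dimensional range scan so that its first and last beams become neighbours, and computes a β-weighted VAE loss. It is a minimal sketch and not the authors' implementation: the function names, the squared-error reconstruction term, the diagonal Gaussian posterior with a standard normal prior, and the assumption that the scan covers a full circle are all illustrative choices that the text above does not specify.

```python
import math


def circular_pad(scan, pad):
    """Wrap a 1D range scan so the first and last beams become neighbours.

    Assumes (hypothetically) that the scan covers a full 360 degrees.
    """
    if pad == 0:
        return list(scan)
    return list(scan[-pad:]) + list(scan) + list(scan[:pad])


def kl_to_standard_normal(mu, log_var):
    """KL divergence from N(mu, diag(exp(log_var))) to N(0, I)."""
    return 0.5 * sum(
        math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var)
    )


def beta_vae_loss(scan, reconstruction, mu, log_var, beta=1.0):
    """Squared reconstruction error plus a beta-weighted KL term.

    The squared-error term and the default beta are assumptions,
    not values taken from the paper.
    """
    recon = sum((x - y) ** 2 for x, y in zip(scan, reconstruction))
    return recon + beta * kl_to_standard_normal(mu, log_var)
```

For example, `circular_pad([1, 2, 3, 4, 5, 6], 2)` returns `[5, 6, 1, 2, 3, 4, 5, 6, 1, 2]`, so a convolution kernel sliding over the padded scan sees the wrap-around neighbours at both ends. In `beta_vae_loss`, `beta` scales how strongly the latent posterior is pulled toward the prior relative to the reconstruction error.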
