Optimizing Plastic Waste Collection in Water Bodies Using Heterogeneous Autonomous Surface Vehicles with Deep Reinforcement Learning

Alejandro Mendoza Barrionuevo; Samuel Yanes Luis; Daniel Gutiérrez Reina; Sergio L. Toral Marín

TL;DR

This paper addresses the challenge of locating and collecting floating plastic waste in aquatic environments. The authors propose a model-free deep reinforcement learning (DRL) framework for informative path planning with a heterogeneous fleet of autonomous surface vehicles (ASVs) divided into scouts and cleaners, coordinated through a shared trash model. The approach introduces a specialized state representation and a tailored reward design, and is implemented as two team-specific networks trained with Double Deep Q-Learning and prioritized replay. Results across two port-like scenarios show that DRL-based methods outperform heuristics, especially in complex layouts, and that training with greedy actions further improves performance, suggesting strong practical potential despite a higher inference cost.
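The Double Deep Q-Learning update with prioritized replay named in the summary can be illustrated with a minimal sketch. This is the generic textbook form of both techniques, not the authors' implementation: the function names `double_dqn_target` and `replay_priority`, the default values of `gamma`, `alpha`, and `eps`, and the example numbers are all assumptions made for illustration.

```python
def double_dqn_target(reward, done, q_online_next, q_target_next, gamma=0.99):
    """Double DQN learning target for a single transition.

    q_online_next and q_target_next are lists of Q-values for the next
    state, one entry per action, produced by the online and target
    networks respectively (hypothetical inputs for this sketch).
    """
    if done:
        # Terminal transition: no bootstrapping from the next state.
        return reward
    # The online network selects the action ...
    best_action = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    # ... and the target network evaluates it, which reduces overestimation.
    return reward + gamma * q_target_next[best_action]


def replay_priority(td_error, alpha=0.6, eps=1e-6):
    """Proportional prioritized-replay priority: (|delta| + eps) ** alpha."""
    return (abs(td_error) + eps) ** alpha


# Example with made-up numbers (not taken from the paper).
target = double_dqn_target(
    reward=1.0,
    done=False,
    q_online_next=[0.2, 0.9, 0.5],  # online network prefers action 1
    q_target_next=[0.3, 0.4, 0.8],  # target network values action 1 at 0.4
    gamma=0.5,
)
```

In this example the online network picks action 1, so the target bootstraps from the target network's value for that action (0.4) rather than from the target network's own maximum (0.8). In a two-network setup such as the one described here, each team's network would compute its own targets in this way.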

Abstract

This paper presents a model-free deep reinforcement learning framework for informative path planning with heterogeneous fleets of autonomous surface vehicles to locate and collect plastic waste. The system employs two teams of vehicles: scouts and cleaners. Coordination between these teams is achieved through a deep reinforcement learning approach, allowing agents to learn strategies that maximize cleaning efficiency. The primary objective is for the scout team to provide an up-to-date contamination model, while the cleaner team collects as much waste as possible by following this model. This strategy leads to heterogeneous teams that optimize fleet efficiency through inter-team cooperation supported by a tailored reward function. Several training variants of the proposed algorithm are compared with state-of-the-art heuristics in two distinct scenarios, one with high convexity and another with narrow corridors and challenging access. The results show that deep reinforcement learning based algorithms outperform the benchmark heuristics and exhibit superior adaptability. In addition, training with greedy actions further enhances performance, particularly in scenarios with intricate layouts.
