Learning Coverage Paths in Unknown Environments with Deep Reinforcement Learning
Arvi Jonnarth, Jie Zhao, Michael Felsberg
TL;DR
This work tackles online coverage path planning (CPP) in unknown environments by learning continuous control policies with deep reinforcement learning. It introduces egocentric, multi-scale frontier maps as the input representation and a novel total variation (TV) reward term that promotes complete coverage and discourages leaving uncovered holes. The policy is implemented with a scale-grouped CNN and trained with soft actor-critic (SAC). The approach outperforms previous RL-based methods and specialized CPP algorithms across exploration and lawn mowing variations, while remaining robust to sensor noise and scaling to large environments. The method learns mapping, planning, and navigation end to end, with practical implications for autonomous robots operating in environments that are not known in advance.
Abstract
Coverage path planning (CPP) is the problem of finding a path that covers the entire free space of a confined area, with applications ranging from robotic lawn mowing to search-and-rescue. When the environment is unknown, the path needs to be planned online while mapping the environment, which cannot be addressed by offline planning methods that do not allow for a flexible path space. We investigate how suitable reinforcement learning is for this challenging problem, and analyze the involved components required to efficiently learn coverage paths, such as action space, input feature representation, neural network architecture, and reward function. We propose a computationally feasible egocentric map representation based on frontiers, and a novel reward term based on total variation to promote complete coverage. Through extensive experiments, we show that our approach surpasses the performance of both previous RL-based approaches and highly specialized methods across multiple CPP variations.
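To illustrate why a total variation term can promote complete coverage, the following is a minimal sketch in Python with NumPy. It is not necessarily the exact formulation used in the paper: the function names `total_variation` and `incremental_tv_reward`, the anisotropic (sum of absolute neighbor differences) definition, and the `weight` parameter are assumptions made for illustration. The idea is that the total variation of a binary coverage map grows with the length of the boundary between covered and uncovered cells, so a reward that penalizes increases in total variation discourages ragged coverage and rewards filling in holes.

```python
import numpy as np


def total_variation(coverage: np.ndarray) -> float:
    """Anisotropic total variation of a 2D coverage map.

    Assumption: `coverage` holds values in [0, 1], where 1 means covered.
    The result is the sum of absolute differences between vertically and
    horizontally adjacent cells, i.e. the boundary length for a binary map.
    """
    c = coverage.astype(float)
    dy = np.abs(np.diff(c, axis=0)).sum()  # differences between rows
    dx = np.abs(np.diff(c, axis=1)).sum()  # differences between columns
    return float(dx + dy)


def incremental_tv_reward(prev: np.ndarray, curr: np.ndarray, weight: float = 1.0) -> float:
    """Hypothetical reward term: negative change in total variation.

    Positive when the coverage boundary shrinks (e.g. a hole is filled),
    negative when the boundary grows.
    """
    return -weight * (total_variation(curr) - total_variation(prev))


# A 3x3 covered block inside a 5x5 map, once with a hole and once without.
solid = np.zeros((5, 5))
solid[1:4, 1:4] = 1.0
holed = solid.copy()
holed[2, 2] = 0.0

tv_solid = total_variation(solid)
tv_holed = total_variation(holed)
reward_fill_hole = incremental_tv_reward(holed, solid)
```

In this toy example the map with a hole has a longer covered/uncovered boundary than the solid block, so covering the last missing cell yields a positive reward under this sketch. In practice such a term would be combined with other reward terms, such as one for newly covered area.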
