Autonomous Vehicle Patrolling Through Deep Reinforcement Learning: Learning to Communicate and Cooperate

Chenhao Tong; Maria A. Rodriguez; Richard O. Sinnott

TL;DR

This work tackles autonomous multi-agent patrolling under environmental dynamics and potential agent failures by enabling agents to learn to communicate and coordinate via reinforced inter-agent learning (RIAL) within a MAPPO framework. The approach decomposes rewards into patrolling, battery management, and collision avoidance, with a curriculum that gradually scales coordination complexity and a battery hot-swapping mechanism to sustain operation. Empirical results on four patrolling maps with 1–5 agents demonstrate that the proposed RL-MSG method outperforms baselines in patrol efficiency, safety, and fault tolerance, while maintaining low battery failure rates. Overall, the framework provides a scalable, robust solution for real-world autonomous patrolling tasks that require dynamic cooperation and communication among multiple agents.

Abstract

Autonomous vehicles are well suited to continuous area patrolling problems. Finding an optimal patrolling strategy can be challenging because of unknown environmental factors, such as wind or landscape, or because of constraints on the vehicles themselves, such as limited battery life or hardware failures. Importantly, patrolling large areas often requires multiple agents to coordinate their actions. However, an optimal coordination strategy is often non-trivial to define manually because of the complex nature of patrolling environments. In this paper, we consider a patrolling problem with environmental factors, agent limitations, and three typical cooperation problems: collision avoidance, congestion avoidance, and patrolling target negotiation. We propose a multi-agent reinforcement learning solution based on a reinforced inter-agent learning (RIAL) method. With this approach, agents are trained to develop their own communication protocol so that they can cooperate during patrolling in settings where faults can and do occur. The solution is validated through simulation experiments and compared with several state-of-the-art patrolling solutions from different perspectives, including overall patrol performance, collision avoidance performance, the efficiency of battery recharging strategies, and overall fault tolerance.

Paper Structure (18 sections, 6 equations, 6 figures, 11 tables)

Introduction
Related Work
Problem Modelling
Methods
System Architecture
Reward Function ($R$)
Patrolling performance ($R_p$)
Battery usage ($R_b$)
Collision Avoidance ($R_c$)
The Learning Algorithm
Performance evaluation
Model Training
Battery Recharging Performance Evaluation
Patrolling Performance Evaluation
Collision/Congestion Avoidance Performance Evaluation
...and 3 more sections

Figures (6)

Figure 1: Examples of multi-agent cooperation problems. Each circle represents an agent, and each arrow represents that agent's desired movement.
Figure 2: An example grid map (a), its corresponding matrix expression (b) from agent A's perspective, its priority matrix (c), and its idleness matrix at timestep 0 (d).
Figure 3: Function plot of $f(i) = -e^{-\frac{i}{10}} + 1$, $i \in [0,100]$, $c_{norm} = 10$
Figure 4: Architecture of the Actor Network and the Critic Network in MAPPO. Arrows indicate the direction of data flow.
Figure 5: Four patrolling maps. The numbers represent the priorities of vertices, while the un-numbered vertices have priority 0 (normal priority).
...and 1 more figure
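
The function plotted in Figure 3 can be evaluated directly. The following is a minimal sketch, assuming the general form $f(i) = 1 - e^{-i / c_{norm}}$ with $c_{norm} = 10$ as given in the caption. The function name `saturating_norm` is hypothetical, and the caption does not say what $i$ measures, so no interpretation of the input is implied here.

```python
import math


def saturating_norm(i: float, c_norm: float = 10.0) -> float:
    """Evaluate f(i) = 1 - exp(-i / c_norm), the curve from the Figure 3 caption.

    The name and signature are illustrative assumptions, not the paper's API.
    """
    if c_norm <= 0:
        raise ValueError("c_norm must be positive")
    return 1.0 - math.exp(-i / c_norm)


# Sample the curve over the interval shown in the caption, i in [0, 100].
samples = [(i, saturating_norm(i)) for i in range(0, 101, 10)]
for i, value in samples:
    print(f"f({i}) = {value:.4f}")
```

The curve starts at 0 for $i = 0$, rises steeply for small $i$, and flattens towards 1 as $i$ grows, so large inputs are compressed into a bounded range.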
