Navigating the Smog: A Cooperative Multi-Agent RL for Accurate Air Pollution Mapping through Data Assimilation

Ichrak Mokhtari; Walid Bechkit; Mohamed Sami Assenine; Hervé Rivano

TL;DR

The paper tackles real-time air pollution mapping by optimizing where UAVs take measurements, so that Data Assimilation (DA) improves without access to ground-truth data. It introduces a cooperative Multi-Agent Reinforcement Learning (MARL) framework in which autonomous drones act as agents in a Markov Game, using independent Q-learning with a shared state and team rewards. Two credit assignment schemes are considered: equal split and difference rewards; the latter is implemented via a C51 distributional DQN. The DA objective is expressed through a team reward $R_t = \mathbb{E}[x^{a}_{t-1} - x^{a}_{t}]$, which guides drones to collect informative measurements and accelerates convergence of the analysis state $x^{a}$ toward the true state. The formulation relies on Best Linear Unbiased Estimator (BLUE) modeling and a realistic observation structure. Experiments on the FFT07 real-world dataset demonstrate substantial DA improvements: the difference-rewards MARL variant performs close to ground-truth-guided baselines and remains robust under varying budgets and simulation quality. The work highlights scalable autonomous drone coordination for accurate pollution mapping and points to extensions to unsteady dispersion and other environmental monitoring challenges.
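To make the BLUE step mentioned above concrete, here is a minimal NumPy sketch of the textbook analysis update, $x^{a} = x^{b} + K(y - Hx^{b})$ with gain $K = BH^{\top}(HBH^{\top} + R)^{-1}$. The names `x_b`, `B`, `H`, `R`, and `y` follow standard data assimilation notation and are assumptions here, as is the three-cell toy grid; this is not the paper's implementation or its dispersion model.

```python
import numpy as np

def blue_update(x_b, B, y, H, R):
    """One textbook BLUE analysis step (illustrative sketch only).

    x_b: background state, shape (n,)
    B:   background error covariance, shape (n, n)
    y:   observations, shape (m,)
    H:   linear observation operator, shape (m, n)
    R:   observation error covariance, shape (m, m)
    """
    innovation = y - H @ x_b              # mismatch between data and background
    S = H @ B @ H.T + R                   # innovation covariance
    # Gain K = B H^T S^{-1}, computed with a linear solve instead of an inverse.
    K = np.linalg.solve(S.T, (B @ H.T).T).T
    x_a = x_b + K @ innovation            # analysis state
    A = (np.eye(x_b.size) - K @ H) @ B    # analysis error covariance
    return x_a, A

# Toy example (hypothetical numbers): three grid cells, one drone reading at cell 1.
x_b = np.array([1.0, 2.0, 3.0])
B = np.eye(3)
H = np.array([[0.0, 1.0, 0.0]])
R = np.array([[1.0]])
y = np.array([4.0])

x_a, A = blue_update(x_b, B, y, H, R)
```

With equal background and observation variances, the observed cell moves halfway toward the measurement and its error variance is halved. Because `B` is diagonal in this toy case, the unobserved cells stay unchanged; a background covariance with spatial correlations would spread the correction to neighbouring cells.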

Abstract

The rapid rise of air pollution events necessitates accurate, real-time monitoring for informed mitigation strategies. Data Assimilation (DA) methods provide promising solutions, but their effectiveness hinges heavily on optimal measurement locations. This paper presents a novel approach for air quality mapping where autonomous drones, guided by a collaborative multi-agent reinforcement learning (MARL) framework, act as airborne detectives. Ditching the limitations of static sensor networks, the drones engage in a synergistic interaction, adapting their flight paths in real time to gather optimal data for Data Assimilation (DA). Our approach employs a tailored reward function with dynamic credit assignment, enabling drones to prioritize informative measurements without requiring unavailable ground truth data, making it practical for real-world deployments. Extensive experiments using a real-world dataset demonstrate that our solution achieves significantly improved pollution estimates, even with limited drone resources or limited prior knowledge of the pollution plume. Beyond air quality, this solution unlocks possibilities for tackling diverse environmental challenges like wildfire detection and management through scalable and autonomous drone cooperation.
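The two credit assignment schemes named in the TL;DR can be illustrated with a small sketch. Equal split divides the team reward evenly, while difference rewards (in their usual formulation) give each agent the team reward minus the reward the team would have obtained without that agent. The toy team reward `distinct_cells`, which counts the distinct grid cells measured, is a hypothetical stand-in chosen for readability; it is not the paper's DA-based reward, and all function names are assumptions.

```python
from typing import Callable, List, Sequence

def equal_split(team_reward: float, n_agents: int) -> List[float]:
    """Give every agent the same share of the team reward."""
    return [team_reward / n_agents] * n_agents

def difference_rewards(
    actions: Sequence[int],
    team_reward_fn: Callable[[Sequence[int]], float],
) -> List[float]:
    """Credit agent i with G(all agents) - G(all agents except i)."""
    full = team_reward_fn(actions)
    rewards = []
    for i in range(len(actions)):
        # Counterfactual: recompute the team reward with agent i removed.
        without_i = [a for j, a in enumerate(actions) if j != i]
        rewards.append(full - team_reward_fn(without_i))
    return rewards

def distinct_cells(cells: Sequence[int]) -> float:
    """Hypothetical team reward: number of distinct cells measured."""
    return float(len(set(cells)))

# Two drones sample the same cell (3); a third samples a new cell (7).
cells = [3, 3, 7]
split = equal_split(distinct_cells(cells), len(cells))
diff = difference_rewards(cells, distinct_cells)
```

In this toy case equal split pays each drone 2/3, including the two redundant ones, whereas difference rewards assign 0 to each drone that duplicated a measurement and 1 to the drone that contributed a new cell. This is the intuition for why a counterfactual credit signal can push agents toward complementary measurement locations.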

Paper Structure (23 sections, 11 equations, 4 figures, 2 tables)

Introduction
Background
Data Assimilation
Best Linear Unbiased Estimator
Multi-Agent Reinforcement Learning
DRL-based UAVs Path Planning for Data Assimilation
Problem Description
Model Framework as Markov Games
Set of Agents
State
Action
State Transition
Reward Function
MARL Learning Design
Air Pollution Considerations
...and 8 more sections

Figures (4)

Figure 1: Illustration of the general data assimilation problem using a cooperative fleet of UAVs.
Figure 2: Agent networks and reward structure of the proposed MARL framework.
Figure 3: Episodic mean ± std of the total mean absolute error for one and two agents, over 5 random runs and $10^5$ training steps, with evaluation performed every 1000 steps.
Figure 4: MAE obtained using 2 agents considering different budgets for 5 random runs. Results are presented on a logarithmic scale.
