A Reinforcement Learning Approach to Quiet and Safe UAM Traffic Management

Surya Murthy; John-Paul Clarke; Ufuk Topcu; Zhenyu Gao

TL;DR

The paper tackles quiet and safe urban air mobility traffic management by formulating a multi-agent MDP with altitude-change actions and a reward that combines $R_{noise}$ and $R_{separation}$. It uses the attention-enhanced D2MAV-A policy, trained with PPO in a centralized-to-decentralized framework, in BlueSky simulations with NASA RVLT-based noise data. The contributions include a joint noise-safety objective, a scalable RL framework for real-time control, and empirical insights from a South Austin network showing the tradeoffs between altitude distribution, congestion, loss-of-separation (LOS) events, and ground noise. The findings demonstrate that altitude-based adjustments can balance environmental and safety goals, informing policy design for practical, city-scale UAM deployment.
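To make the joint objective concrete, the sketch below blends the two reward terms with a single tradeoff weight. The summary names $R_{noise}$ and $R_{separation}$ and a tradeoff hyperparameter, but it does not give the exact combination rule, so the convex form, the function name `combined_reward`, and the parameter name `alpha` are assumptions made only for illustration.

```python
def combined_reward(r_noise: float, r_separation: float, alpha: float) -> float:
    """Blend a noise reward and a separation reward with one tradeoff weight.

    Assumption: a convex combination, where alpha = 0 ignores noise entirely
    and larger alpha puts more emphasis on noise. The paper's actual reward
    may be combined differently.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * r_noise + (1.0 - alpha) * r_separation


if __name__ == "__main__":
    # Hypothetical per-step penalties (negative rewards) for one aircraft.
    for alpha in (0.0, 0.5, 0.9):
        print(alpha, combined_reward(r_noise=-2.0, r_separation=-4.0, alpha=alpha))
```

Under this assumed form, sweeping `alpha` upward shifts the agent's incentive from avoiding separation losses toward reducing ground noise, which is the kind of tradeoff the results describe.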

Abstract

Urban air mobility (UAM) is a transformative system that operates various small aerial vehicles in urban environments to reshape urban transportation. However, integrating UAM into existing urban environments presents a variety of complex challenges. Recent analyses of UAM's operational constraints highlight aircraft noise and system safety as key hurdles to UAM system implementation. Future UAM air traffic management schemes must ensure that the system is both quiet and safe. We propose a multi-agent reinforcement learning approach to manage UAM traffic, aiming at both vertical separation assurance and noise mitigation. Through extensive training, the reinforcement learning agent learns to balance the two primary objectives by employing altitude adjustments in a multi-layer UAM network. The results reveal the tradeoffs among noise impact, traffic congestion, and separation. Overall, our findings demonstrate the potential of reinforcement learning in mitigating UAM's noise impact while maintaining safe separation using altitude adjustments.
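As a rough illustration of altitude adjustments in a multi-layer network, the sketch below models a discrete action that moves an aircraft one layer down, keeps it in place, or moves it one layer up. The three-action set, the layer indexing, the clamping at the lowest and highest layers, and all names (`AltitudeAction`, `apply_action`) are assumptions for illustration; the abstract does not specify the action set or the number of layers.

```python
from enum import IntEnum


class AltitudeAction(IntEnum):
    """Hypothetical discrete altitude-change actions (not taken from the paper)."""
    DESCEND = -1
    HOLD = 0
    CLIMB = 1


def apply_action(layer: int, action: AltitudeAction, num_layers: int) -> int:
    """Return the new layer index after an action, clamped to valid layers.

    Layers are indexed 0 (lowest) to num_layers - 1 (highest). An action that
    would leave this range keeps the aircraft in its current boundary layer.
    """
    if not 0 <= layer < num_layers:
        raise ValueError("layer index out of range")
    return min(max(layer + int(action), 0), num_layers - 1)


if __name__ == "__main__":
    # Hypothetical three-layer network: climb twice from the lowest layer.
    layer = 0
    for action in (AltitudeAction.CLIMB, AltitudeAction.CLIMB, AltitudeAction.CLIMB):
        layer = apply_action(layer, action, num_layers=3)
        print(action.name, "->", layer)
```

Because higher layers put more distance between the aircraft and the ground, a noise-focused policy in this kind of setup would tend to favor `CLIMB`, at the cost of crowding the top layer.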

Paper Structure (19 sections, 11 equations, 6 figures, 1 table)

This paper contains 19 sections, 11 equations, 6 figures, 1 table.

Introduction
Background and Literature Review
Safe Operations in Aviation
Urban Air Mobility Noise Modeling and Mitigation
Methodology
Urban Air Mobility Network and Operations
Aircraft Noise Model
Urban Air Mobility Network as a Markov Decision Process
State and Action Space
Reward Function
Reinforcement Learning Model
Experiments
The South Austin Case Study
Simulation
Results
...and 4 more sections

Figures (6)

Figure 1: The configuration of the NASA RVLT vehicle and an example of its noise simulation (sources: rizzi2022prediction, rizzi2023modeling)
Figure 2: NPD data for the NASA RVLT quadrotor vehicle: normal distance scale (left) and log distance scale with the fitted model (right)
Figure 3: The South Austin UAM network in the case study
Figure 4: Air traffic altitude distribution plots with varying tradeoff hyperparameter values from 0 to 0.9. Increasing emphasis on noise leads to increased traffic congestion at the highest altitudes.
Figure 5: Cumulative noise increase over ambient levels per experiment group. We observe that increasing emphasis on noise in the reward function leads to a decrease in noise impact.
...and 1 more figure
