Deep Reinforcement Learning for Real-Time Ground Delay Program Revision and Corresponding Flight Delay Assignments

Ke Liu; Fan Hu; Hui Lin; Xi Cheng; Jianan Chen; Jilin Song; Siyuan Feng; Gaofeng Su; Chen Zhu

TL;DR

The paper addresses the optimization of Ground Delay Programs (GDPs) under uncertainty in the National Airspace System (NAS). It applies two offline reinforcement learning approaches, Behavioral Cloning (BC) and Conservative Q-Learning (CQL), within a time-sequential GDP simulation called SAGDP_ENV, using 2019 data from Newark Liberty International Airport (EWR) to adjust GDP parameters. The reward combines ground delays ($GD_{t+i}$), airborne delays ($AD_{t+i}$), and terminal-area congestion, weighted by fixed costs ($c_{gnd}=1$, $c_{air}=2.5$, $p=10$), over a horizon of $n=8$ intervals. The agents struggled to learn, which is attributed to oversimplified environment modeling and data limitations; the results point to the need for more faithful weather integration and a broader GDP parameterization before practical benefits in ATM can be realized.
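To make the reward description concrete, the following is a minimal sketch of one way such a reward could be computed. The cost values $c_{gnd}=1$, $c_{air}=2.5$, $p=10$ and the horizon $n=8$ come from the summary above; everything else is an assumption made for illustration: the linear additive form, the negation (reward as negative cost), the indexing of the look-ahead intervals from 0 to $n-1$, the choice of a per-interval congestion count, and the names `RewardWeights` and `gdp_reward`. The paper's exact formula may differ.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class RewardWeights:
    """Cost coefficients quoted in the summary."""
    c_gnd: float = 1.0   # cost per unit of ground delay
    c_air: float = 2.5   # cost per unit of airborne delay
    p: float = 10.0      # penalty per unit of terminal-area congestion


def gdp_reward(
    ground_delay: Sequence[float],
    air_delay: Sequence[float],
    congestion: Sequence[float],
    weights: RewardWeights = RewardWeights(),
    horizon: int = 8,
) -> float:
    """Return the negative weighted cost over the next `horizon` intervals.

    ASSUMPTION: the three terms are summed linearly and the reward is the
    negated total cost. Element i of each sequence is the hypothetical
    value for look-ahead interval i (GD_{t+i}, AD_{t+i}, congestion).
    """
    shortest = min(len(ground_delay), len(air_delay), len(congestion))
    if shortest < horizon:
        raise ValueError("each input needs at least `horizon` entries")

    total_cost = 0.0
    for i in range(horizon):
        total_cost += (
            weights.c_gnd * ground_delay[i]
            + weights.c_air * air_delay[i]
            + weights.p * congestion[i]
        )
    return -total_cost
```

Under these assumptions, a unit of airborne delay costs 2.5 times as much as a unit of ground delay, so an agent is pushed to absorb delay on the ground, and the large congestion penalty discourages program rates that overload the terminal area.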

Abstract

This paper explores the optimization of Ground Delay Programs (GDP), a prevalent Traffic Management Initiative used in Air Traffic Management (ATM) to reconcile capacity and demand discrepancies at airports. To manage the inherent uncertainties in the national airspace system, such as weather variability, fluctuating flight demand, and airport arrival rates, we developed two Reinforcement Learning (RL) models: Behavioral Cloning (BC) and Conservative Q-Learning (CQL). These models are designed to improve GDP efficiency through a reward function that integrates ground delays, airborne delays, and terminal-area congestion. We constructed a simulated single-airport environment, SAGDP_ENV, which incorporates real operational data along with predicted uncertainties to support realistic decision-making scenarios. Using data from all of 2019 for Newark Liberty International Airport (EWR), our models aimed to set airport program rates preemptively. Despite thorough modeling and simulation, initial outcomes indicated that the models struggled to learn effectively, possibly because of oversimplified environmental assumptions. This paper discusses the challenges encountered, evaluates the models' performance against actual operational data, and outlines future directions for refining RL applications in ATM.
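As an illustration of the two offline RL ideas named in the abstract, here is a minimal tabular sketch rather than the paper's implementation. Behavioral cloning is reduced to its simplest discrete form (pick the action that the logged data most often took in each state), and the CQL part shows only the standard conservative penalty for discrete actions, $\log\sum_a \exp Q(s,a) - Q(s,a_{data})$, which is added to the usual Bellman error in CQL. The state labels, the program-rate values, and the function names `fit_bc_policy` and `cql_penalty` are hypothetical.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, Hashable, Iterable, Tuple


def fit_bc_policy(
    dataset: Iterable[Tuple[Hashable, Hashable]],
) -> Dict[Hashable, Hashable]:
    """Tabular behavioral cloning.

    For each state seen in the logged (state, action) pairs, return the
    action chosen most often. This is the greedy version of the
    maximum-likelihood policy when states and actions are discrete.
    """
    counts: Dict[Hashable, Counter] = defaultdict(Counter)
    for state, action in dataset:
        counts[state][action] += 1
    return {s: c.most_common(1)[0][0] for s, c in counts.items()}


def cql_penalty(q_values: Dict[Hashable, float], data_action: Hashable) -> float:
    """Conservative penalty for one state with discrete actions.

    Computes logsumexp_a Q(s, a) - Q(s, data_action). Minimizing it pushes
    down Q-values of actions absent from the data relative to the logged
    action. The value is always >= 0.
    """
    m = max(q_values.values())  # subtract the max for numerical stability
    lse = m + math.log(sum(math.exp(q - m) for q in q_values.values()))
    return lse - q_values[data_action]


# Hypothetical logged decisions: (weather label, program rate in arrivals/hour)
logged = [
    ("low_visibility", 30),
    ("low_visibility", 30),
    ("low_visibility", 36),
    ("clear", 44),
]
policy = fit_bc_policy(logged)
penalty = cql_penalty({30: 0.0, 36: 0.0}, data_action=30)
```

A cloned policy of this kind can only reproduce past operator decisions, whereas the CQL penalty lets a value-based agent deviate from them while staying cautious about actions the data does not support.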

Paper Structure (7 sections, 6 figures)

Introduction
Experiment Setup
Data Sources and Feature Extraction
Scenario Generation
TFM Agent
Experiment Results
Conclusions

Figures (6)

Figure 1: Statistics for GDP operation in 2019
Figure 2: Decision Process for GDP Initiation and Revisions.
Figure 3: Performance of BC agent.
Figure 4: Performance of CQL agent.
Figure 5: Performance of BC agent (n_iter = 1000).
...and 1 more figure
