Deciphering Air Travel Disruptions: A Machine Learning Approach
Aravinda Jatavallabha, Jacob Gerlach, Aadithya Naresh
TL;DR
This study investigates predicting individual flight delay components with machine learning, comparing time-series models (LSTM, BiLSTM, LSTM-CNN) against baseline regressors on the DOT BTS dataset (2019–2023). Predictions are evaluated with mean absolute error (MAE) and mean squared error (MSE). The time-series approaches yield modest gains, but accuracy for ARR_DELAY remains limited because of skewed delay distributions and pandemic-related disruptions. The study emphasizes model explainability through component-level predictions and shows that forecasting aviation delays reliably remains difficult. The results inform aviation operations and planning by indicating how predictable specific delay sources are and how much value time-series modeling may add to proactive flight scheduling.
Abstract
This research investigates flight delay trends by examining factors such as departure time, airline, and airport. It uses regression models to predict how much each delay source contributes to a flight's delay. Time-series models, including LSTM, Hybrid LSTM, and Bi-LSTM, are compared with baseline regression models: Multiple Regression, Decision Tree Regression, Random Forest Regression, and a Neural Network. Although the baseline models show considerable errors, the study aims to identify the features that most influence delay prediction, which could inform flight planning strategies. Unlike previous work, this research focuses on regression tasks and explores the use of time-series models for predicting flight delays. It offers insights into aviation operations by analyzing each delay component (e.g., security, weather) independently.
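As a rough illustration of why skewed delay distributions make regression errors hard to interpret, the sketch below computes MAE and MSE, the two metrics named in the TL;DR, on a small made-up sample. The delay values and the function names `mae` and `mse` are assumptions for illustration only; they are not taken from the DOT BTS data or from the authors' code.

```python
def mae(y_true, y_pred):
    """Mean absolute error: average of |actual - predicted|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean squared error: average of (actual - predicted)^2."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical arrival delays in minutes: mostly small, one extreme value.
actual = [0, 0, 5, 10, 185]
# Hypothetical predictions that stay close to the typical delay.
predicted = [5, 5, 5, 10, 10]

print("MAE:", mae(actual, predicted))  # 37.0
print("MSE:", mse(actual, predicted))  # 6135.0
```

In this toy sample, the single 185-minute delay accounts for nearly all of the MSE, while the MAE grows only linearly with that error. This is one reason it can be useful to report both metrics when the target distribution has a long tail.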
