Flight Delay Prediction via Cross-Modality Adaptation of Large Language Models and Aircraft Trajectory Representation
Thaweerath Phisannupawong, Joshua Julian Damanik, Han-Lim Choi
TL;DR
The paper tackles the prediction of flight delays for aircraft that have entered the terminal area, formulating it as a multimodal regression task that fuses textual aeronautical information (flight plans, METAR/TAF, NOTAMs) with aircraft trajectory data. It introduces a cross-modality adaptation framework that encodes trajectories with a pre-trained ATSCC trajectory encoder, maps those embeddings into the language model's embedding space via a lightweight adapter, and processes the fused inputs with a frozen LLM followed by a regression head. Key contributions are the multi-source data integration, the cross-modal bridge between trajectory and language representations, and the demonstration of sub-minute prediction errors with second-by-second real-time updates, which makes the approach operationally viable for terminal-area air traffic management (ATM). The framework is robust across multiple small LLM backbones, highlights the importance of trajectory context (especially focusing trajectories) and textual aeronautical information, and offers practical scalability for real-world deployments, with potential extension to other ATM tasks and modalities.
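The data flow described above (trajectory embeddings, adapter, frozen LLM, regression head) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the dimensions, the random weights, the ordering of trajectory and text tokens, and the mean-pooling `frozen_backbone` stand-in for the LLM are all assumptions made only to show how the shapes fit together.

```python
import random


def init_linear(rng, in_dim, out_dim, scale=0.1):
    """Create random weights (one row per output) and zero biases."""
    weight = [[rng.uniform(-scale, scale) for _ in range(in_dim)]
              for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weight, bias


def linear(x, weight, bias):
    """Apply y = W x + b to a single vector."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weight, bias)]


def frozen_backbone(sequence):
    """Hypothetical stand-in for the frozen LLM.

    It only mean-pools the fused sequence into one hidden vector; a real
    backbone would run its transformer layers with fixed weights.
    """
    n = len(sequence)
    return [sum(column) / n for column in zip(*sequence)]


def predict_delay(traj_embeddings, text_embeddings, adapter, head):
    # 1. The adapter projects each trajectory embedding into the
    #    LLM embedding space.
    traj_tokens = [linear(e, *adapter) for e in traj_embeddings]
    # 2. Fuse modalities by concatenating along the sequence axis
    #    (trajectory tokens first is an assumption of this sketch).
    fused = traj_tokens + text_embeddings
    # 3. The frozen backbone turns the fused sequence into one vector.
    hidden = frozen_backbone(fused)
    # 4. The regression head maps that vector to a scalar delay.
    return linear(hidden, *head)[0]


rng = random.Random(0)
TRAJ_DIM, LLM_DIM = 4, 6  # hypothetical embedding sizes

adapter = init_linear(rng, TRAJ_DIM, LLM_DIM)  # trainable in practice
head = init_linear(rng, LLM_DIM, 1)            # trainable in practice

# Placeholder inputs: 3 trajectory embeddings and 5 text token embeddings.
traj = [[rng.random() for _ in range(TRAJ_DIM)] for _ in range(3)]
text = [[rng.random() for _ in range(LLM_DIM)] for _ in range(5)]

delay = predict_delay(traj, text, adapter, head)
print(f"predicted delay (untrained, arbitrary units): {delay:.3f}")
```

In this sketch only `adapter` and `head` would receive gradient updates during training, which mirrors the summary's description of a lightweight adapter and regression head around a frozen LLM.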
Abstract
Flight delay prediction has become a key focus in air traffic management, as delays highlight inefficiencies that impact overall network performance. This paper presents a lightweight, large language model-based multimodal approach to flight delay prediction, formulated from the perspective of air traffic controllers monitoring aircraft delay after entering the terminal area. The approach integrates trajectory representations with textual aeronautical information, including flight information, weather reports, and aerodrome notices, by adapting trajectory data into the language modality to capture airspace conditions. The experiments show that the model consistently achieves sub-minute prediction error by effectively leveraging contextual information related to the sources of delay, fulfilling the operational standard for minute-level precision. The framework demonstrates that linguistic understanding, when combined with cross-modality adaptation of trajectory data, enhances delay prediction. Moreover, the approach shows practicality and potential scalability for real-world operations, supporting real-time updates that refine predictions upon receiving new operational information.
