X-VORTEX: Spatio-Temporal Contrastive Learning for Wake Vortex Trajectory Forecasting
Zhan Qu, Michael Färber
TL;DR
X-VORTEX tackles the challenge of wake vortex analysis from sparse LiDAR data by learning physics-aware, spatio-temporal representations in a self-supervised manner. It introduces a two-view contrastive framework grounded in Augmentation Overlap Theory that handles temporal evolution and spatial sparsity together through a time-distributed encoder and a temporal aggregator. The approach yields strong unsupervised representations, enables high-precision center localization with only 1% of labels, and supports accurate short-horizon trajectory forecasting, outperforming heuristic, image-based, and fully supervised baselines. The workflow reduces labeling needs, is robust to noise and vortex decay, and provides a practical pathway toward real-time wake vortex advisory capabilities. It may also apply to other atmospheric flow phenomena captured by remote sensing.
Abstract
Wake vortices are strong, coherent turbulent airflows generated by aircraft, and they pose a major safety and capacity challenge for air traffic management. Tracking how vortices move, weaken, and dissipate over time from LiDAR measurements remains difficult because scans are sparse, vortex signatures fade as the flow breaks down under atmospheric turbulence and instabilities, and point-wise annotation is prohibitively expensive. Existing approaches largely treat each scan as an independent, fully supervised segmentation problem, which overlooks temporal structure and does not scale to the vast unlabeled archives collected in practice. We present X-VORTEX, a spatio-temporal contrastive learning framework grounded in Augmentation Overlap Theory that learns physics-aware representations from unlabeled LiDAR point cloud sequences. X-VORTEX addresses two core challenges: sensor sparsity and time-varying vortex dynamics. It constructs paired inputs from the same underlying flight event by combining a weakly perturbed sequence with a strongly augmented counterpart produced via temporal subsampling and spatial masking, encouraging the model to align representations across missing frames and partial observations. Architecturally, a time-distributed geometric encoder extracts per-scan features and a sequential aggregator models the evolving vortex state across variable-length sequences. We evaluate on a real-world dataset of over one million LiDAR scans. X-VORTEX achieves superior vortex center localization while using only 1% of the labeled data required by supervised baselines, and the learned representations support accurate trajectory forecasting.
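To make the two-view construction concrete, the following is a minimal sketch, not the authors' implementation. It assumes each scan is a NumPy array of shape (N, 3), that the weak perturbation is small Gaussian jitter, and that spatial masking is uniform random point dropout; the function names and the parameters jitter_std, keep_frames, and mask_ratio are hypothetical and chosen for illustration only.

```python
import numpy as np

def weak_view(seq, rng, jitter_std=0.01):
    """Weak view (assumed form): small Gaussian jitter, all frames and points kept."""
    return [scan + rng.normal(0.0, jitter_std, size=scan.shape) for scan in seq]

def strong_view(seq, rng, keep_frames=0.5, mask_ratio=0.5):
    """Strong view: temporal subsampling, then random point masking per kept scan."""
    n_frames = len(seq)
    n_keep = max(1, int(round(keep_frames * n_frames)))
    # Sample frames without replacement and restore temporal order.
    frame_idx = np.sort(rng.choice(n_frames, size=n_keep, replace=False))
    scans = []
    for i in frame_idx:
        scan = seq[i]
        n_pts = scan.shape[0]
        n_pts_keep = max(1, int(round((1.0 - mask_ratio) * n_pts)))
        # Keep a random subset of points (hypothetical stand-in for spatial masking).
        keep = np.sort(rng.choice(n_pts, size=n_pts_keep, replace=False))
        scans.append(scan[keep])
    return frame_idx, scans

def make_pair(seq, seed=0):
    """Build a (weak, strong) pair from one sequence of scans of the same flight event."""
    rng = np.random.default_rng(seed)
    weak = weak_view(seq, rng)
    frame_idx, strong = strong_view(seq, rng)
    return weak, frame_idx, strong
```

Both views come from the same flight event, so a contrastive objective can treat them as a positive pair, while the strong view forces the encoder and aggregator to cope with missing frames and partial observations. The returned frame indices record which time steps survived subsampling, which a sequence model would need in order to handle the irregular spacing.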
