Lane-Frame Quantum Multimodal Driving Forecasts for the Trajectory of Autonomous Vehicles
Navneet Singh, Shiva Raj Pokhrel
TL;DR
This work evaluates a compact hybrid quantum architecture for short-horizon, multi-modal trajectory forecasting in autonomous driving, operating in an ego-centric lane frame and predicting residuals over a kinematic baseline. It combines a 9-qubit quantum attention encoder, a deep but lightweight quantum feedforward network, and a Fourier-based quantum decoder that generates 16 trajectory hypotheses in a single pass, with spectrum-based confidences guiding ranking. Training employs gradient-free SPSA and a min-over-modes loss to achieve stable optimization and meaningful multi-modal forecasts on the Waymo Open Motion Dataset, outperforming a strong lane-following baseline. While not claiming quantum advantage, the study demonstrates that very small quantum circuits can be integrated into a resource-efficient forecasting pipeline, delivering meter-scale errors and diverse futures that are useful for real-time decision making. The results suggest a feasible path to leveraging quantum sub-modules in robotics and autonomous systems under tight compute budgets, with potential extensions to richer context and hardware deployment.
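As a rough illustration of the SPSA training mentioned above, the following Python sketch applies one simultaneous-perturbation update per iteration to a toy quadratic loss. The function name `spsa_step`, the gain constants, and the quadratic stand-in for the circuit loss are assumptions made for illustration; they are not taken from the authors' implementation.

```python
import numpy as np

def spsa_step(loss, theta, k, a=0.1, c=0.1, alpha=0.602, gamma=0.101, rng=None):
    """One SPSA update; needs two loss evaluations regardless of len(theta)."""
    rng = np.random.default_rng() if rng is None else rng
    a_k = a / (k + 1) ** alpha          # decaying step size
    c_k = c / (k + 1) ** gamma          # decaying perturbation size
    # Random +/-1 perturbation applied to all parameters at once.
    delta = rng.choice([-1.0, 1.0], size=theta.shape)
    diff = loss(theta + c_k * delta) - loss(theta - c_k * delta)
    # For +/-1 entries, dividing by delta equals multiplying by delta.
    g_hat = diff / (2.0 * c_k) * delta
    return theta - a_k * g_hat

# Hypothetical stand-in for the circuit loss: distance to a fixed target.
target = np.array([1.0, -2.0, 0.5, 3.0])

def toy_loss(theta):
    return float(np.sum((theta - target) ** 2))

rng = np.random.default_rng(0)
theta = np.zeros(4)
initial_loss = toy_loss(theta)
for k in range(200):
    theta = spsa_step(toy_loss, theta, k, rng=rng)
final_loss = toy_loss(theta)
```

Because each step costs two loss evaluations however many parameters there are, this style of update suits circuits whose outputs are only available as measured values. In the setting described here, `toy_loss` would be replaced by the min-over-modes forecasting loss.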
Abstract
Trajectory forecasting for autonomous driving must deliver accurate, calibrated multi-modal futures under tight compute and latency constraints. We propose a compact hybrid quantum architecture that aligns quantum inductive bias with road-scene structure by operating in an ego-centric, lane-aligned frame and predicting residual corrections to a kinematic baseline instead of absolute poses. The model combines a transformer-inspired quantum attention encoder (9 qubits), a parameter-lean quantum feedforward stack (64 layers, ${\sim}1200$ trainable angles), and a Fourier-based decoder that uses shallow entanglement and phase superposition to generate 16 trajectory hypotheses in a single pass, with mode confidences derived from the latent spectrum. All circuit parameters are trained with Simultaneous Perturbation Stochastic Approximation (SPSA), avoiding backpropagation through non-analytic components. On the Waymo Open Motion Dataset, the model achieves a minimum Average Displacement Error (minADE) of 1.94 m and a minimum Final Displacement Error (minFDE) of 3.56 m across the 16 predicted modes over a 2.0 s horizon, consistently outperforming a kinematic baseline with reduced miss rates and strong recall. Ablations confirm that residual learning in the lane frame, truncated Fourier decoding, shallow entanglement, and spectrum-based ranking focus capacity where it matters, yielding stable optimization and reliable multi-modal forecasts from small, shallow quantum circuits on a modern autonomous-driving benchmark.
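For readers unfamiliar with the two reported metrics, the sketch below computes minADE and minFDE for a small hand-made set of hypotheses. The helper name `min_ade_fde`, the array shapes, and the toy trajectories are assumptions for illustration only. It follows the common convention in which each metric takes its own minimum over modes; the exact evaluation protocol used in the paper may differ.

```python
import numpy as np

def min_ade_fde(pred, gt):
    """pred: (K, T, 2) hypotheses, gt: (T, 2) ground truth, both in metres."""
    # Euclidean displacement per mode and time step, shape (K, T).
    dist = np.linalg.norm(pred - gt[None, :, :], axis=-1)
    min_ade = dist.mean(axis=1).min()   # best average displacement over modes
    min_fde = dist[:, -1].min()         # best final-step displacement over modes
    return float(min_ade), float(min_fde)

# Toy ground truth: straight motion along x over four time steps.
gt = np.array([[1.0, 0.0], [2.0, 0.0], [3.0, 0.0], [4.0, 0.0]])

# Three hypothetical modes.
mode_a = gt + np.array([0.0, 1.0])      # constant 1 m lateral offset
mode_b = gt + np.array([0.0, 3.0])      # constant 3 m lateral offset
mode_c = gt.copy()
mode_c[-1] += np.array([0.0, 2.0])      # exact until the last step, then 2 m off
pred = np.stack([mode_a, mode_b, mode_c])

min_ade, min_fde = min_ade_fde(pred, gt)
print(min_ade, min_fde)  # 0.5 1.0
```

In this toy case the two minima come from different modes: `mode_c` gives the lowest average error, while `mode_a` ends closest to the ground truth.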
