Predicting Future Spatiotemporal Occupancy Grids with Semantics for Autonomous Driving
Maneekwan Toyungyernsub, Esen Yel, Jiachen Li, Mykel J. Kochenderfer
TL;DR
The paper addresses future scene prediction for autonomous driving by integrating environment semantics into occupancy grid forecasting. It introduces a two-module framework: an upstream SMGM predictor that forecasts semantic grids, and a downstream occupancy predictor, based on a modified PredNet, that uses these semantic cues to generate future occupancy grid maps (OGMs). Occupancy is represented evidentially and updated using Dempster-Shafer theory (DST). The approach is validated on Waymo Open Dataset v1.4.0, where it achieves higher accuracy and better preserves moving objects over 1.5 s prediction horizons than baselines such as PredNet and a dynamics-aware Double-Prong model. The results highlight the practical value of semantic context for proactive trajectory planning and safer navigation. Future work aims to predict semantics and occupancy jointly to reduce model size.
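To make the DST-based update concrete, the following is a minimal sketch of Dempster's rule of combination for a single grid cell. It is not the authors' implementation. It assumes a two-hypothesis frame {occupied, free}, with each cell holding three masses: occupied, free, and unknown (the mass on the whole frame). The names `CellMass`, `combine`, and `occupancy_probability` are hypothetical, and the pignistic conversion to a probability is one common choice rather than something the summary specifies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CellMass:
    """Evidential state of one grid cell (hypothetical container).

    The three masses are assumed to be non-negative and to sum to 1.
    """
    occupied: float
    free: float
    unknown: float  # mass assigned to the whole frame {occupied, free}


def combine(prior: CellMass, meas: CellMass) -> CellMass:
    """Fuse two mass assignments with Dempster's rule of combination."""
    # Conflict: one source says occupied while the other says free.
    conflict = prior.occupied * meas.free + prior.free * meas.occupied
    if conflict >= 1.0:
        raise ValueError("Sources are in total conflict; rule is undefined.")
    norm = 1.0 - conflict

    # Evidence for a hypothesis comes from agreement, or from one source
    # supporting it while the other is uncommitted (unknown).
    occupied = (prior.occupied * meas.occupied
                + prior.occupied * meas.unknown
                + prior.unknown * meas.occupied) / norm
    free = (prior.free * meas.free
            + prior.free * meas.unknown
            + prior.unknown * meas.free) / norm
    unknown = (prior.unknown * meas.unknown) / norm
    return CellMass(occupied, free, unknown)


def occupancy_probability(cell: CellMass) -> float:
    """Pignistic transform: split the unknown mass evenly (an assumption)."""
    return cell.occupied + 0.5 * cell.unknown


# A cell with no prior information, updated twice by the same measurement.
vacuous = CellMass(occupied=0.0, free=0.0, unknown=1.0)
measurement = CellMass(occupied=0.6, free=0.1, unknown=0.3)

once = combine(vacuous, measurement)   # equals the measurement itself
twice = combine(once, measurement)     # occupied mass grows, unknown shrinks
```

Combining with a vacuous prior returns the measurement unchanged, and repeated consistent measurements move mass from unknown to occupied. This is what lets an evidential grid distinguish cells that were never observed from cells with conflicting evidence.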
Abstract
For autonomous vehicles to proactively plan safe trajectories and make informed decisions, they must be able to predict the future occupancy states of the local environment. However, common issues with occupancy prediction include predictions where moving objects vanish or become blurred, particularly at longer time horizons. We propose an environment prediction framework that incorporates environment semantics for future occupancy prediction. Our method first semantically segments the environment and uses this information along with the occupancy information to predict the spatiotemporal evolution of the environment. We validate our approach on the real-world Waymo Open Dataset. Compared to baseline methods, our model has higher prediction accuracy and is capable of maintaining moving object appearances in the predictions for longer prediction time horizons.
