Amodal Optical Flow
Maximilian Luz, Rohit Mohan, Ahmed Rida Sekkat, Oliver Sawade, Elmar Matthes, Thomas Brox, Abhinav Valada
TL;DR
Amodal optical flow is introduced to capture motion for both visible and occluded scene elements through multi-layer motion fields and occlusion-aware stratification. The authors extend AmodalSynthDrive with ground-truth amodal flow and define the Amodal Flow Quality (AFQ) metric, which jointly evaluates flow and segmentation as $\mathrm{AFQ} = \sqrt{\mathrm{mWAUC} \cdot \mathrm{mIoU}}$. They propose AmodalFlowNet, a transformer-based cost-volume encoder with a recurrent transformer decoder that predicts per-layer motion fields and decomposed amodal masks with semantic grounding. AmodalFlowNet is demonstrated to achieve state-of-the-art performance on this task, and amodal optical flow is shown to improve panoptic tracking over baselines, highlighting practical value for robotics and dynamic scene understanding. The dataset, code, and trained models are released for public use.
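As a minimal sketch of the AFQ formula stated above, the snippet below computes the geometric mean of an mWAUC score and an mIoU score. The function name `amodal_flow_quality` is hypothetical, the input values are made up for illustration, and the snippet assumes both scores are already computed and expressed as fractions in [0, 1]. It does not reproduce the authors' evaluation code or show how mWAUC and mIoU themselves are obtained.

```python
import math


def amodal_flow_quality(mwauc: float, miou: float) -> float:
    """Geometric mean of mWAUC and mIoU, following AFQ = sqrt(mWAUC * mIoU).

    Assumption: both inputs are fractions in [0, 1], not percentages.
    """
    if not (0.0 <= mwauc <= 1.0 and 0.0 <= miou <= 1.0):
        raise ValueError("mwauc and miou are expected to lie in [0, 1]")
    return math.sqrt(mwauc * miou)


# Illustrative (made-up) scores: sqrt(0.64 * 0.81) = sqrt(0.5184) = 0.72
score = amodal_flow_quality(0.64, 0.81)
print(round(score, 2))  # 0.72
```

Because the two terms are multiplied, a method scores well only if both its flow accuracy and its amodal segmentation are good; a zero in either term yields an AFQ of zero.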
Abstract
Optical flow estimation is very challenging in situations with transparent or occluded objects. In this work, we address these challenges at the task level by introducing Amodal Optical Flow, which integrates optical flow with amodal perception. Instead of only representing the visible regions, we define amodal optical flow as a multi-layered pixel-level motion field that encompasses both visible and occluded regions of the scene. To facilitate research on this new task, we extend the AmodalSynthDrive dataset to include pixel-level labels for amodal optical flow estimation. We present several strong baselines, along with the Amodal Flow Quality metric to quantify the performance in an interpretable manner. Furthermore, we propose the novel AmodalFlowNet as an initial step toward addressing this task. AmodalFlowNet consists of a transformer-based cost-volume encoder paired with a recurrent transformer decoder which facilitates recurrent hierarchical feature propagation and amodal semantic grounding. We demonstrate the tractability of amodal optical flow in extensive experiments and show its utility for downstream tasks such as panoptic tracking. We make the dataset, code, and trained models publicly available at http://amodal-flow.cs.uni-freiburg.de.
