OptFlow: Fast Optimization-based Scene Flow Estimation without Supervision
Rahul Ahuja, Chris Baker, Wilko Schwarting
TL;DR
OptFlow estimates 3D scene flow without supervision by running a fast optimization over a flow field $F$ and an ego-motion transform $T$. It introduces a local correlation weight matrix, an adaptive distance threshold, a rigidity constraint via a graph Laplacian, and an ICP-based ego-motion term, combined in the objective $E_{obj} = E_{fit} + \alpha_{rigid} E_{rigid}$; together these speed up convergence and make point correspondences more robust. The method achieves state-of-the-art accuracy among non-learning approaches on popular autonomous driving benchmarks with the fastest inference time in that class, and it extends to densification and motion segmentation without external odometry data. These properties support practical deployment in perception pipelines where labeled data and training are limited or unavailable, across datasets and sensor configurations.
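To make the shape of the objective $E_{obj}=E_{fit}+\alpha_{rigid}E_{rigid}$ concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a simplified $E_{fit}$ (squared nearest-neighbor distance from the warped source to the target, truncated by a fixed threshold `max_dist`) and an $E_{rigid}$ of the form $\mathrm{tr}(F^\top L F)$ with $L$ the unnormalized Laplacian of a k-NN graph. The local correlation weight matrix, the adaptive threshold, and the ICP-based ego-motion term are omitted, and all function names and default values are hypothetical.

```python
import numpy as np

def knn_graph_laplacian(points, k=4):
    """Unnormalized Laplacian L = D - W of a symmetrized k-NN graph (assumed graph prior)."""
    n = points.shape[0]
    d2 = np.sum((points[:, None, :] - points[None, :, :]) ** 2, axis=-1)
    np.fill_diagonal(d2, np.inf)               # exclude self-matches
    nbrs = np.argsort(d2, axis=1)[:, :k]       # indices of the k nearest neighbors
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    W = np.maximum(W, W.T)                     # make the adjacency symmetric
    return np.diag(W.sum(axis=1)) - W

def fit_energy(source, target, flow, max_dist=2.0):
    """Simplified E_fit: squared NN distance from warped source to target.

    Matches farther than max_dist are ignored (a fixed threshold here,
    standing in for the adaptive one described in the text).
    """
    warped = source + flow
    d2 = np.sum((warped[:, None, :] - target[None, :, :]) ** 2, axis=-1)
    nn_d2 = d2.min(axis=1)
    return float(nn_d2[nn_d2 <= max_dist ** 2].sum())

def rigid_energy(L, flow):
    """E_rigid = tr(F^T L F): penalizes flow differences between graph neighbors."""
    return float(np.trace(flow.T @ L @ flow))

def objective(source, target, flow, L, alpha_rigid=1.0, max_dist=2.0):
    """E_obj = E_fit + alpha_rigid * E_rigid (sketch under the assumptions above)."""
    return fit_energy(source, target, flow, max_dist) + alpha_rigid * rigid_energy(L, flow)

# Toy example: the target is the source shifted by a constant translation.
rng = np.random.default_rng(0)
source = rng.uniform(0.0, 10.0, size=(50, 3))
shift = np.array([0.5, 0.0, 0.0])
target = source + shift
L = knn_graph_laplacian(source, k=4)

e_true = objective(source, target, np.tile(shift, (50, 1)), L)  # flow equals the true motion
e_zero = objective(source, target, np.zeros((50, 3)), L)        # flow ignores the motion
```

In this toy case the constant flow lies in the null space of the Laplacian, so the rigidity term vanishes and the true flow attains a lower objective than the zero flow. In practice the flow would be optimized by gradient descent in an autodiff framework rather than evaluated at fixed candidates.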
Abstract
Scene flow estimation is a crucial component in the development of autonomous driving and 3D robotics, providing valuable information for environment perception and navigation. Despite the advantages of learning-based scene flow estimation techniques, their domain specificity and limited generalizability across varied scenarios pose challenges. In contrast, non-learning optimization-based methods, incorporating robust priors or regularization, offer competitive scene flow estimation performance, require no training, and show extensive applicability across datasets, but suffer from lengthy inference times. In this paper, we present OptFlow, a fast optimization-based scene flow estimation method. Without relying on learning or any labeled datasets, OptFlow achieves state-of-the-art performance for scene flow estimation on popular autonomous driving benchmarks. It integrates a local correlation weight matrix for correspondence matching, an adaptive correspondence threshold limit for nearest-neighbor search, and graph prior rigidity constraints, resulting in expedited convergence and improved point correspondence identification. Moreover, we demonstrate how integrating a point cloud registration function within our objective function bolsters accuracy and differentiates between static and dynamic points without relying on external odometry data. Consequently, OptFlow outperforms the baseline graph-prior method by approximately 20% and the Neural Scene Flow Prior method by 5%-7% in accuracy, all while offering the fastest inference time among all non-learning scene flow estimation methods.
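One way to picture how a jointly estimated ego-motion transform can separate static from dynamic points is sketched below. This illustrates the general idea and is not the paper's procedure: it assumes a rigid transform given as a rotation `R` and translation `t`, computes the flow that this transform alone would induce at each point, and flags points whose estimated flow deviates from it by more than a threshold `tau`. The function names and the value of `tau` are hypothetical.

```python
import numpy as np

def ego_flow(points, R, t):
    """Flow induced at each point by the rigid transform p -> R p + t."""
    return points @ R.T + t - points

def dynamic_mask(points, flow, R, t, tau=0.1):
    """True where the estimated flow is not explained by the rigid transform alone."""
    residual = np.linalg.norm(flow - ego_flow(points, R, t), axis=1)
    return residual > tau

# Toy example: small yaw rotation plus forward translation as the rigid motion.
theta = np.deg2rad(2.0)
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
t = np.array([1.0, 0.0, 0.0])

rng = np.random.default_rng(1)
points = rng.uniform(-20.0, 20.0, size=(100, 3))
flow = ego_flow(points, R, t)
flow[:3] += np.array([0.0, 1.0, 0.0])   # first three points move independently

mask = dynamic_mask(points, flow, R, t)  # True marks points treated as dynamic
```

The threshold trades off sensitivity against noise in the estimated flow; a value tied to sensor noise or point range would likely be more robust than the fixed constant used here.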
