OptFlow: Fast Optimization-based Scene Flow Estimation without Supervision

Rahul Ahuja, Chris Baker, Wilko Schwarting

TL;DR

OptFlow addresses the problem of estimating 3D scene flow without supervision by formulating a fast optimization over a flow field $F$ and an ego-motion transform $T$. It introduces a local correlation weight matrix, an adaptive distance threshold, a rigidity constraint via a graph Laplacian, and an ICP-based ego-motion term in the objective $E_{obj}=E_{fit}+ \alpha_{rigid} E_{rigid}$, enabling rapid convergence and robust correspondence. The method achieves state-of-the-art accuracy among non-learning approaches on major autonomous driving benchmarks while delivering the fastest inference times in this class, and its utility extends to densification and motion segmentation without external odometry data. These properties support practical deployment in perception pipelines where labeled data and training are limited or unavailable, across diverse datasets and sensor configurations.
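To make the shape of $E_{obj}=E_{fit}+\alpha_{rigid} E_{rigid}$ concrete, the following is a minimal NumPy sketch and not the authors' implementation. It assumes a nearest-neighbor squared-distance fit term with a fixed distance cutoff and a rigidity term of the form $\mathrm{tr}(F^\top L F)$, where $L$ is the Laplacian of a k-nearest-neighbor graph on the source cloud. The function names, `k`, `max_dist`, and the brute-force neighbor search are all illustrative assumptions; the local correlation weights, the adaptive threshold, and the ICP-based ego-motion term are omitted.

```python
import numpy as np

def knn_indices(points, k):
    """Brute-force k nearest neighbours of each point, excluding the point itself."""
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)
    return np.argsort(d2, axis=1)[:, :k]

def graph_laplacian(points, k=3):
    """Unnormalised Laplacian L = D - A of a symmetrised kNN graph (assumed graph prior)."""
    n = len(points)
    idx = knn_indices(points, k)
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    A = np.maximum(A, A.T)  # make the adjacency symmetric
    return np.diag(A.sum(axis=1)) - A

def fit_energy(source, target, flow, max_dist):
    """Mean squared nearest-neighbour distance from the warped source to the target.

    Correspondences farther than max_dist are ignored (a fixed cutoff here,
    standing in for the adaptive threshold described in the text).
    """
    warped = source + flow
    d2 = ((warped[:, None, :] - target[None, :, :]) ** 2).sum(-1)
    nn = d2.min(axis=1)
    mask = nn <= max_dist ** 2
    return nn[mask].sum() / max(mask.sum(), 1)

def rigid_energy(L, flow):
    """Laplacian smoothness tr(F^T L F): zero when neighbouring points share the same flow."""
    return float(np.trace(flow.T @ L @ flow))

def objective(source, target, flow, L, alpha_rigid=1.0, max_dist=2.0):
    """E_obj = E_fit + alpha_rigid * E_rigid under the assumptions above."""
    return fit_energy(source, target, flow, max_dist) + alpha_rigid * rigid_energy(L, flow)
```

In this sketch a flow that is constant across the cloud lies in the null space of $L$, so the rigidity term penalizes only flow differences between neighboring points. In practice `flow` would be a tensor optimized by gradient descent in an autodiff framework; NumPy is used here only to show how the two terms combine.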

Abstract

Scene flow estimation is a crucial component in the development of autonomous driving and 3D robotics, providing valuable information for environment perception and navigation. Despite the advantages of learning-based scene flow estimation techniques, their domain specificity and limited generalizability across varied scenarios pose challenges. In contrast, non-learning optimization-based methods, incorporating robust priors or regularization, offer competitive scene flow estimation performance, require no training, and show extensive applicability across datasets, but suffer from lengthy inference times. In this paper, we present OptFlow, a fast optimization-based scene flow estimation method. Without relying on learning or any labeled datasets, OptFlow achieves state-of-the-art performance for scene flow estimation on popular autonomous driving benchmarks. It integrates a local correlation weight matrix for correspondence matching, an adaptive correspondence threshold limit for nearest-neighbor search, and graph prior rigidity constraints, resulting in expedited convergence and improved point correspondence identification. Moreover, we demonstrate how integrating a point cloud registration function within our objective function bolsters accuracy and differentiates between static and dynamic points without relying on external odometry data. Consequently, OptFlow outperforms the baseline graph-prior method by approximately 20% and the Neural Scene Flow Prior method by 5%-7% in accuracy, all while offering the fastest inference time among all non-learning scene flow estimation methods.
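One plausible way to separate static from dynamic points once a flow field and a rigid ego-motion transform are both available is sketched below. This is an assumption about the general idea, not the procedure from the paper: a point is flagged as dynamic when its estimated flow deviates from the flow that ego-motion alone would induce by more than a threshold. The names `ego_flow` and `dynamic_mask` and the `threshold` value are hypothetical.

```python
import numpy as np

def ego_flow(points, R, t):
    """Flow that a rigid transform (R, t) alone would induce on each point."""
    return points @ R.T + t - points

def dynamic_mask(points, flow, R, t, threshold=0.1):
    """True where the estimated flow differs from the ego-induced flow by more than threshold."""
    residual = np.linalg.norm(flow - ego_flow(points, R, t), axis=1)
    return residual > threshold
```

For example, with an identity rotation and a pure translation, every point whose flow equals that translation is labeled static, and a point carrying extra motion on top of it is labeled dynamic.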

Paper Structure (23 sections, 9 equations, 7 figures, 4 tables)

Figures (7)

  • Figure 1: Graph depicting flow accuracy $Acc_5$ vs. inference time in seconds with 2048 and 8192 points used. OptFlow is the fastest algorithm while achieving state-of-the-art results among all the non-learning-based methods. The experiments were run on an NVIDIA Tesla T4 GPU.
  • Figure 2: Visualization of predicted flows on the KITTI Dataset. Left: The color-coded map illustrates a comparison between ground truth flow vectors and our predicted flow values. Top Right: The point cloud $P_{T-1}$ is depicted in red, point cloud $P_{T}$ in green, and the translated point cloud $(P_{T-1}+F)$ in blue. Note the proximity of the blue points to the ground truth green points, indicating high prediction accuracy. Bottom Right: The visualization of predicted flow lines is presented.
  • Figure 3: Visualization of predicted flows on the nuScenes and Argoverse datasets. Left: A color-coded map illustrates a comparison between the ground truth flow vectors and our predicted flow values. Top Right: The point cloud $P_{T-1}$ is depicted in red, point cloud $P_{T}$ in green, and the transformed point cloud $P_{T-1}+F$ in blue. Bottom Right: The visualization of predicted flow lines is presented. The nuScenes example is particularly intricate, as the ego-vehicle is positioned at a turn, resulting in angled flows. Additionally, the color-coded map illustrates the variance in flow values associated with each point.
  • Figure 4: Performance of the OptFlow algorithm on KITTI for different iteration counts, with the time taken for each. The red arrow marks the optimal performance.
  • Figure 5: Performance ($Acc_{5}$) comparison of our algorithm and NSFP on the KITTI dataset for varying point cloud densities. OptFlow achieves around a 7x speedup over NSFP as the point cloud density increases. Beyond 8k points, the point clouds are processed in parallel, as discussed in Sec. 1 of the supplementary material.
  • ...and 2 more figures