RMS-FlowNet++: Efficient and Robust Multi-Scale Scene Flow Estimation for Large-Scale Point Clouds
Ramy Battrawy, René Schuster, Didier Stricker
TL;DR
RMS-FlowNet++ tackles the challenge of estimating 3D scene flow on dense point clouds with high efficiency. It introduces a Patch-to-Dilated-Patch flow embedding and a Random-Sampling enabled hierarchical design that reduces the size of the correspondence set while maintaining accuracy, enabling processing of tens to hundreds of thousands of points without full-resolution matching. The method demonstrates competitive accuracy and superior generalization to KITTI without fine-tuning, along with robust performance under occlusions and at long ranges up to 210 meters. This work advances scalable, accurate scene flow estimation for large-scale LiDAR data, with practical impact on autonomous driving and robust 3D motion understanding.
Abstract
The proposed RMS-FlowNet++ is a novel end-to-end learning-based architecture for accurate and efficient scene flow estimation that can operate on high-density point clouds. For hierarchical scene flow estimation, existing methods rely on expensive Farthest-Point-Sampling (FPS) to sample the scenes, must find large correspondence sets across consecutive frames, and/or must search for correspondences at full input resolution. While this can improve accuracy, it reduces the overall efficiency of these methods and limits their ability to handle large numbers of points due to memory requirements. In contrast to these methods, our architecture is based on an efficient design for hierarchical prediction of multi-scale scene flow. To this end, we develop a special flow embedding block that has two advantages over current methods: first, a smaller correspondence set is used, and second, the use of Random-Sampling (RS) becomes possible. In addition, our architecture does not need to search for correspondences at full input resolution. While exhibiting high accuracy, our RMS-FlowNet++ provides faster prediction than state-of-the-art methods, avoids high memory requirements, and enables efficient scene flow estimation on dense point clouds of more than 250K points at once. Our comprehensive experiments verify the accuracy of RMS-FlowNet++ on the established FlyingThings3D data set with different point cloud densities and validate our design choices. Furthermore, we demonstrate that our model has a competitive ability to generalize to the real-world scenes of the KITTI data set without fine-tuning.
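To make the cost difference between the two sampling strategies named in the abstract concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation: the function names `farthest_point_sampling` and `random_sampling`, their parameters, and the toy point set are assumptions chosen for illustration. FPS greedily picks the point farthest from everything selected so far, which requires a pass over all points for every sample drawn, whereas RS draws indices uniformly and computes no distances at all.

```python
import math
import random


def farthest_point_sampling(points, k, start=0):
    """Greedy FPS (illustrative): about n * k distance evaluations."""
    n = len(points)
    # Distance from every point to its nearest already-selected point.
    min_dist = [math.inf] * n
    selected = [start]
    for _ in range(k - 1):
        last = points[selected[-1]]
        # Update nearest-selected distances using the most recent pick.
        for i, p in enumerate(points):
            d = math.dist(p, last)
            if d < min_dist[i]:
                min_dist[i] = d
        # Pick the point that is farthest from the selected set.
        selected.append(max(range(n), key=min_dist.__getitem__))
    return selected


def random_sampling(points, k, seed=None):
    """Uniform sampling without replacement (illustrative): no distances."""
    rng = random.Random(seed)
    return rng.sample(range(len(points)), k)


if __name__ == "__main__":
    # Hypothetical toy cloud: three nearby points and one distant point.
    cloud = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
    print(farthest_point_sampling(cloud, 3))
    print(random_sampling(cloud, 3, seed=0))
```

In this sketch FPS guarantees good spatial coverage but scales with the product of input size and sample count, while RS is far cheaper and gives no coverage guarantee. That trade-off is why, as the abstract states, a flow embedding that tolerates RS matters for dense point clouds.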
