Multi-Scale RAFT: Combining Hierarchical Concepts for Learning-based Optical Flow Estimation

Azin Jahedi, Lukas Mehl, Marc Rivinius, Andrés Bruhn

TL;DR

RAFT, a leading optical flow method, makes little use of hierarchical concepts and so does not fully exploit multi-scale structure. This paper introduces Multi-Scale RAFT, which combines a 3-scale coarse-to-fine framework, U-Net-style multi-scale features, RAFT's correlation pyramid, and a robust multi-scale multi-iteration loss to improve accuracy and robustness. Results on Sintel and KITTI show substantial gains over RAFT and state-of-the-art performance, particularly in non-occluded regions, supporting the value of integrating hierarchical concepts. The approach is reported to generalize well and run efficiently, and the code is to be released.

Abstract

Many classical and learning-based optical flow methods rely on hierarchical concepts to improve both accuracy and robustness. However, one of the currently most successful approaches -- RAFT -- hardly exploits such concepts. In this work, we show that multi-scale ideas are still valuable. More precisely, using RAFT as a baseline, we propose a novel multi-scale neural network that combines several hierarchical concepts within a single estimation framework. These concepts include (i) a partially shared coarse-to-fine architecture, (ii) multi-scale features, (iii) a hierarchical cost volume and (iv) a multi-scale multi-iteration loss. Experiments on MPI Sintel and KITTI clearly demonstrate the benefits of our approach. They show not only substantial improvements compared to RAFT, but also state-of-the-art results -- in particular in non-occluded regions. Code will be available at https://github.com/cv-stuttgart/MS_RAFT.
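
To make concept (iv) more concrete, the following is a minimal sketch of what a multi-scale multi-iteration loss could look like. It is not the authors' implementation: the function names, the L1 error, the decay factor `gamma = 0.8`, and the weighting scheme (larger weights for finer scales and later iterations) are all assumptions chosen for illustration, and every prediction is assumed to be already upsampled to the ground-truth resolution.

```python
def l1_flow_error(pred, gt):
    """Mean L1 distance between two flow fields given as lists of (u, v) pairs."""
    return sum(
        abs(pu - gu) + abs(pv - gv)
        for (pu, pv), (gu, gv) in zip(pred, gt)
    ) / len(gt)


def multiscale_multi_iteration_loss(preds_per_scale, gt, gamma=0.8):
    """Weighted sum of flow errors over all scales and all iterations.

    preds_per_scale[s][i] is the flow estimate after iteration i at scale s,
    ordered from coarse to fine (hypothetical layout, assumed for this sketch).
    """
    num_scales = len(preds_per_scale)
    total = 0.0
    for s, preds in enumerate(preds_per_scale):
        num_iters = len(preds)
        for i, pred in enumerate(preds):
            # Assumed weighting: the exponent counts how many scales and
            # iterations separate this estimate from the final one, so the
            # last iteration on the finest scale gets weight 1.
            exponent = (num_scales - 1 - s) + (num_iters - 1 - i)
            total += (gamma ** exponent) * l1_flow_error(pred, gt)
    return total
```

Supervising every intermediate estimate, rather than only the final one, gives each scale and each refinement step a direct training signal; the decaying weights keep the final estimate the dominant term.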

Paper Structure (4 sections, 3 equations, 3 figures, 2 tables)

Figures (3)

  • Figure 1: Our coarse-to-fine architecture for 3 scales.
  • Figure 2: Our U-Net-style feature extractor for 3 scales.
  • Figure 3: Computed optical flow for the test set of Sintel Clean and Final (top, middle) and KITTI (bottom), best viewed as PDF.