SR-Stereo & DAPE: Stepwise Regression and Pre-trained Edges for Practical Stereo Matching

Weiqing Xiao, Wei Zhao

TL;DR

A novel stepwise regression architecture is proposed that regresses the disparity error through multiple range-controlled clips. This design effectively overcomes domain discrepancies and achieves competitive in-domain and cross-domain performance.

Abstract

Because real samples and ground truth are difficult to obtain, generalization and domain adaptation performance are critical to the feasibility of stereo matching methods in practical applications. However, there are significant distributional discrepancies among different domains, which make generalization and domain adaptation challenging. Inspired by iteration-based methods, we propose a novel stepwise regression architecture. This architecture regresses the disparity error through multiple range-controlled clips, which effectively overcomes domain discrepancies. We implement this architecture on top of iteration-based methods and refer to the resulting stereo method as SR-Stereo. Specifically, a new stepwise regression unit replaces the original update unit in order to control the range of the output. Meanwhile, a regression objective segment is proposed to set the supervision individually for each stepwise regression unit. In addition, to enhance the edge awareness of models adapting to new domains with sparse ground truth, we propose Domain Adaptation based on Pre-trained Edges (DAPE). In DAPE, a pre-trained stereo model and an edge estimator are used to estimate the edge maps of the target-domain images, which, along with the sparse ground-truth disparity, are used to fine-tune the stereo model. The proposed SR-Stereo and DAPE are extensively evaluated on SceneFlow, KITTI, Middlebury 2014 and ETH3D. Compared with the SOTA methods and generalized methods, the proposed SR-Stereo achieves competitive in-domain and cross-domain performance. Meanwhile, the proposed DAPE significantly improves the performance of the fine-tuned model, especially in textureless and detailed regions.
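
To make the DAPE idea concrete, the following is a minimal sketch of a fine-tuning objective that combines sparse disparity supervision with dense edge supervision. The abstract does not give the loss, so everything here is an assumption for illustration: the function name `dape_finetune_loss`, the use of L1 for both terms, the `edge_weight` parameter, and the premise that the stereo model exposes an edge prediction. Inputs are flattened per-pixel lists to keep the example free of dependencies.

```python
def dape_finetune_loss(pred_disp, gt_disp, valid, pred_edge, pseudo_edge,
                       edge_weight=1.0):
    """Hypothetical DAPE-style objective (not the authors' formula).

    pred_disp   : predicted disparity per pixel
    gt_disp     : sparse ground-truth disparity (ignored where valid == 0)
    valid       : 1 where ground truth exists, 0 elsewhere
    pred_edge   : edge map predicted during fine-tuning (assumed available)
    pseudo_edge : edge map estimated beforehand by the pre-trained stereo
                  model and edge estimator
    """
    # Disparity term: only pixels that carry ground truth contribute.
    n_valid = sum(valid)
    disp_term = sum(
        abs(p - g) for p, g, v in zip(pred_disp, gt_disp, valid) if v
    ) / max(n_valid, 1)

    # Edge term: dense, so pixels without disparity labels still get a signal.
    edge_term = sum(
        abs(p - e) for p, e in zip(pred_edge, pseudo_edge)
    ) / len(pseudo_edge)

    return disp_term + edge_weight * edge_term


loss = dape_finetune_loss(
    pred_disp=[10.0, 20.0, 30.0, 40.0],
    gt_disp=[11.0, 0.0, 0.0, 38.0],
    valid=[1, 0, 0, 1],
    pred_edge=[0.0, 1.0, 0.0, 1.0],
    pseudo_edge=[0.0, 1.0, 1.0, 1.0],
)
print(loss)  # 1.75
```

The point of the sketch is the asymmetry between the two terms: the disparity term is masked by the sparse labels, while the edge term covers every pixel, which is one way the pre-trained edges could help in regions the ground truth does not reach.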

Paper Structure (33 sections, 31 equations, 15 figures, 10 tables)

This paper contains 33 sections, 31 equations, 15 figures, 10 tables.

Figures (15)

  • Figure 1: Comparison of reconstructed point clouds on KITTI. All methods are trained on SceneFlow and fine-tuned on KITTI. During inference, all methods run 15 disparity updates. Our SR-Stereo performs better in the detailed regions. In addition, the proposed fine-tuning framework DAPE effectively improves the performance of existing methods fine-tuned on sparse ground truth.
  • Figure 2: The disparity update process of iteration-based methods. In this process, the update unit outputs the residual disparity used to update the current disparity. The symbol $//$ denotes a stop-gradient operation.
  • Figure 3: The overall architecture of the proposed SR-Stereo. Compared to iteration-based methods, SR-Stereo is specially designed in terms of the update unit and the regression objective. Specifically, we propose a stepwise regression unit that outputs range-controlled disparity clips, rather than unconstrained residual disparities. Further, we design separate regression objectives for each stepwise regression unit, instead of simply using the disparity error.
  • Figure 4: Visualization of the regression objectives of SR-Stereo and iteration-based methods. The iteration-based methods regress disparity error by predicting residual disparity $\Delta d_{k}$, while SR-Stereo splits the disparity error into multiple segments and regresses them by predicting multiple disparity clips.
  • Figure 5: Architecture of the stepwise regression unit. $m$ is the hyperparameter that controls the range of the output disparity clip, $\Delta d_{k,ori}$ is the residual disparity output by the original update unit, and $Res$ is the residual layer.
  • ...and 10 more figures
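
The captions for Figures 3 to 5 describe two ideas: each stepwise regression unit outputs a disparity clip whose range is controlled by $m$, and the disparity error is split into segments so that each unit has its own regression objective. The sketch below illustrates both on a single pixel. The captions do not give the formulas, so the scaled `tanh` used to bound the output, the clamp-based splitting rule, and the names `bounded_clip` and `segment_targets` are assumptions chosen for illustration, not the authors' definitions.

```python
import math


def bounded_clip(raw_residual, m):
    """Map an unconstrained residual to a clip inside [-m, m].

    A scaled tanh is an assumed choice; the source only states that m
    controls the range of the output disparity clip.
    """
    return m * math.tanh(raw_residual / m)


def segment_targets(d_init, d_gt, m, steps):
    """Split the disparity error d_gt - d_init into per-step targets.

    Each target is the remaining error clamped to [-m, m]. This models the
    idealized case in which every step reaches its target exactly.
    """
    targets = []
    d = d_init
    for _ in range(steps):
        remaining = d_gt - d
        segment = max(-m, min(m, remaining))
        targets.append(segment)
        d += segment
    return targets


# An error of 5 px with m = 2 is covered in bounded segments.
print(segment_targets(d_init=0.0, d_gt=5.0, m=2.0, steps=4))  # [2.0, 2.0, 1.0, 0.0]

# However large the raw residual, the clip never exceeds m in magnitude.
print(abs(bounded_clip(100.0, 2.0)) <= 2.0)  # True
```

By contrast, an iteration-based update unit as described for Figure 2 would add an unconstrained residual $\Delta d_{k}$ at every step, so a single update could be asked to regress the full disparity error.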