Motif Channel Opened in a White-Box: Stereo Matching via Motif Correlation Graph

Ziyang Chen; Yongjun Zhang; Wenting Li; Bingshu Wang; Yong Zhao; C. L. Philip Chen

TL;DR

This work addresses the loss of geometric detail in learning-based stereo matching. It introduces MoCha-V2, which uses a Motif Correlation Graph (MCG) to mine recurring motif patterns across feature channels in a white-box fashion. The method combines MCG-based motif channels with wavelet-based multi-frequency fusion and a reconstruction-based refinement (REMP) to restore edge geometry while remaining interpretable. The framework builds a Motif Channel Correlation Volume (MCCV) and refines disparity through iterative updates and a full-resolution penalty. It achieves state-of-the-art results on Scene Flow, Middlebury, KITTI, and ETH3D, with strong zero-shot generalization to DrivingStereo. Although MoCha-V2 is not fully transparent, it advances explainable detail learning in stereo matching and points toward more interpretable systems for safety-critical applications.

Abstract

Real-world applications of stereo matching, such as autonomous driving, place stringent demands on both safety and accuracy. However, learning-based stereo matching methods inherently lose geometric structure in certain feature channels, which creates a bottleneck for precise detail matching. These methods also lack interpretability because of the black-box nature of deep learning. In this paper, we propose MoCha-V2, a novel learning-based paradigm for stereo matching. MoCha-V2 introduces the Motif Correlation Graph (MCG) to capture recurring textures, referred to as "motifs", within feature channels. These motifs reconstruct geometric structures and are learned in a more interpretable way. We then integrate features from multiple frequency domains through the inverse wavelet transform. The resulting motif features are used to restore geometric structures during stereo matching. Experimental results demonstrate the effectiveness of MoCha-V2, which achieved 1st place on the Middlebury benchmark at the time of its release. Code is available at https://github.com/ZYangChen/MoCha-Stereo.
