Quantum-Hybrid Stereo Matching With Nonlinear Regularization and Spatial Pyramids

Cameron Braunstein, Eddy Ilg, Vladislav Golyanik

TL;DR

The paper formulates stereo matching as MAP inference over a Markov Random Field and maps the objective to a QUBO solvable by quantum annealers, i.e., it minimizes the energy $E(\boldsymbol{\ell})$ via the quadratic form $\boldsymbol{x}^{\top} Q \boldsymbol{x}$. It introduces a one-hot encoding with a rectifier function $\Lambda$ to enforce single-label assignments while permitting nonlinear regularizers, which make the optimization NP-hard. On Middlebury, the method improves RMSE over prior quantum stereo approaches by 2% and 22.5%, depending on the solver, and a classical optimizer (Gurobi) gives the strongest results in practice. The work shows that nonlinear regularizers and a coarse-to-fine pyramid can be mapped to quantum hardware, offering a path to apply quantum-hybrid optimization to other energy-based vision problems such as optical flow or segmentation.
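As a rough illustration of how a labeling energy can be written as $\boldsymbol{x}^{\top} Q \boldsymbol{x}$ over one-hot binary variables, the following minimal sketch builds a QUBO for a three-pixel, three-label toy problem and minimizes it by exhaustive search. It is not the paper's construction: the single-label constraint is enforced here with the standard quadratic penalty $(\sum_k x_{p,k} - 1)^2$ rather than the rectifier $\Lambda$, and all names (`build_qubo`, `penalty`, `tau`, the toy costs) are hypothetical.

```python
import itertools


def build_qubo(unary, pairwise, edges, penalty):
    """Build an upper-triangular QUBO matrix for a one-hot labeling problem.

    unary[p][k]     : data cost of giving pixel p the label k
    pairwise[k][k2] : regularizer cost for neighboring labels (k, k2)
    edges           : neighboring pixel pairs (p, q) with p < q
    penalty         : weight of the one-hot penalty (sum_k x[p,k] - 1)^2
    """
    num_pixels, num_labels = len(unary), len(unary[0])
    n = num_pixels * num_labels
    Q = [[0.0] * n for _ in range(n)]

    def idx(p, k):
        return p * num_labels + k

    for p in range(num_pixels):
        for k in range(num_labels):
            # For binary x, x^2 = x, so the expanded penalty adds -penalty on
            # the diagonal and +2*penalty per label pair at the same pixel
            # (the constant +penalty per pixel is dropped).
            Q[idx(p, k)][idx(p, k)] += unary[p][k] - penalty
            for k2 in range(k + 1, num_labels):
                Q[idx(p, k)][idx(p, k2)] += 2.0 * penalty

    for p, q in edges:
        for k in range(num_labels):
            for k2 in range(num_labels):
                # Any pairwise cost table, linear or not, stays quadratic in x.
                Q[idx(p, k)][idx(q, k2)] += pairwise[k][k2]
    return Q


def qubo_energy(Q, x):
    n = len(x)
    return sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def brute_force_minimize(Q):
    """Exhaustive search; only feasible for very small problems."""
    best_x, best_e = None, float("inf")
    for x in itertools.product((0, 1), repeat=len(Q)):
        e = qubo_energy(Q, x)
        if e < best_e:
            best_x, best_e = x, e
    return best_x, best_e


def decode_labels(x, num_pixels, num_labels):
    labels = []
    for p in range(num_pixels):
        row = x[p * num_labels:(p + 1) * num_labels]
        labels.append(row.index(1) if sum(row) == 1 else None)
    return labels


# Toy problem (assumed values): each pixel prefers a different disparity.
unary = [[0.0, 3.0, 3.0], [3.0, 0.0, 3.0], [3.0, 3.0, 0.0]]
tau = 1  # truncation threshold of the nonlinear regularizer
pairwise = [[float(min(abs(a - b), tau)) for b in range(3)] for a in range(3)]
edges = [(0, 1), (1, 2)]
penalty = 10.0

Q = build_qubo(unary, pairwise, edges, penalty)
x_best, e_best = brute_force_minimize(Q)
labels = decode_labels(x_best, 3, 3)
```

The penalty weight has to exceed the largest cost a pixel can save by violating the one-hot constraint; in this toy case 10 is comfortably large. On real hardware the exhaustive search would be replaced by an annealer or a classical solver.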

Abstract

Quantum visual computing is advancing rapidly. This paper presents a new formulation for stereo matching with nonlinear regularizers and spatial pyramids on quantum annealers as a maximum a posteriori inference problem that minimizes the energy of a Markov Random Field. Our approach is hybrid (i.e., quantum-classical) and is compatible with modern D-Wave quantum annealers, i.e., it includes a quadratic unconstrained binary optimization (QUBO) objective. Previous quantum annealing techniques for stereo matching are limited to using linear regularizers, and thus, they do not exploit the fundamental advantages of the quantum computing paradigm in solving combinatorial optimization problems. In contrast, our method utilizes the full potential of quantum annealing for stereo matching, as nonlinear regularizers create optimization problems which are NP-hard. On the Middlebury benchmark, we achieve an improved root mean squared accuracy over the previous state of the art in quantum stereo matching of 2% and 22.5% when using different solvers.
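To make the contrast between linear and nonlinear regularizers concrete, a generic pairwise MRF energy can be written as below. This is a textbook form under stated assumptions, not the paper's exact objective: the symbols $E_d$, $w$, $w_{pq}$, $\tau$, $\mathcal{P}$ and $\mathcal{N}$ are introduced here for illustration only.

```latex
E(\boldsymbol{\ell}) = \sum_{p \in \mathcal{P}} E_d(\ell_p)
                     + \sum_{(p,q) \in \mathcal{N}} E_s(\ell_p, \ell_q),
\qquad
E_s^{\text{linear}}(\ell_p, \ell_q) = w \, |\ell_p - \ell_q|,
\qquad
E_s^{\text{truncated}}(\ell_p, \ell_q) = w_{pq} \, \min\bigl(|\ell_p - \ell_q|,\, \tau\bigr)
```

Here $E_d$ is a data term, $\mathcal{N}$ is the set of neighboring pixel pairs, $\tau$ caps the cost of a disparity jump, and a per-edge weight $w_{pq}$ is one way to make the regularizer edge-aware. The truncated form is nonconvex in the label difference, which is the kind of nonlinearity that makes exact minimization hard in general.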

Paper Structure (36 sections, 51 equations, 15 figures, 11 tables, 2 algorithms)

Figures (15)

  • Figure 1: Stereo estimates on the Middlebury [middlebury2001] Venus stereo image pair. From left to right: Heidari et al.'s approach [9653310], our approach, and ground truth. In this example, we achieve a $46\%$ decrease in root mean squared error (RMSE) relative to Heidari et al. and a $10\%$ decrease in bad pixel percentage (BPP). We avoid many of the streaking artifacts present in the result of the prior approach.
  • Figure 2: The first row shows the left image from each of the four Middlebury stereo pairs. The second row shows the ground-truth displacements for each pair. The remaining rows show results of our method using different optimizers: Gurobi, Simulated Annealing, D-Wave's Hybrid Quantum-Classical Solver, and D-Wave's Pegasus QPU. The choice of optimizer has a strong influence on the result quality, and we observe that the traditional optimizer Gurobi outperforms all other optimizers. We hypothesize that this is due to the jagged, challenging energy landscape that our rectifiers create for the tested quantum annealer, together with the current state of quantum hardware.
  • Figure 3: Stereo estimation on the Tsukuba image pair at the three resolution levels, before and after median filtering, using the Gurobi solver. Median filtering helps to prevent cascading errors from lower-resolution estimates. The final result after bilateral filtering is shown in Figure 2.
  • Figure 4: Visual comparison of our method against [9653310] using Hybrid annealing, and a comparison of our method using Gurobi with the method of [9653310] optimized using the classical Ford-Fulkerson algorithm [10.5555/1942094]. For both methods, we can see a marked visual improvement when classical optimizers are used. Gurobi provides a near-optimal solution, while Ford-Fulkerson provides a global optimum.
  • Figure 5: Ablation study of the components of our algorithm. "No Regularizer" sets $E_s{=}0$ from line 12 of \ref{alg:full_quantum_algorithm}, which disables the regularizer and reverts to only using the data term. "Linear Regularizer" does not leverage truncation and is not edge-aware. "No Bilateral filter" removes the bilateral filtering in line 20 of \ref{alg:full_quantum_algorithm}. The "No Median and No Bilateral filtering" row removes lines 18 and 20 from \ref{alg:full_quantum_algorithm}.
  • ...and 10 more figures
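
The captions above report results as root mean squared error (RMSE) and bad pixel percentage (BPP). The sketch below shows one common way to compute these quantities on flattened disparity lists. The bad-pixel threshold of 1.0 and the reading of a "decrease" as a relative percentage are assumptions made for illustration; the paper's exact evaluation protocol is not given in this excerpt, and the function names are hypothetical.

```python
import math


def rmse(estimate, ground_truth):
    """Root mean squared error between two equally long disparity lists."""
    squared = [(e - g) ** 2 for e, g in zip(estimate, ground_truth)]
    return math.sqrt(sum(squared) / len(squared))


def bad_pixel_percentage(estimate, ground_truth, threshold=1.0):
    """Percentage of pixels whose absolute error exceeds the threshold.

    The default threshold is an assumed value, not taken from the paper.
    """
    bad = sum(1 for e, g in zip(estimate, ground_truth) if abs(e - g) > threshold)
    return 100.0 * bad / len(ground_truth)


def relative_decrease(baseline, ours):
    """Relative decrease of a metric, in percent of the baseline value."""
    return 100.0 * (baseline - ours) / baseline


# Toy data (made up for illustration only).
example_estimate = [1.0, 2.0, 3.0]
example_truth = [1.0, 2.0, 5.0]
example_rmse = rmse(example_estimate, example_truth)
example_bpp = bad_pixel_percentage(example_estimate, example_truth)
```

Under this reading, a baseline RMSE of 2.0 and a new RMSE of 1.08 would correspond to a 46% relative decrease.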