InverseNet: Benchmarking Operator Mismatch and Calibration Across Compressive Imaging Modalities

Chengshuai Yang, Xin Yuan

TL;DR

InverseNet is the first cross-modality benchmark for operator mismatch, spanning CASSI, CACTI, and single-pixel cameras. Real hardware experiments confirm that the trends observed in simulation transfer to physical data.

Abstract

State-of-the-art EfficientSCI loses 20.58 dB when its assumed forward operator deviates from physical reality in just eight parameters, yet no existing benchmark quantifies operator mismatch, the default condition in deployed compressive imaging systems. We introduce InverseNet, the first cross-modality benchmark for operator mismatch, spanning CASSI, CACTI, and single-pixel cameras. Evaluating 12 methods under a four-scenario protocol (ideal, mismatched, oracle-corrected, blind calibration) across 27 simulated scenes and 9 real hardware captures, we find: (1) deep learning methods lose 10-21 dB under mismatch, eliminating their advantage over classical baselines; (2) performance and robustness are inversely correlated across modalities (Spearman r_s = -0.71, p < 0.01); (3) mask-oblivious architectures recover 0% of mismatch losses regardless of calibration quality, while operator-conditioned methods recover 41-90%; (4) blind grid-search calibration recovers 85-100% of the oracle bound without ground truth. Real hardware experiments confirm that simulation trends transfer to physical data. Code will be released upon acceptance.
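The blind calibration scenario can be illustrated with a toy sketch. The abstract does not specify the calibration objective or the operator parameters, so everything below is an assumption made for illustration: the mismatch is modelled as a single integer column shift of a random binary mask, the signal is assumed to lie in a known low-dimensional subspace (`basis`), and candidates are scored by the measurement residual, which needs no ground truth. The names `forward_operator`, `residual_score`, and `blind_grid_search` are hypothetical and not taken from the paper.

```python
import numpy as np

def forward_operator(mask, shift):
    """Toy operator: the coding mask displaced by `shift` columns (assumed mismatch model)."""
    return np.roll(mask, shift, axis=1)

def residual_score(y, A, basis):
    """Ground-truth-free score: least-squares data residual ||y - A B c||_2."""
    AB = A @ basis
    coeffs, *_ = np.linalg.lstsq(AB, y, rcond=None)
    return float(np.linalg.norm(y - AB @ coeffs))

def blind_grid_search(y, mask, basis, candidates):
    """Score every candidate shift and return the one with the smallest residual."""
    scores = {s: residual_score(y, forward_operator(mask, s), basis) for s in candidates}
    best = min(scores, key=scores.get)
    return best, scores

# Synthetic data only; none of these numbers come from the paper.
rng = np.random.default_rng(0)
n, m, k = 64, 32, 8                       # signal length, measurements, subspace dimension
mask = rng.integers(0, 2, size=(m, n)).astype(float)
basis = rng.standard_normal((n, k))       # assumed low-dimensional signal prior
x_true = basis @ rng.standard_normal(k)   # used only to simulate the measurement
true_shift = 3
y = forward_operator(mask, true_shift) @ x_true + 1e-3 * rng.standard_normal(m)

best_shift, scores = blind_grid_search(y, mask, basis, range(-5, 6))
print("estimated shift:", best_shift)
```

The search never touches `x_true` after the measurement is simulated, which is the sense in which such a calibration is blind. A real system would likely replace the subspace fit with the reconstruction method under test and search over several physical parameters at once.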

Paper Structure (50 sections, 8 equations, 6 figures, 7 tables)


Figures (6)

  • Figure 1: PSNR across three scenarios for all modalities. Scenario I (Ideal): perfect operator. Scenario II (Baseline): mismatched operator. Scenario III (Oracle): true operator used for reconstruction. The collapse of deep learning methods under Scenario II is visible across all modalities, with CACTI showing the most severe degradation.
  • Figure 2: Qualitative reconstruction comparison across three modalities. Each row shows a representative scene for one modality (CASSI: Scene 1 band 14, MST-L; CACTI: kobe frame 4, ELP-Unfolding; SPC: cameraman, HATNet). Error maps (jet colormap, same scale per row) highlight how mismatch (Scenario II) introduces spatially structured artifacts that oracle correction (Scenario III) largely removes.
  • Figure 3: Mismatch degradation ($\Delta_{\text{deg}}$, left) and oracle recovery ($\Delta_{\text{rec}}$, right) per method across all three modalities. CACTI suffers the most severe degradation (up to 20.6 dB) but also the highest absolute recovery. For CASSI, HDNet shows zero recovery due to its mask-oblivious architecture; PnP-HSICNN achieves the best recovery ratio ($\rho = 56.8\%$). For SPC, HATNet recovers 10.4 dB ($\rho = 89.6\%$).
  • Figure 4: Recovery ratio ($\rho$) vs. ideal PSNR (Scenario I) for all 12 methods across three modalities. Color indicates modality; shape indicates method type (classical, operator-aware, mask-oblivious). An inverse trend is visible: higher-performing methods tend to have lower recovery ratios, suggesting that stronger learned priors create greater operator dependence.
  • Figure 5: CASSI real data (Scene 1, band 14): calibrated (top) vs. mismatched (bottom) reconstructions for GAP-TV and PnP-HSICNN. The visual differences are subtle, consistent with the modest residual ratios in \ref{tab:cassi_real}; spatial mask shift alone is not the dominant degradation source for CASSI.
  • ...and 1 more figures
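
The captions above use $\Delta_{\text{deg}}$, $\Delta_{\text{rec}}$, and $\rho$ without defining them in this excerpt. A minimal sketch follows, assuming the natural reading that degradation is the PSNR drop from Scenario I to Scenario II, recovery is the PSNR gain from Scenario II to Scenario III, and $\rho$ is their ratio in percent. These definitions, the function names, and the example PSNR values are assumptions for illustration, not taken from the paper.

```python
import math

def mismatch_metrics(psnr_ideal, psnr_mismatch, psnr_oracle):
    """Return (delta_deg, delta_rec, rho_percent) under the assumed definitions.

    delta_deg = PSNR(Scenario I)   - PSNR(Scenario II)
    delta_rec = PSNR(Scenario III) - PSNR(Scenario II)
    rho       = 100 * delta_rec / delta_deg
    """
    delta_deg = psnr_ideal - psnr_mismatch
    delta_rec = psnr_oracle - psnr_mismatch
    rho = 100.0 * delta_rec / delta_deg if delta_deg > 0 else math.nan
    return delta_deg, delta_rec, rho

def implied_degradation(delta_rec, rho_percent):
    """Invert rho = 100 * delta_rec / delta_deg to get the degradation it implies."""
    return 100.0 * delta_rec / rho_percent

# Made-up PSNR values in dB, chosen only to show the arithmetic.
deg, rec, rho = mismatch_metrics(psnr_ideal=30.0, psnr_mismatch=20.0, psnr_oracle=28.0)
print(f"deg = {deg:.1f} dB, rec = {rec:.1f} dB, rho = {rho:.1f}%")

# Caption values for HATNet on SPC: 10.4 dB recovered at rho = 89.6%.
print(f"implied degradation: {implied_degradation(10.4, 89.6):.1f} dB")
```

Under these assumed definitions, the HATNet figures quoted in the Figure 3 caption would correspond to a mismatch degradation of roughly 11.6 dB, which falls inside the 10-21 dB range the abstract reports for deep learning methods.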