Locally-Supervised Global Image Restoration

Benjamin Walder, Daniel Toader, Robert Nuster, Günther Paltauf, Peter Burgholzer, Gregor Langer, Lukas Krainer, Markus Haltmeier

TL;DR

This work tackles image reconstruction from deterministic, incomplete measurements by proposing a locally-supervised global restoration framework that exploits translation and other invariances of the image distribution. The core idea is to learn a $\mathcal{T}$-equivariant upsampling function $f$ with supervision on only a small, fixed subset $B$ of pixels by minimizing $\mathbb{E}\bigl[\| M_B \odot f(X_\Omega) - X_B \|^2\bigr]$; under $\mathcal{T}$-invariance, the minimizer is the optimal estimator $\mathbb{E}[X|X_\Omega]$. The paper develops the theoretical underpinnings (translation invariance, equivariance, and TI-corollaries) and demonstrates the approach on OR-PAM, where sparse-dense sampling with a fixed supervision region achieves near-supervised performance while saving substantial acquisition time. Empirically, the method outperforms patch-wise upsampling and restores the full image from a small, fixed set of supervised pixels, offering practical benefits for accelerated high-resolution imaging and a pathway to extendable, deterministic-sampling workflows.
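The masked objective described above can be sketched in a few lines of NumPy. This is only an illustration of the loss for a single image, not the authors' training code: the function name `locally_supervised_loss`, the 16×16 toy image, the stride-2 sampling mask, and the 4×4 supervision region are all assumptions made for the example.

```python
import numpy as np

def locally_supervised_loss(f, x_omega, x_full, mask_b):
    """Squared error restricted to the supervised pixel set B.

    f       : callable mapping a sparsely sampled image to a full-size estimate
    x_omega : input image holding only the measured pixels (zeros elsewhere)
    x_full  : reference image; only its values inside B influence the loss
    mask_b  : binary array, 1 inside the supervision region B, 0 elsewhere
    """
    # M_B * f(X_Omega) - X_B, with X_B = M_B * X
    residual = mask_b * (f(x_omega) - x_full)
    return float(np.sum(residual ** 2))

# Toy setup (hypothetical sizes, chosen only for illustration).
rng = np.random.default_rng(0)
x = rng.random((16, 16))

mask_omega = np.zeros((16, 16))
mask_omega[::2, ::2] = 1          # assumed sparse sampling: every second pixel

mask_b = np.zeros((16, 16))
mask_b[4:8, 4:8] = 1              # assumed small, fixed supervision region B

x_omega = mask_omega * x

# A do-nothing "network" leaves unmeasured pixels at zero, so it is penalized in B.
loss_identity = locally_supervised_loss(lambda img: img, x_omega, x, mask_b)

# An oracle that returns the true image has zero loss.
loss_oracle = locally_supervised_loss(lambda img: x, x_omega, x, mask_b)
```

Because the residual is multiplied by `mask_b`, reference values outside $B$ never enter the loss, which is why ground truth is only needed on the supervision region.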

Abstract

We address the problem of image reconstruction from incomplete measurements, encompassing both upsampling and inpainting, within a learning-based framework. Conventional supervised approaches require fully sampled ground truth data, while self-supervised methods allow incomplete ground truth but typically rely on random sampling that, in expectation, covers the entire image. In contrast, we consider fixed, deterministic sampling patterns with inherently incomplete coverage, even in expectation. To overcome this limitation, we exploit multiple invariances of the underlying image distribution, which theoretically allows us to achieve the same reconstruction performance as fully supervised approaches. We validate our method on optical-resolution image upsampling in photoacoustic microscopy (PAM), demonstrating competitive or superior results while requiring substantially less ground truth data.

Paper Structure

This paper contains 26 sections, 4 theorems, 10 equations, 6 figures, 1 table.

Key Result

Proposition 2.1

Let $\Omega, \Lambda \subseteq I$ be random subsets with $\mathbb{E}[M_\Omega] > 0$ and $\mathbb{E}[M_\Lambda] < 1$. Then

Figures (6)

  • Figure 4.1: Visualization of the performance of a neural network trained with sparse-dense training images. First column: Image of measured pixels only with size $64\times64$ pixels. Second column: Output image of a network trained with sparse-dense training images. Third column: Output image of a network trained fully supervised. Fourth column: Ground truth image of size $128\times128$ pixels.
  • Figure 4.2: Visualization of the performance of neural networks trained with different supervision patch sizes. First column: Output image of a network trained with a supervision patch of size $2\times2$ pixels. Second column: Output image of a network trained with a supervision patch of size $8\times8$ pixels. Third column: Output image of a network trained with a supervision patch of size $64\times64$ pixels. Fourth column: Output image of a network trained fully supervised.
  • Figure 4.3: Relationship between the size of the supervision set $B$ and the mean squared error (MSE). Shrinking $B$ increases the error only slightly down to a supervision set size of $4\times4$ pixels. The reported values represent the mean across $5$ different test images.
  • Figure 4.4: Mean squared error (MSE) as a function of the number of training images for a constant total number of supervised pixels. The reported values represent the mean across $5$ different test images.
  • Figure 4.5: Evaluation against patch-wise image restoration. Rows 2 and 4 are zoomed-in versions of the images in rows 1 and 3. First column: Image of measured pixels only with size $64\times64$ pixels. Second column: Output image of a network trained supervised on patches of size $8\times 8$. Third column: Output image of a network trained with a supervision patch of size $8\times8$ pixels. Fourth column: Ground truth image of size $128\times128$.
  • ...and 1 more figure
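To make the image sizes in the captions concrete, the following sketch builds one possible sparse-dense sampling pattern and counts the acquired pixels. It assumes that the $64\times64$ measured image corresponds to every second pixel of the $128\times128$ grid in each direction and that the $8\times8$ supervision patch is densely sampled at the top-left corner; neither the stride nor the patch position is stated here, and the helper `sparse_dense_masks` is hypothetical.

```python
import numpy as np

def sparse_dense_masks(size=128, stride=2, patch=8, top_left=(0, 0)):
    """Return boolean masks for the sparse grid and the dense supervision patch.

    All defaults are assumptions for illustration, not values confirmed
    by the captions beyond the image and patch sizes.
    """
    mask_omega = np.zeros((size, size), dtype=bool)
    mask_omega[::stride, ::stride] = True           # sparse grid over the whole image

    mask_b = np.zeros((size, size), dtype=bool)
    r, c = top_left
    mask_b[r:r + patch, c:c + patch] = True         # densely sampled supervision patch

    return mask_omega, mask_b

mask_omega, mask_b = sparse_dense_masks()
acquired = mask_omega | mask_b                      # pixels that must be measured

n_sparse = int(mask_omega.sum())                    # 64 * 64 sparse-grid pixels
n_supervised = int(mask_b.sum())                    # 8 * 8 supervised pixels
n_acquired = int(acquired.sum())
fraction = n_acquired / acquired.size               # share of the full 128 x 128 grid
```

Under these assumptions, the supervision patch adds only the 48 pixels that are not already on the sparse grid, so the acquisition cost stays close to that of the sparse scan alone.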

Theorems & Definitions (8)

  • Proposition 2.1: Recovery guarantee for SSDU
  • Example 3.3: Translation invariance
  • Lemma 3.4
  • proof
  • Theorem 3.7: Locally-supervised global image restoration
  • proof
  • Corollary 3.10: Locally-supervised global image restoration
  • proof
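The equivariance property that the results above rely on can be checked numerically for a simple case. The sketch below is not taken from the paper: it uses a $3\times3$ filter with periodic boundary conditions as a stand-in for a translation-equivariant map and circular shifts as the translation group, both of which are assumptions made for illustration.

```python
import numpy as np

def circular_filter3x3(img, kernel):
    """3x3 cross-correlation with periodic (wrap-around) boundaries."""
    out = np.zeros_like(img, dtype=float)
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            # np.roll by (-di, -dj) places img[i + di, j + dj] at position (i, j)
            out += kernel[di + 1, dj + 1] * np.roll(img, shift=(-di, -dj), axis=(0, 1))
    return out

rng = np.random.default_rng(0)
img = rng.random((32, 32))
kernel = rng.random((3, 3))
shift = (5, -3)                                      # arbitrary translation

# Equivariance: filtering a shifted image equals shifting the filtered image.
shift_then_filter = circular_filter3x3(np.roll(img, shift, axis=(0, 1)), kernel)
filter_then_shift = np.roll(circular_filter3x3(img, kernel), shift, axis=(0, 1))
is_equivariant = bool(np.allclose(shift_then_filter, filter_then_shift))

# Contrast: multiplying by a fixed mask does not commute with translation.
mask = np.zeros((32, 32))
mask[:8, :8] = 1
mask_commutes = bool(np.allclose(mask * np.roll(img, shift, axis=(0, 1)),
                                 np.roll(mask * img, shift, axis=(0, 1))))
```

The contrast case shows why the restriction to a fixed region is not itself equivariant, whereas a convolution-like map is.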