Reconstruct Anything Model: a lightweight general model for computational imaging

Matthieu Terris, Samuel Hurault, Maxime Song, Julian Tachella

Abstract

Most existing learning-based methods for solving imaging inverse problems can be roughly divided into two classes: iterative algorithms, such as plug-and-play and diffusion methods leveraging pretrained denoisers, and unrolled architectures that are trained end-to-end for specific imaging problems. Iterative methods in the first class are computationally costly and often yield suboptimal reconstruction performance, whereas unrolled architectures are generally problem-specific and require expensive training. In this work, we propose a novel non-iterative, lightweight architecture that incorporates knowledge about the forward operator (acquisition physics and noise parameters) without relying on unrolling. Our model is trained to solve a wide range of inverse problems, such as deblurring, magnetic resonance imaging, computed tomography, inpainting, and super-resolution, and handles arbitrary image sizes and channels, such as grayscale, complex, and color data. The proposed model can be easily adapted to unseen inverse problems or datasets with a few fine-tuning steps (up to a few images) in a self-supervised way, without ground-truth references. Through a series of experiments, we demonstrate state-of-the-art performance from medical imaging to low-photon imaging and microscopy. Our code is available at https://github.com/matthieutrs/ram.

Paper Structure

This paper contains 54 sections, 17 equations, 22 figures, 9 tables, 2 algorithms.

Figures (22)

  • Figure 1: The proposed Reconstruct Anything Model (RAM) can solve a wide variety of inverse problems, from medical imaging to microscopy and low-photon imaging, obtaining state-of-the-art zero-shot performance on imaging problems and datasets in or close to the training distribution. RAM can also be fine-tuned in a self-supervised way (without any ground-truth references) for images and/or problems that differ strongly from the training distribution.
  • Figure 2: Proposed architecture for solving non-blind imaging inverse problems. Top row: The architecture builds upon a DRUNet backbone, originally designed with convolutional and residual blocks, but is enhanced to integrate knowledge about the measurement operator $\boldsymbol{A}$ and measurements $\boldsymbol{y}$. Bottom row: At each scale, feature maps are decoded into the image domain, processed through a Krylov subspace module (KSM), and then re-encoded. The encoding/decoding module consists of a simple residual convolutional block. The KSM blocks concatenate power iterations of the scaled measurement operator $\boldsymbol{A}_s^\top \boldsymbol{A}_s$, enabling efficient and adaptable processing for a wide range of inverse problems.
  • Figure 3: Multiscale conditioning. Estimating the underlying image (here, under motion blur) is easier on a coarse grid than on a fine grid, because the inverse problem is more ill-posed in the latter case.
  • Figure 4: Deblurring results. Top row: motion blur hard, on a CBSD68 sample. Bottom row: Gaussian blur medium, on a DIV2K sample.
  • Figure 5: Blind deblurring. Results on a real motion blur (Kohler dataset).
  • ...and 17 more figures
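
The Figure 2 caption says the KSM blocks concatenate power iterations of $\boldsymbol{A}_s^\top \boldsymbol{A}_s$. The sketch below illustrates that idea on a toy 1D problem and is not the authors' implementation. The function name `krylov_stack`, the inclusion of the back-projection $\boldsymbol{A}^\top \boldsymbol{y}$ in the stack, the `order` parameter, and the scaled inpainting mask used as the operator are all assumptions made for illustration.

```python
from typing import Callable, List

Vector = List[float]
Operator = Callable[[Vector], Vector]


def krylov_stack(x: Vector, y: Vector, forward: Operator,
                 adjoint: Operator, order: int) -> List[Vector]:
    """Collect x, A^T y, and the iterates (A^T A)^k x for k = 1..order.

    Hypothetical helper: in a network, these vectors would be
    concatenated along the channel axis and mixed by a learned layer.
    """
    stack = [list(x), adjoint(y)]  # current estimate and back-projection
    current = list(x)
    for _ in range(order):
        current = adjoint(forward(current))  # one application of A^T A
        stack.append(current)
    return stack


# Toy operator (assumption): a binary inpainting mask scaled by 0.5.
mask = [1.0, 0.0, 1.0, 1.0]
scale = 0.5


def forward(v: Vector) -> Vector:
    """Apply A: scale each entry and zero out the masked ones."""
    return [scale * m * vi for m, vi in zip(mask, v)]


adjoint = forward  # a real diagonal operator is its own adjoint

x = [4.0, 8.0, 2.0, 6.0]            # current image estimate
y = forward([2.0, 2.0, 2.0, 2.0])   # simulated measurements
features = krylov_stack(x, y, forward, adjoint, order=2)

for row in features:
    print(row)
```

Each extra power of $\boldsymbol{A}^\top \boldsymbol{A}$ costs one forward and one adjoint evaluation, so the stack can be built from operator calls alone, with no need to invert the operator.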