Self-Supervised Denoiser Framework

Emilien Valat, Andreas Hauptmann, Ozan Öktem

TL;DR

The paper addresses high-throughput industrial CT, where sinograms are undersampled to shorten scans, by introducing the Self-Supervised Denoiser Framework (SDF). SDF trains an image denoiser in sinogram space: the learning task is to predict one sinogram subset from another, so no ground-truth images are needed. The framework defines a reconstruction operator and sinogram-to-sinogram mappings over sinogram partitions, and it can be interpreted as a sinogram autoencoder whose latent space is the image domain. On 2D and 3D CT datasets, SDF achieves higher PSNR than analytical and other self-supervised methods, serves as an effective pre-training step before supervised fine-tuning on limited data, and scales to CBCT with significant angular and orbital sparsity. The authors position SDF as a candidate building block for foundational CT image-enhancement models that shorten acquisition while preserving image quality, including in few-shot and cross-domain settings.

Abstract

Reconstructing images using Computed Tomography (CT) in an industrial context leads to specific challenges that differ from those encountered in other areas, such as clinical CT. Indeed, non-destructive testing with industrial CT will often involve scanning multiple similar objects while maintaining high throughput, requiring short scanning times, which is not a relevant concern in clinical CT. Undersampling the tomographic data (sinograms) is a natural way to reduce the scanning time at the cost of image quality, since the latter depends on the number of measurements. In such a scenario, post-processing techniques are required to compensate for the image artifacts induced by the sinogram sparsity. We introduce the Self-Supervised Denoiser Framework (SDF), a self-supervised training method that leverages pre-training on highly sampled sinogram data to enhance the quality of images reconstructed from undersampled sinogram data. The main contribution of SDF is to train an image denoiser in the sinogram space by setting the learning task as the prediction of one sinogram subset from another. As such, it does not require ground-truth image data, leverages the sinogram, the abundant data modality in CT, and can drastically enhance the quality of images reconstructed from a fraction of the measurements. We demonstrate that SDF produces better image quality, in terms of peak signal-to-noise ratio, than other analytical and self-supervised frameworks in both 2D fan-beam and 3D cone-beam CT settings. Moreover, we show that the enhancement provided by SDF carries over when fine-tuning the image denoiser on a few examples, making it a suitable pre-training technique in a context where there is little high-quality image data. Our results are established on experimental datasets, making SDF a strong candidate for being the building block of foundational image-enhancement models in CT.
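
The learning task described in the abstract (predict one sinogram subset from another, passing through an image denoiser) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: a random matrix stands in for the CT forward operator, a pseudo-inverse stands in for the reconstruction operator, the denoiser is the identity, the angles are split into interleaved subsets, and the loss is averaged over all ordered pairs of distinct subsets.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins (assumptions, not the paper's operators): a random linear
# map replaces the CT projector; its rows are grouped by projection angle.
n_pix, n_angles, n_det = 16, 12, 8
A = rng.standard_normal((n_angles * n_det, n_pix))
x_true = rng.standard_normal(n_pix)   # used only to simulate measurements
y = A @ x_true                        # full "sinogram", flattened


def partition_angles(n_angles, n_subsets):
    """Interleaved split of the angle indices into disjoint subsets."""
    return [np.arange(k, n_angles, n_subsets) for k in range(n_subsets)]


def rows_for(angles):
    """Rows of A (and entries of y) that belong to the given angles."""
    return (angles[:, None] * n_det + np.arange(n_det)).ravel()


def denoiser(x):
    """Placeholder for the trainable image denoiser (identity here)."""
    return x


def sdf_loss(A, y, subsets, denoiser):
    """Mean over ordered pairs i != j of ||A_j denoiser(pinv(A_i) y_i) - y_j||^2."""
    total, n_pairs = 0.0, 0
    for i, ang_i in enumerate(subsets):
        r_i = rows_for(ang_i)
        # "Encode": reconstruct an image from subset i, then denoise it.
        x_i = denoiser(np.linalg.pinv(A[r_i]) @ y[r_i])
        for j, ang_j in enumerate(subsets):
            if i == j:
                continue
            r_j = rows_for(ang_j)
            # "Decode": project the image onto the angles of subset j.
            residual = A[r_j] @ x_i - y[r_j]
            total += float(residual @ residual)
            n_pairs += 1
    return total / n_pairs


subsets = partition_angles(n_angles, n_subsets=3)
loss = sdf_loss(A, y, subsets, denoiser)
```

In this toy setting each subset still has more measurements than unknowns and the data are noise-free, so the identity denoiser already gives a loss close to zero. With real undersampled CT data the subset reconstruction is not exact, and the denoiser is what has to close that gap.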

Paper Structure

This paper contains 17 sections, 11 equations, 16 figures, 4 tables.

Figures (16)

  • Figure 1: The blue nodes represent sinogram-space data, and the orange nodes represent image-space data. The composition of $\operatorname{\mathcal{A}}^{\dagger}_i$ with $\Lambda_{\theta}$ (denoted by $\operatorname{\mathcal{E}}_i$) and the operator $\operatorname{\mathcal{A}}_j$ (denoted by $\operatorname{\mathcal{D}}_j$) can be seen as the encoding and decoding parts of a sinogram autoencoder that has $X$ (the set of images) as its latent space.
  • Figure 2: Slices of the same object reconstructed from mode1 and mode2 sinograms.
  • Figure 3: Central slice of the same volume reconstructed with FDK run on orbit 1 and with AG run on all orbits.
  • Figure 4: Architecture of the image denoiser $\Lambda_{\theta}$ used in all our experiments. $C(n,m)$ denotes a convolutional layer with $n$ input channels and $m$ output channels, a kernel size of 5, and a padding of 2, followed by a LeakyReLU activation. For 3D CT, $n_{filters} = 8$, and for 2D CT, $n_{filters} = 32$.
  • Figure 5: Close-up comparison of the same slice reconstructed from 240 measurements by SDF (top) and Noise2Inverse (bottom).
  • ...and 11 more figures
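
The layer description in the Figure 4 caption (kernel size 5, padding 2) can be checked with a little arithmetic. The sketch below rests on assumptions the caption does not state: a stride of 1, a bias term in every layer, a single-channel input to the first layer, and an input width of 256. The helper names are hypothetical.

```python
def conv_output_size(size, kernel=5, padding=2, stride=1):
    """Spatial output size of a convolution along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


def conv_param_count(n_in, n_out, kernel=5, ndim=2, bias=True):
    """Number of trainable parameters of one C(n_in, n_out) layer."""
    weights_per_filter = n_in * kernel ** ndim
    return n_out * (weights_per_filter + (1 if bias else 0))


# Kernel 5 with padding 2 keeps the spatial size unchanged at stride 1.
same_size = conv_output_size(256)

# First layer under the single-channel-input assumption:
params_2d = conv_param_count(1, 32, ndim=2)  # 2D CT, n_filters = 32
params_3d = conv_param_count(1, 8, ndim=3)   # 3D CT, n_filters = 8
```

Under these assumptions the layers preserve image size, and the cubic growth of a 3D kernel (125 weights per input channel against 25 in 2D) is one plausible reason for the smaller filter count in the 3D setting.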