Conditioning Generative Latent Optimization for Sparse-View CT Image Reconstruction

Thomas Braure, Delphine Lazaro, David Hateau, Vincent Brandon, Kévin Ginsburger

TL;DR

This work introduces conditioning Generative Latent Optimization (cGLO), an unsupervised reconstruction framework for sparse-view CT that combines the reparameterization used by Deep Image Prior (DIP) with the latent-space optimization of Generative Latent Optimization (GLO). cGLO reconstructs multiple slices jointly using a shared decoder and per-slice latent codes, with optional unsupervised pretraining to initialize the decoder and strengthen the prior. In experiments on LIDC and LDCT, cGLO outperforms DIP and filtered back-projection (FBP) when no training data are used, and remains competitive with or superior to conditioning-based diffusion approaches when prior data are available, particularly in preserving structural integrity (SSIM). The method is flexible, requires only the forward model, and can be extended to other ill-posed imaging inverse problems (IIPs) or to multi-task settings such as joint reconstruction and segmentation.

Abstract

Computed Tomography (CT) is a prominent example of an Imaging Inverse Problem (IIP), highlighting the unrivaled performance of data-driven methods in degraded measurement setups such as sparse X-ray projections. Although a significant proportion of deep learning approaches benefit from large supervised datasets, they cannot generalize to new experimental setups. In contrast, fully unsupervised techniques, most notably those using score-based generative models, have recently demonstrated similar or better performance than supervised approaches while remaining flexible at test time. However, their use cases are limited because they need considerable amounts of training data to generalize well. Another unsupervised approach, Deep Image Prior (DIP), takes advantage of the implicit natural bias of deep convolutional networks and has recently been adapted to solve sparse CT by reparameterizing the reconstruction problem. Although this methodology does not require any training dataset, it enforces a weaker prior on the reconstructions than data-driven methods do. To fill the gap between these two strategies, we propose an unsupervised conditional approach to the Generative Latent Optimization framework (cGLO). Like DIP, cGLO benefits from the structural bias of a decoder network without any training dataset. However, the prior is further reinforced by a likelihood objective shared between multiple slices that are reconstructed simultaneously through the same decoder network. In addition, the parameters of the decoder may be initialized on an unsupervised, and possibly very small, training dataset to enhance the reconstruction. The resulting approach is tested on full-dose sparse-view CT using multiple training dataset sizes and varying numbers of viewing angles.
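
The joint optimization described in the abstract can be illustrated with a deliberately simplified toy. The sketch below is not the authors' implementation; it rests on several assumptions: the decoder is a single linear map `W` standing in for the convolutional decoder network, the forward model `A` is a random matrix standing in for the sparse-view projection operator, and the optimizer is plain gradient descent with a step-halving safeguard. The names `cglo_toy` and `data_loss` are hypothetical. What it keeps from the description is the structure: one latent code per slice (columns of `Z`), one decoder shared by all slices, and a single data-fidelity objective evaluated in measurement space.

```python
import numpy as np


def data_loss(W, Z, A, Y):
    """Squared measurement error summed over all slices, plus the residual."""
    R = A @ (W @ Z) - Y
    return float(np.sum(R ** 2)), R


def cglo_toy(Y, A, latent_dim=4, steps=300, lr=1e-2, seed=0):
    """Toy joint fit of a shared linear decoder W and per-slice codes Z.

    Y : (n_measurements, n_slices) measurements, one column per slice.
    A : (n_measurements, n_pixels) forward model (assumed known).
    Returns the reconstructed slices (n_pixels, n_slices) and the loss history.
    """
    rng = np.random.default_rng(seed)
    W = 0.1 * rng.standard_normal((A.shape[1], latent_dim))  # shared decoder
    Z = 0.1 * rng.standard_normal((latent_dim, Y.shape[1]))  # per-slice codes
    loss, R = data_loss(W, Z, A, Y)
    history = [loss]
    for _ in range(steps):
        G = 2.0 * A.T @ R                   # gradient w.r.t. the images W @ Z
        grad_W, grad_Z = G @ Z.T, W.T @ G   # chain rule through the decoder
        step = lr
        for _ in range(30):                 # halve the step until loss drops
            W_new, Z_new = W - step * grad_W, Z - step * grad_Z
            new_loss, new_R = data_loss(W_new, Z_new, A, Y)
            if new_loss < loss:
                W, Z, loss, R = W_new, Z_new, new_loss, new_R
                break
            step *= 0.5
        history.append(loss)
    return W @ Z, history


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.standard_normal((16, 64)) / 8.0   # 16 measurements of 64 pixels
    X_true = rng.standard_normal((64, 4)) @ rng.standard_normal((4, 8))
    X_hat, history = cglo_toy(A @ X_true, A)
    print(f"loss: {history[0]:.3f} -> {history[-1]:.3f}")
```

Because every slice is decoded through the same `W`, the gradient of the decoder accumulates information from all slices at once, which is the mechanism the abstract credits for a stronger prior than a per-image reparameterization. In this toy the problem is underdetermined (16 measurements per 64-pixel slice), so a small measurement loss does not by itself imply an accurate reconstruction.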

Paper Structure (12 sections, 20 equations, 9 figures, 5 tables)

Figures (9)

  • Figure 1: DCGAN-like generator architecture. The latent dimension (512) and final resolution (512$\times$512) shown correspond to the experiments on the LDCT sub-datasets.
  • Figure 2: Examples of reconstructions given 9, 23 and 50 experimental viewing angles, obtained with FBP (upper row), DIP (middle row) and cGLO (lower row). Reconstructions are achieved without prior unsupervised training.
  • Figure 3: Median PSNR and SSIM curves for reconstructions of slices from the LIDC (a) and the LDCT (b) test sets given 9, 23 and 50 experimental viewing angles. Each column is associated with one of the 2%, 10% and 35% training sub-datasets presented in Table \ref{tab:data}.
  • Figure 4: Examples of reconstructions given 9, 23 and 50 experimental viewing angles, obtained with cGLO (upper row), cSGM (middle row) and MCG (lower row). Methods are trained on the sub-dataset consisting of a 2% portion of the LIDC dataset.
  • Figure 5: Parallel beam geometry profile acquisition for one viewing angle.
  • ...and 4 more figures