
Fast Large-Scale Model-Based Iterative Tomography via Exploiting Mathematical Structure, Hierarchical Optimization, Smart Initialization, and Distributed GPU Computing

Dinesh Kumar, Jeffrey Donatelli

Abstract

Model-Based Iterative Reconstruction (MBIR) is important because direct methods, such as Filtered Back-Projection (FBP), can introduce significant noise and artifacts in sparse-angle tomography, especially for time-evolving samples. Although MBIR produces high-quality reconstructions through prior-informed optimization, its computational cost has traditionally limited its adoption. In previous work, we addressed this limitation by expressing the Radon transform and its adjoint using non-uniform fast Fourier transforms (NUFFTs), which reduces computational complexity relative to conventional projection-based methods, and by using a multi-GPU system for parallel processing. In this work, we further accelerate our Fourier-domain framework with four main strategies: (1) a reformulation of the MBIR forward and adjoint operators that exploits their multi-level Toeplitz structure for efficient Fourier-domain computation; (2) an improved initialization strategy that uses back-projected data filtered with a standard ramp filter as the starting estimate; (3) a hierarchical multi-resolution reconstruction approach that first solves the problem on coarse grids and progressively transitions to finer grids using Lanczos interpolation; and (4) a distributed-memory implementation using MPI that enables near-linear scaling on large high-performance computing (HPC) systems. Together, these innovations significantly reduce iteration counts, improve parallel efficiency, and make high-quality MBIR reconstruction practical for large-scale tomographic imaging. These advances open the door to near-real-time MBIR for applications such as in situ, in operando, and time-evolving experiments.
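
To illustrate why Toeplitz structure helps in strategy (1), the following minimal sketch multiplies a one-dimensional, real-valued Toeplitz matrix by a vector in $O(n \log n)$ time by embedding it in a circulant matrix and applying FFTs. It is not the authors' implementation: the paper's operators are multi-level Toeplitz and run on GPUs, and the function names `toeplitz_matvec` and `dense_toeplitz` are hypothetical names chosen for this example.

```python
import numpy as np

def toeplitz_matvec(first_col, first_row, x):
    """Compute T @ x for a real n-by-n Toeplitz matrix T in O(n log n).

    T is defined by its first column and first row (first_row[0] is ignored
    in favour of first_col[0]). Assumes real-valued inputs.
    """
    n = len(x)
    # First column of the (2n-1)-point circulant matrix that embeds T:
    # [c_0, ..., c_{n-1}, r_{n-1}, ..., r_1]
    circ = np.concatenate([first_col, first_row[:0:-1]])
    x_pad = np.concatenate([x, np.zeros(n - 1)])
    # A circulant matvec is a circular convolution, i.e. a product in Fourier space.
    y = np.fft.ifft(np.fft.fft(circ) * np.fft.fft(x_pad))
    return y[:n].real

def dense_toeplitz(first_col, first_row):
    """Build the dense Toeplitz matrix; used only as an O(n^2) reference."""
    n = len(first_col)
    i, j = np.indices((n, n))
    return np.where(i >= j, first_col[i - j], first_row[j - i])
```

The dense reference is useful only for checking small cases; the FFT of `circ` can be computed once and reused across iterations, which is where most of the savings in an iterative solver would be expected to come from.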

Paper Structure

This paper contains 13 sections, 16 equations, 5 figures.

Figures (5)

  • Figure 1: Nano-CT projection data from the Tomobank were used for benchmarking. (a) A projection image from the tomogram. (b) A slice of the reconstructed volume using MBIR.
  • Figure 2: Runtime comparison between the direct method (A) and the Toeplitz-based method (B) for gradient and loss evaluation, as well as the relative difference (C) between the two methods, as a function of image width. The optimized Toeplitz implementation consistently reduces computation time while maintaining high numerical accuracy. The left y-axis represents runtime, and the right y-axis represents the relative difference between the direct and Toeplitz-based methods.
  • Figure 3: With the ramp-filtered back-projection initialization, the initial residual is approximately a factor of two lower than with constant initialization, which reduces the optimization effort by 20–25 iterations.
  • Figure 4: Convergence behavior of the multi-resolution reconstruction strategy. The coarse-to-fine approach reduces computational cost at the highest resolution by providing improved initial estimates at each refinement stage. In this example, three hierarchical grid levels are used, starting from a grid of size $N/4 \times N/4$ and doubling the resolution at each stage. Numbers next to arrows indicate the number of iterations at each resolution.
  • Figure 5: Performance of the distributed MPI implementation on the National Energy Research Scientific Computing Center's Perlmutter supercomputer for a reconstruction volume of size $2048 \times 2447 \times 2447$.
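
The coarse-to-fine schedule described in the Figure 4 caption can be sketched as follows. This is an illustrative outline under stated assumptions, not the paper's code: `solve_level` is a hypothetical callback standing in for the per-level MBIR solver, the grid width is assumed to be divisible by $2^{\text{levels}-1}$, and nearest-neighbour upsampling is swapped in for the Lanczos interpolation named in the abstract to keep the example dependency-free.

```python
import numpy as np

def grid_schedule(n_full, levels=3):
    """Grid widths from coarsest to finest, doubling at each level."""
    return [n_full // 2 ** (levels - 1 - k) for k in range(levels)]

def upsample2x(img):
    """Nearest-neighbour 2x upsampling (a simple stand-in for Lanczos)."""
    return np.kron(img, np.ones((2, 2)))

def coarse_to_fine(n_full, solve_level, levels=3):
    """Run a solver on successively finer grids, warm-starting each level.

    solve_level(init, n) is a hypothetical callback that refines the
    n-by-n estimate `init` and returns the updated n-by-n estimate.
    """
    estimate = None
    for n in grid_schedule(n_full, levels):
        # The coarsest level starts from zeros; finer levels start from
        # the upsampled result of the previous level.
        init = np.zeros((n, n)) if estimate is None else upsample2x(estimate)
        estimate = solve_level(init, n)
    return estimate
```

For a full-resolution width of 2048 and three levels, `grid_schedule(2048)` yields widths 512, 1024, and 2048, matching the $N/4 \to N/2 \to N$ progression in the caption.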