Interlaced dynamic XCT reconstruction with spatio-temporal implicit neural representations

Mathias Boulanger, Ericmoore Jossou

TL;DR

This work tackles the ill-posed problem of time-resolved XCT under interlaced acquisition by using spatio-temporal implicit neural representations (INRs) conditioned through INCODE, a framework that incorporates prior knowledge. The authors formulate a dynamic reconstruction objective that combines data fidelity with spatial and temporal total variation (TV) regularization, and solve it with an ADMM loop that decouples INR training from the projection-heavy updates. Empirically, the INR-based approach outperforms TIMBIR, a state-of-the-art model-based iterative method, across varying angular sparsity, spatial complexity, and noise levels, with additional robustness coming from noise-aware data terms and explicit modeling of detector non-idealities. The framework is extended toward practical deployment by modeling ring artifacts and by batching axial slices for scalable, parallelizable 4D reconstruction, pointing toward real-time, high-resolution dynamic XCT with modular components and open data and code for reproducibility.

Abstract

In this work, we investigate the use of spatio-temporal Implicit Neural Representations (INRs) for dynamic X-ray computed tomography (XCT) reconstruction under interlaced acquisition schemes. The proposed approach combines ADMM-based optimization with INCODE, a conditioning framework incorporating prior knowledge, to enable efficient convergence. We evaluate our method under diverse acquisition scenarios, varying the severity of global undersampling, spatial complexity (quantified via spatial information), and noise levels. Across all settings, our model achieves strong performance and outperforms Time-Interlaced Model-Based Iterative Reconstruction (TIMBIR), a state-of-the-art model-based iterative method. In particular, we show that the inductive bias of the INR provides good robustness to moderate noise levels, and that introducing explicit noise modeling through a weighted least squares data fidelity term significantly improves performance in more challenging regimes. The final part of this work explores extensions toward a practical reconstruction framework. We demonstrate the modularity of our approach by explicitly modeling detector non-idealities, incorporating ring artifact correction directly within the reconstruction process. Additionally, we present a proof-of-concept 4D volumetric reconstruction by jointly optimizing over batched axial slices, an approach that opens up the possibility of massive parallelization, a critical feature for processing large-scale datasets.
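As a rough illustration of the kind of objective the abstract describes (a weighted least squares data fidelity term plus spatial and temporal TV penalties), the sketch below evaluates such a cost on a frame sequence of shape (T, H, W). It is not the authors' implementation: the anisotropic form of TV, the function names, the weights `lam_s` and `lam_t`, and the `forward` callable standing in for the projection operator are all assumptions made for the example.

```python
import numpy as np

def spatial_tv(x):
    """Anisotropic spatial TV, summed over all frames of x with shape (T, H, W)."""
    dy = np.abs(np.diff(x, axis=1)).sum()  # differences between vertical neighbours
    dx = np.abs(np.diff(x, axis=2)).sum()  # differences between horizontal neighbours
    return dy + dx

def temporal_tv(x):
    """TV along the time axis: differences between consecutive frames."""
    return np.abs(np.diff(x, axis=0)).sum()

def wls_fidelity(pred, meas, weights):
    """Weighted least squares misfit, 0.5 * sum(w * (pred - meas)^2)."""
    return 0.5 * np.sum(weights * (pred - meas) ** 2)

def objective(x, forward, meas, weights, lam_s, lam_t):
    """Data fidelity plus spatial and temporal TV penalties.

    `forward` is a hypothetical placeholder for the tomographic projection
    operator; any callable mapping frames to measurements works here.
    """
    pred = forward(x)
    return (wls_fidelity(pred, meas, weights)
            + lam_s * spatial_tv(x)
            + lam_t * temporal_tv(x))

# Toy usage: two 3x3 frames, identity "projector", unit weights.
x = np.zeros((2, 3, 3))
x[1, 1, 1] = 1.0
cost = objective(x, lambda v: v, np.zeros_like(x), np.ones_like(x),
                 lam_s=0.1, lam_t=0.2)
```

Setting all weights to one recovers an ordinary least squares data term; with a weighted term, less reliable measurements can be given smaller weights.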

Paper Structure

This paper contains 26 sections, 29 equations, 8 figures, 3 tables, 1 algorithm.

Figures (8)

  • Figure 1: (a) Visual representation of interlaced acquisition for $K = 4$ and $N_{\theta}=16$. In practice, it is the sample that rotates. (b) Illustration of interlaced view sampling for different values of $K$ and $N_{\theta}=16$. From left to right: $K = 1$, $K = 2$, $K = 4$ and $K = 16$.
  • Figure 2: Network architecture. The reconstruction pipeline begins with an ADMM pass, where the input sinograms are used to compute an updated sequence of image estimates $\mathbf{X}$. These estimates are then combined with the Lagrange multipliers to form the intermediate variable $\mathbf{Z}$, which is subsequently downsampled. The INR optimization phase then begins. A coarse evaluation grid $\mathcal{G}$, perturbed with random offsets $\epsilon$, is used to sample the coordinates and evaluate the neural model's output. An illustration of the INR architecture is provided, explicitly incorporating the INCODE module. Standard neural network layers are shown in gray. At each evaluation step, the output image generated by the INR is compared to the downsampled version of $\mathbf{Z}$. The loss is computed from this comparison, along with spatial and temporal regularization terms. Once the model parameters are updated, the auxiliary variables $\mathbf{Q}$, which represent the INR outputs, and the dual variables $\mathbf{U}$ are also updated. This iterative cycle continues until convergence.
  • Figure 3: 3D phase-field simulation of spinodal decomposition performed using a semi-implicit spectral method. (a) Initial and (b) final microstructure. The simulation domain consists of a $64 \times 64 \times 64$ grid, with a constant mobility $M$ of 1 and an initial homogeneous composition $c$ of 0.5.
  • Figure 4: Reconstruction results under varying acquisition settings. 1 | Full sampling with $N_{\theta} = 256$, 2 | Moderate undersampling with $N_{\theta} = 128$, 3 | Severe undersampling with $N_{\theta} = 64$. (a) Ground truth, (b) FBP reconstruction, (c) TIMBIR reconstruction, (d) Our method. Although $K = 16$ time frames were reconstructed, only every second frame is shown to reduce visual clutter.
  • Figure 5: Impact of spatial complexity on reconstruction performance. 1 | Case 1, 2 | Case 2. (a) Ground truth, (b) FBP reconstruction, (c) TIMBIR reconstruction, (d) Our method. Although $K = 16$ time intervals were reconstructed, we only show the first and last ones to reduce visual clutter.
  • ...and 3 more figures
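
The interlaced view sampling illustrated in Figure 1 can be sketched in a few lines. The example below assumes the bit-reversal ordering used by TIMBIR-style interlacing, with $N_{\theta}$ distinct angles over $[0, \pi)$ split into $K$ sub-frames and $K$ a power of two; the exact scheme used in the paper may differ, and the function names are hypothetical.

```python
import numpy as np

def bit_reverse(value, n_bits):
    """Reverse the lowest n_bits bits of a non-negative integer."""
    out = 0
    for _ in range(n_bits):
        out = (out << 1) | (value & 1)
        value >>= 1
    return out

def interlaced_view_indices(n_theta, k):
    """Split n_theta angular indices into k interlaced sub-frames.

    Assumption: sub-frame s takes every k-th index, starting at the
    bit-reversed value of s, so successive sub-frames fill the angular gaps.
    """
    if k < 1 or (k & (k - 1)) != 0 or n_theta % k != 0:
        raise ValueError("k must be a power of two that divides n_theta")
    n_bits = k.bit_length() - 1          # log2(k)
    views_per_subframe = n_theta // k
    return [[bit_reverse(s, n_bits) + k * j for j in range(views_per_subframe)]
            for s in range(k)]

def interlaced_angles(n_theta, k):
    """Angles in radians over [0, pi), one row per sub-frame."""
    return np.pi * np.asarray(interlaced_view_indices(n_theta, k)) / n_theta

# The setting of Figure 1(a): K = 4 sub-frames over N_theta = 16 angles.
indices = interlaced_view_indices(16, 4)
# indices == [[0, 4, 8, 12], [2, 6, 10, 14], [1, 5, 9, 13], [3, 7, 11, 15]]
```

With $K = 1$ this reduces to a conventional sweep over all 16 angles, and with $K = 16$ each sub-frame holds a single view, matching the two extremes shown in Figure 1(b).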