Improving the Generalisation of Learned Reconstruction Frameworks
Emilien Valat, Ozan Öktem
TL;DR
The paper tackles the generalization challenge in learned CT reconstruction by representing the sinogram as a graph that encodes the acquisition geometry. It introduces GLM, a hybrid network that combines grid-based CNN filtering with graph-based aggregation, preserving locality in the detector dimension while capturing geometry-dependent relationships across source positions. Empirically, GLM achieves higher SSIM and PSNR than CNNs while using only a fraction of the trainable parameters, generalizes robustly to unseen acquisition geometries, including sparse-view scenarios, and requires less training time and memory. These results suggest that geometry-aware graph representations coupled with targeted aggregation can substantially improve the robustness and efficiency of learned reconstruction in CT, with potential extensions to more complex geometries and weighting schemes.
Abstract
Ensuring proper generalization is a critical challenge in applying data-driven methods to inverse problems in imaging, as neural networks that reconstruct an image must perform well across varied datasets and acquisition geometries. In X-ray Computed Tomography (CT), convolutional neural networks (CNNs) are widely used to filter the projection data but are ill-suited for this task, as they apply grid-based convolutions to the sinogram, which inherently lies on a line manifold, not a regular grid. The CNNs, unaware of the geometry, are implicitly tied to it and require an excessive number of parameters, as they must infer the relations between measurements from the data rather than from prior information. The contribution of this paper is twofold. First, we introduce a graph data structure to represent CT acquisition geometries and tomographic data, providing a detailed explanation of the graph's structure for circular cone-beam geometries. Second, we propose GLM, a hybrid neural network architecture that leverages both graph and grid convolutions to process tomographic data. We demonstrate that GLM outperforms CNNs when performance is quantified in terms of structural similarity and peak signal-to-noise ratio, even though GLM uses only a fraction of the trainable parameters. Compared to CNNs, GLM also requires significantly less training time and memory, and its memory requirements scale better. Crucially, GLM demonstrates robust generalization to unseen variations in the acquisition geometry, for example when it is trained only on fully sampled CT data and then tested on sparse-view CT data.
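
To make the idea of combining grid and graph operations more concrete, the following is a minimal NumPy sketch, not the GLM architecture itself. It assumes a 2D sinogram of shape (views, detector pixels) acquired on a circular orbit, treats each source position as a graph node, and weights edges with a Gaussian of the angular distance between sources. The function names (`angular_adjacency`, `filter_detector`, `hybrid_layer`), the Gaussian weighting, and the `mix` parameter are all hypothetical choices made for illustration.

```python
import numpy as np

def angular_adjacency(angles, sigma=None):
    """Row-normalised edge weights between source positions on a circular orbit.

    Assumption: weights decay as a Gaussian of the circular angular distance.
    Requires at least two views.
    """
    angles = np.mod(np.asarray(angles, dtype=float), 2 * np.pi)
    if sigma is None:
        # Assumed default: scale the kernel with the mean angular spacing.
        sigma = 2 * np.pi / len(angles)
    diff = np.abs(angles[:, None] - angles[None, :])
    dist = np.minimum(diff, 2 * np.pi - diff)   # shortest way around the circle
    weights = np.exp(-(dist / sigma) ** 2)
    np.fill_diagonal(weights, 0.0)              # no self-loops
    return weights / weights.sum(axis=1, keepdims=True)

def filter_detector(sinogram, kernel):
    """Grid step: the same 1D convolution along the detector axis of every view."""
    return np.stack([np.convolve(row, kernel, mode="same") for row in sinogram])

def hybrid_layer(sinogram, angles, kernel, mix=0.5):
    """Grid filtering followed by graph aggregation across source positions."""
    filtered = filter_detector(sinogram, kernel)
    aggregated = angular_adjacency(angles) @ filtered
    return (1.0 - mix) * filtered + mix * aggregated

# The same kernel is applied to a dense and a sparse-view acquisition.
rng = np.random.default_rng(0)
dense_angles = np.linspace(0.0, 2 * np.pi, 16, endpoint=False)
dense_sino = rng.normal(size=(16, 32))
kernel = np.array([0.25, 0.5, 0.25])

dense_out = hybrid_layer(dense_sino, dense_angles, kernel)
sparse_out = hybrid_layer(dense_sino[::4], dense_angles[::4], kernel)
print(dense_out.shape, sparse_out.shape)  # (16, 32) (4, 32)
```

In this sketch the only "learnable" quantity would be the detector kernel, whose size does not depend on the number of views. The relations between views come from the angles rather than from the data, which is the intuition behind why a geometry-aware representation can be applied to an acquisition geometry it was not trained on.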
