Fan-Beam CT Reconstruction for Unaligned Sparse-View X-ray Baggage Dataset
Shin Kim
TL;DR
This work addresses the lack of large-scale labeled 3D data for security baggage inspection, where imaging relies on stationary, unaligned sparse-view X-ray systems. It introduces an end-to-end reconstruction pipeline that jointly optimizes linear pushbroom (LPB) camera poses, a multi-spectral neural attenuation field, and a color-coding network, so that RGB views can be rendered from color-mapped multi-energy data without raw multi-energy measurements. A cuboid labeling procedure generates 3D labels from 2D bounding boxes via a visual hull and rotating calipers. Experiments on a nine-view baggage dataset show improved novel-view synthesis and robustness to calibration noise; future work targets faster rendering with Gaussian splatting for real-time deployment.
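To make the cuboid labeling step more concrete, the sketch below fits a minimum-area rectangle to a 2D point set, which is the problem rotating calipers is usually applied to. It is only an illustration under stated assumptions: the input is assumed to be the footprint of a visual hull projected onto a plane, the function names `convex_hull` and `min_area_rect` are hypothetical, and the search simply tries every convex-hull edge rather than advancing calipers incrementally. The two approaches give the same rectangle, because a minimum-area enclosing rectangle always has a side collinear with a hull edge.

```python
import numpy as np

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(map(tuple, points)))
    if len(pts) <= 2:
        return np.array(pts, dtype=float)

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return np.array(lower[:-1] + upper[:-1], dtype=float)

def min_area_rect(points):
    """Return (area, width, height, angle) of the minimum-area enclosing rectangle.

    Assumption: `points` is an (N, 2) footprint, e.g. a projected visual hull.
    Every hull edge is tried as a candidate rectangle side.
    """
    hull = convex_hull(points)
    best = None
    for i in range(len(hull)):
        edge = hull[(i + 1) % len(hull)] - hull[i]
        angle = np.arctan2(edge[1], edge[0])
        c, s = np.cos(angle), np.sin(angle)
        rot = np.array([[c, s], [-s, c]])  # rotates this edge onto the x-axis
        aligned = hull @ rot.T
        width, height = aligned.max(axis=0) - aligned.min(axis=0)
        if best is None or width * height < best[0]:
            best = (width * height, width, height, angle)
    return best
```

In a full labeling pipeline, the rectangle would presumably be extruded along the remaining axis to form the cuboid; that step is not shown here because the summary does not specify it.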
Abstract
Computed Tomography (CT) reconstructs cross-sectional images from X-ray images taken from multiple directions. In CT, hundreds of X-ray images, acquired as the X-ray source and detector rotate around a central axis, are used for precise reconstruction. X-ray imaging is also widely used in security baggage inspection; however, unlike the rotating systems in medical CT, stationary X-ray systems are more common, and publicly available reconstructed data are limited. This makes it challenging to obtain the large-scale 3D labeled data and voxel representations essential for training. To address these limitations, we present a calibration and reconstruction method for an unaligned sparse multi-view X-ray baggage dataset that has extensive 2D labeling. Our approach integrates multi-spectral neural attenuation field reconstruction with pose optimization under a linear pushbroom (LPB) camera model, and improves rendering consistency for novel views through a color-coding network. The method aims to improve generalization in the security baggage inspection domain, where it is particularly challenging.
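As background for the attenuation field mentioned above, the following minimal sketch shows how an X-ray measurement along a single ray is commonly modeled with the Beer-Lambert law, I / I0 = exp(-∫ μ ds), evaluated independently per energy channel. It is not the paper's renderer: the function name `render_transmission`, the uniform sample spacing, and the array layout are assumptions made for illustration, and in a neural attenuation field the `mu_samples` values would come from querying a network at points along the ray.

```python
import numpy as np

def render_transmission(mu_samples, step):
    """Discrete Beer-Lambert rendering along one ray.

    mu_samples: (n_samples, n_channels) attenuation coefficients sampled at
                uniform spacing along the ray (hypothetical layout).
    step:       distance between consecutive samples.
    Returns the transmitted fraction I / I0 for each energy channel.
    """
    mu_samples = np.asarray(mu_samples, dtype=float)
    # Riemann-sum approximation of the line integral of mu along the ray.
    line_integral = mu_samples.sum(axis=0) * step
    return np.exp(-line_integral)

# Usage: 10 samples, 2 energy channels; the second channel sees no attenuation.
mu = np.tile([0.5, 0.0], (10, 1))
transmission = render_transmission(mu, step=0.2)
```

Because the exponent is a sum over samples, the rendered value is differentiable with respect to every sampled coefficient, which is what allows a field of this kind to be fitted to measured images by gradient descent.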
