Fan-Beam CT Reconstruction for Unaligned Sparse-View X-ray Baggage Dataset

Shin Kim

TL;DR

This work tackles the lack of large-scale, labeled 3D data for security baggage inspection with stationary, unaligned sparse-view X-ray imaging. It introduces an end-to-end reconstruction pipeline that jointly optimizes linear pushbroom (LPB) camera poses, a multi-spectral neural attenuation field, and a color-coding network to render RGB views from color-mapped multi-energy data without requiring raw multi-energy measurements. A cuboid labeling procedure turns existing 2D bounding boxes into 3D labels via a visual hull and rotating calipers. Experimental results on a nine-view baggage dataset demonstrate improved novel-view synthesis and robustness to calibration noise. Future work targets faster rendering with Gaussian splatting for real-time deployment.
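As a rough illustration of the cuboid labeling idea, the sketch below fits a minimum-area rectangle to a set of 2D points, such as visual-hull points projected onto a ground plane. That projection step is an assumption here, not something the summary specifies. For brevity the code tests every convex-hull edge direction directly. Rotating calipers finds the same rectangle faster by sweeping the hull once. The function names `convex_hull` and `min_area_rect` are hypothetical.

```python
import math


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def min_area_rect(points):
    """Return (area, angle, width, height) of the minimum-area enclosing rectangle.

    Relies on the fact that an optimal rectangle has one side collinear
    with a convex-hull edge, so only hull edge directions are tested.
    """
    hull = convex_hull(points)
    if len(hull) < 3:
        raise ValueError("need at least three non-collinear points")
    best = None
    n = len(hull)
    for i in range(n):
        (x0, y0), (x1, y1) = hull[i], hull[(i + 1) % n]
        theta = math.atan2(y1 - y0, x1 - x0)
        c, s = math.cos(theta), math.sin(theta)
        # Coordinates of the hull in a frame aligned with the current edge.
        us = [x * c + y * s for x, y in hull]
        vs = [-x * s + y * c for x, y in hull]
        width, height = max(us) - min(us), max(vs) - min(vs)
        if best is None or width * height < best[0]:
            best = (width * height, theta, width, height)
    return best
```

Under the same assumption, the rectangle could be extruded between the minimum and maximum height of the hull points to give a cuboid label.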

Abstract

Computed Tomography (CT) is a technology that reconstructs cross-sectional images using X-ray images taken from multiple directions. In CT, hundreds of X-ray images, acquired as the X-ray source and detector rotate around a central axis, are used for precise reconstruction. In security baggage inspection, X-ray imaging is also widely used; however, unlike the rotating systems in medical CT, stationary X-ray systems are more common, and publicly available reconstructed data are limited. This makes it challenging to obtain the large-scale 3D labeled data and voxel representations essential for training. To address these limitations, our study presents a calibration and reconstruction method using an unaligned sparse multi-view X-ray baggage dataset, which has extensive 2D labeling. Our approach integrates multi-spectral neural attenuation field reconstruction with linear pushbroom (LPB) camera model pose optimization, enhancing rendering consistency for novel views through a color-coding network. Our method aims to improve generalization within the security baggage inspection domain, where generalization is particularly challenging.
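To make the idea of rendering from an attenuation field concrete, here is a minimal sketch of the Beer-Lambert line integral that such renderers typically approximate. It is an illustration under stated assumptions, not the paper's implementation: the learned field is replaced by a hypothetical analytic function `sphere_mu`, only a single energy channel is shown, and `render_transmission` is an invented name.

```python
import math


def render_transmission(mu, origin, direction, t_near, t_far, n_samples=256):
    """Approximate exp(-integral of mu along the ray) with the midpoint rule.

    mu is any callable mapping a 3D point to a non-negative attenuation value.
    """
    norm = math.sqrt(sum(d * d for d in direction))
    unit = [d / norm for d in direction]
    dt = (t_far - t_near) / n_samples
    line_integral = 0.0
    for k in range(n_samples):
        t = t_near + (k + 0.5) * dt
        point = tuple(o + t * u for o, u in zip(origin, unit))
        line_integral += mu(point) * dt
    return math.exp(-line_integral)


def sphere_mu(point, radius=1.0, value=0.5):
    """Hypothetical stand-in for a learned field: a ball of uniform attenuation."""
    return value if sum(c * c for c in point) <= radius * radius else 0.0
```

A multi-spectral variant would evaluate one attenuation value per energy bin along the same ray. How those channels are mapped to RGB is the job of the color-coding network and is not modeled here.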

Paper Structure

This paper contains 11 sections, 7 equations, 4 figures, 1 table, 1 algorithm.

Figures (4)

  • Figure 1: (a) Dual view X-ray imaging system by Smiths Detection [smithsdetection], (b) The 9-view X-ray system employed to acquire the multi-view security baggage dataset by SSTlab [sstlabs], (d) A multi-energy X-ray image and a color coding function utilized to generate a color image. Image adapted from [10005308].
  • Figure 2: Our overall pipeline: (a) LPB camera pose initialization from feature matching. (b) Training stage, where we jointly train multi-spectral neural attenuation field and color coding network. (c) In the post-processing stage, cuboid annotations are generated from 2D bounding box labels.
  • Figure 3: Comparison of images without and with the use of the color coding network.
  • Figure 4: (a) Visualization of the neural attenuation field at n=10, (b) rendered image trained at n=10, (c) rendered image trained at n=1, (d) Material-color mismatching observed in the ground truth image due to ambiguity arising from the viewing angle.