Dynamic Angle Selection in X-Ray CT: A Reinforcement Learning Approach to Optimal Stopping

Tianyuan Wang, Felix Lucka, Daniël M. Pelt, K. Joost Batenburg, Tristan van Leeuwen

TL;DR

This work addresses adaptive sparse-angle X-ray CT by embedding optimal stopping within sequential optimal experimental design (sOED) and reinforcement learning. It introduces a terminal policy within an Actor-Critic framework to jointly optimize informative angle selection and scan termination, using PSNR-based rewards and a per-step cost, and validates the approach on both synthetic and experimental CT data. Results show improved efficiency and robustness over baselines. They demonstrate successful sim-to-real transfer, while the remaining gaps motivate more realistic simulation and an extension to 3D. The approach is a step toward fully adaptive, cost-aware CT scanning in industrial environments, where sparse-angle tomography could be deployed with dynamic stopping.
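
To make the trade-off between reconstruction quality and scan length concrete, the following is a minimal sketch of a PSNR-based return with a per-step cost. The function names `psnr` and `episode_return`, the choice of the reference maximum as the peak value, and the linear form of the cost are assumptions made for illustration; the paper's exact reward definition is not given in this summary.

```python
import numpy as np


def psnr(reference, estimate):
    """Peak signal-to-noise ratio in dB.

    Assumption: the peak value is the maximum of the reference image.
    """
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0.0:
        return float("inf")
    peak = reference.max()
    return 10.0 * np.log10(peak ** 2 / mse)


def episode_return(reference, final_estimate, num_angles, step_cost):
    """Hypothetical return: final PSNR minus a fixed cost per acquired angle."""
    return psnr(reference, final_estimate) - step_cost * num_angles
```

Under this assumed form, acquiring one more angle is only worthwhile when the PSNR gain it brings exceeds `step_cost`, which is the kind of balance a stopping policy has to learn.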

Abstract

In industrial X-ray Computed Tomography (CT), the need for rapid in-line inspection is critical. Sparse-angle tomography plays a significant role in this by reducing the required number of projections, thereby accelerating processing and conserving resources. Most existing methods aim to balance reconstruction quality and scanning time, typically relying on fixed scan durations. Adaptive adjustment of the number of angles is essential; for instance, more angles may be required for objects with complex geometries or noisier projections. The concept of optimal stopping, which dynamically adjusts this balance according to varying industrial needs, remains overlooked. Building on our previous work, we integrate optimal stopping into sequential Optimal Experimental Design (sOED) and Reinforcement Learning (RL). We propose a novel method for computing the policy gradient within the Actor-Critic framework, enabling the development of adaptive policies for informative angle selection and scan termination. Additionally, we investigate the gap between simulation and real-world applications in the context of the developed learning-based method. Our trained model, developed using synthetic data, demonstrates reliable performance when applied to experimental X-ray CT data. This approach enhances the flexibility of CT operations and expands the applicability of sparse-angle tomography in industrial settings.

Paper Structure

This paper contains 28 sections, 27 equations, 11 figures, 4 tables, 2 algorithms.

Figures (11)

  • Figure 1: Triangle phantom example. (a) Forward process: noisy projections are generated by applying the Radon transform in a parallel-beam geometry over 180 angles and adding $5\%$ Gaussian noise. Inverse process: the image is reconstructed from these noisy projections. (b) PSNR as a function of the number of angles for the triangle phantom, with the number of angles increasing in increments of one. The orange curve represents angles selected via exhaustive search, while the blue curve corresponds to uniformly spaced angles.
  • Figure 2: This figure illustrates how the reinforcement learning workflow maps onto the sOED framework. At the $k$th step, the action space comprises the possible values of the design parameter $\theta_{k}$. Once $\theta_{k}$ is chosen, the belief state is updated to the reconstructed estimate $\widehat{\boldsymbol{x}}_{k+1}$, inferred from all prior projections $\boldsymbol{y}(\{\theta_{1},\ldots,\theta_{k}\})$. The reward $r_{k}$, representing reconstruction accuracy, is then computed.
  • Figure 3: Workflows for the two stopping strategies. (a) Naive stopping: after selecting angles $\{\theta_1,\dots,\theta_{k-1}\}$ and acquiring projections $\boldsymbol{y}(\boldsymbol{\theta})$, the reconstruction $\widehat{\boldsymbol{x}}_k$ becomes the belief state. A shared encoder then outputs (i) the state-value estimate $\widehat{V}(\widehat{\boldsymbol{x}}_k;\boldsymbol{w}_v)$ and (ii) a distribution over all angles plus a terminal action $\theta_{\max}$. (b) Terminal-policy stopping: the encoder additionally outputs a probability distribution over termination versus continuation, and the top branch instead estimates the continuation value function $\widehat{V}_{C}(\widehat{\boldsymbol{x}}_{k}; \boldsymbol{w}_{v})$. In both workflows, a termination signal halts the process.
  • Figure 4: The figure shows nine samples from the synthetic dataset, including parallelograms, triangles, and pentagons. Each shape type is represented by three samples.
  • Figure 5: Experimental scanning setup at the FleX-ray laboratory. The X-ray source (left) and flat-panel detector (right) remain fixed, while the object is positioned on a paper cup atop the rotation stage centered between them. During acquisition, the stage rotates to capture projections from multiple angles.
  • ...and 6 more figures
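
The two stopping workflows in the Figure 3 caption can be sketched as a single decision step each. This is an illustrative sketch, not the authors' implementation: the function names, the use of raw logits as inputs, and the convention that the terminal action is the last entry of the naive action vector are all assumptions. In practice the stop probability and logits would come from the shared encoder applied to the current reconstruction.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    z = np.asarray(logits, dtype=float)
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()


def naive_stopping_step(logits_with_terminal, rng):
    """Naive stopping: one distribution over all angles plus a terminal action.

    Assumption: the terminal action is stored as the last entry.
    Returns (stop, angle_index); angle_index is None when stopping.
    """
    probs = softmax(logits_with_terminal)
    action = int(rng.choice(len(probs), p=probs))
    stop = action == len(probs) - 1
    return stop, (None if stop else action)


def terminal_policy_step(stop_prob, angle_logits, rng):
    """Terminal-policy stopping: first decide stop vs. continue, then pick an angle.

    Returns (stop, angle_index); angle_index is None when stopping.
    """
    if rng.random() < stop_prob:
        return True, None
    probs = softmax(angle_logits)
    return False, int(rng.choice(len(probs), p=probs))


# Usage with placeholder values standing in for network outputs.
rng = np.random.default_rng(0)
stop, angle = terminal_policy_step(stop_prob=0.1, angle_logits=np.zeros(180), rng=rng)
```

Separating the stop decision from the angle choice means the termination probability is no longer competing with 180 angle logits inside one softmax, which is one plausible reason to prefer the terminal-policy design.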