Dual-view X-ray Detection: Can AI Detect Prohibited Items from Dual-view X-ray Images like Humans?

Renshuai Tao, Haoyu Wang, Yuzhe Guo, Hairong Chen, Li Zhang, Xianglong Liu, Yunchao Wei, Yao Zhao

TL;DR

The Auxiliary-view Enhanced Network (AENet) is a novel detection framework that leverages both the main and auxiliary views of the same object; it generalizes strongly across seven different detection models for X-ray inspection.

Abstract

To detect prohibited items in challenging categories, human inspectors typically rely on images from two distinct views (vertical and side). Can AI detect prohibited items from dual-view X-ray images in the same way humans do? Existing X-ray datasets often suffer from limitations such as single-view imaging or insufficient sample diversity. To address these gaps, we introduce the Large-scale Dual-view X-ray (LDXray) dataset, which consists of 353,646 instances across 12 categories, providing a diverse and comprehensive resource for training and evaluating models. To emulate human intelligence in dual-view detection, we propose the Auxiliary-view Enhanced Network (AENet), a novel detection framework that leverages both the main and auxiliary views of the same object. The main-view pipeline focuses on detecting common categories, while the auxiliary-view pipeline handles more challenging categories using "expert models" learned from the main view. Extensive experiments on the LDXray dataset demonstrate that the dual-view mechanism significantly enhances detection performance, e.g., achieving improvements of up to 24.7% for the challenging category of umbrellas. Furthermore, our results show that AENet exhibits strong generalization across seven different detection models for X-ray inspection.
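The division of labor described in the abstract (main view for common categories, auxiliary view for challenging ones) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Detection` type, the `merge_dual_view` function, the score threshold, and the category name `"knife"` are hypothetical names chosen for illustration, and the simple category routing below stands in for whatever fusion AENet actually performs.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class Detection:
    """One predicted item (hypothetical container, not from the paper)."""
    category: str
    score: float
    view: str  # "main" or "auxiliary"


def merge_dual_view(
    main_dets: List[Detection],
    aux_dets: List[Detection],
    hard_categories: Set[str],
    score_threshold: float = 0.5,  # assumed value, for illustration only
) -> List[Detection]:
    """Route categories between views: common categories are taken from the
    main-view pipeline, challenging ones from the auxiliary-view pipeline."""
    kept = [
        d for d in main_dets
        if d.category not in hard_categories and d.score >= score_threshold
    ]
    kept += [
        d for d in aux_dets
        if d.category in hard_categories and d.score >= score_threshold
    ]
    # Highest-confidence detections first.
    return sorted(kept, key=lambda d: d.score, reverse=True)


main = [Detection("knife", 0.9, "main"), Detection("umbrella", 0.8, "main")]
aux = [Detection("umbrella", 0.7, "auxiliary"), Detection("knife", 0.95, "auxiliary")]
merged = merge_dual_view(main, aux, hard_categories={"umbrella"})
for det in merged:
    print(det.category, det.view, det.score)
```

In this toy example, umbrellas are treated as a challenging category, so only the auxiliary-view umbrella prediction survives, while the knife prediction comes from the main view.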

Paper Structure

This paper contains 20 sections, 14 equations, 6 figures, 6 tables, 1 algorithm.

Figures (6)

  • Figure 1: Illustration of the dual-view X-ray detection task. This paper aims to develop AI capable of reasoning like human experts when detecting prohibited items, leveraging insights from dual-view X-ray images to improve accuracy and interpretative ability.
  • Figure 2: Various categories of objects and corresponding X-ray images from two different views in LDXray. The top row displays natural images of the objects. The middle row shows the X-ray images taken from the vertical view, while the bottom row shows the side view.
  • Figure 3: Various distributions. (a) Instance per Category Distribution. (b) Instance per Image Distribution. (c) Instance Area Density Distribution. (d) Image Area Density Distribution.
  • Figure 4: Image quality comparison of current X-ray datasets. The images of LDXray are superior, attributed to more advanced imaging techniques, enhanced color vibrancy, and higher resolution.
  • Figure 5: Overview of the AENet framework. This architecture consists of two distinct pipelines: the main view, which follows a traditional object detection paradigm, and the auxiliary view, which uses saliency detection and cross-view correspondence for localization.
  • ...and 1 more figure