Exploiting Structural Consistency of Chest Anatomy for Unsupervised Anomaly Detection in Radiography Images

Tiange Xiang, Yixiao Zhang, Yongyi Lu, Alan Yuille, Chaoyi Zhang, Weidong Cai, Zongwei Zhou

TL;DR

This work tackles unsupervised anomaly detection in chest radiographs by exploiting anatomical regularities from standardized imaging. It introduces SimSID, a framework that combines a space-aware memory matrix, hierarchical memory, and a feature-level in-painting block within a teacher–student generator paradigm, guided by a discriminator. The approach yields significant AUC gains on ZhangLab, COVIDx, and CheXpert, and proves robust to abnormal data in training, illustrating practical applicability for annotation-free radiography analysis. By focusing on semantic reconstruction of normal anatomy rather than pixel-level fidelity, SimSID achieves strong performance while offering faster training and inference relative to prior memory-based methods, with potential impact on radiology workflows and automated screening.

Abstract

Radiography imaging protocols focus on particular body regions, therefore producing images of great similarity and yielding recurrent anatomical structures across patients. Exploiting this structured information could ease the detection of anomalies in radiography images. To this end, we propose a Simple Space-Aware Memory Matrix for In-painting and Detecting anomalies from radiography images (abbreviated as SimSID). We formulate anomaly detection as an image reconstruction task that is solved with a space-aware memory matrix and an in-painting block operating in the feature space. During training, SimSID can taxonomize the ingrained anatomical structures into recurrent visual patterns, and at inference, it can identify anomalies (unseen/modified visual patterns) in the test image. SimSID surpasses the state of the art in unsupervised anomaly detection by +8.0%, +5.0%, and +9.9% AUC scores on the ZhangLab, COVIDx, and CheXpert benchmark datasets, respectively. Code: https://github.com/MrGiovanni/SimSID

Paper Structure (19 sections, 6 equations, 9 figures, 2 tables)

This paper contains 19 sections, 6 equations, 9 figures, 2 tables.

Figures (9)

  • Figure 1: Anomaly detection in radiography images can be both easier and harder than in photographic images. It is easier because radiography images are spatially structured due to consistent imaging protocols. It is harder because anomalies are subtle and require medical expertise to annotate. We contribute a novel anomaly detection method (SimSID) that directly exploits the structured information in radiography images.
  • Figure 2: SimSID overview. We divide an input image into $N\times N$ non-overlapping patches and feed them into the encoder for feature extraction. Two generators are trained to reconstruct the original image. Along with the reconstruction, a dictionary of anatomical patterns is created and updated dynamically via a novel space-aware memory matrix (§\ref{sec:queue}). The teacher generator directly uses the features extracted by the encoder; the student generator uses the features augmented by a new feature in-painting block (§\ref{sec:inpaint}). The teacher and student generators are coupled through a knowledge distillation paradigm. We employ a discriminator to assess whether the image reconstructed by the student generator is real or fake. Once trained, the discriminator can be used to detect anomalies in test images (§\ref{sec:alert}).
  • Figure 3: Space-aware memory. To encode location information uniquely, each patch can access only a specific set of tokens in the memory.
  • Figure 4: SimSID architecture. SimSID consists of an encoder, a student generator, a teacher generator, and a discriminator. All of the networks are built from plain convolution, batch normalization, and ReLU activation layers. Given an input image, we first divide it into non-overlapping patches. The encoder then extracts the patch features. The student and teacher generators are constructed identically, except that additional Memory Matrices are placed in the student generator. The discriminator is constructed in a more lightweight style. Note that images are discriminated at their full resolution rather than in patches.
  • Figure 5: Two-step workflow of the in-painting block. (a) Each non-overlapping patch feature $\mathbf{z}$ queries a unique region of the Memory Matrix, and the most similar items are assembled into $\mathbf{\hat{z}}$. (b) Each center patch feature $\mathbf{z}$ and its eight neighbors $\mathbf{\hat{z}}$ are used as the query and the key/value, respectively, of a Transformer layer for in-painting. During training, the Memory Matrix is updated through optimization via backpropagation.
  • ...and 4 more figures
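
The patch split, space-aware memory lookup, and neighbor-based in-painting described in the captions for Figures 2, 3, and 5 can be illustrated with a toy sketch. This is not the authors' implementation (see the linked repository for that). The function names, the cosine-similarity ranking, the mean over the top-k tokens, and the projection-free single-head attention are all assumptions chosen to keep the example short and dependency-free.

```python
import math

def split_into_patches(image, n):
    """Split a square 2D list into n x n non-overlapping patches (row-major order)."""
    size = len(image)
    if size % n != 0 or any(len(row) != size for row in image):
        raise ValueError("image must be square with a side divisible by n")
    p = size // n
    return [
        [row[c * p:(c + 1) * p] for row in image[r * p:(r + 1) * p]]
        for r in range(n)
        for c in range(n)
    ]

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def space_aware_lookup(z, memory, patch_index, top_k=2):
    """Build z_hat from the memory tokens reserved for this patch position only.

    Assumption: tokens are ranked by cosine similarity and the top_k are averaged.
    """
    tokens = memory[patch_index]  # a patch never sees tokens of other positions
    best = sorted(tokens, key=lambda m: cosine(z, m), reverse=True)[:top_k]
    return [sum(m[d] for m in best) / len(best) for d in range(len(z))]

def inpaint_center(z, neighbor_z_hats):
    """Simplified stand-in for the Transformer layer: the center feature z is the
    query, the neighbors' z_hat are keys and values (no learned projections)."""
    scale = math.sqrt(len(z))
    scores = [sum(q * k for q, k in zip(z, key)) / scale for key in neighbor_z_hats]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # numerically stable softmax
    total = sum(exps)
    weights = [e / total for e in exps]
    return [
        sum(w * v[d] for w, v in zip(weights, neighbor_z_hats))
        for d in range(len(z))
    ]

if __name__ == "__main__":
    image = [[r * 4 + c for c in range(4)] for r in range(4)]
    patches = split_into_patches(image, 2)
    print(len(patches), "patches; first patch:", patches[0])

    # Hypothetical memory: one small token list per patch position.
    memory = {0: [[1, 0], [0, 1], [1, 1]], 1: [[5, 5], [-1, 0]]}
    z = [1.0, 0.1]
    print("position 0:", space_aware_lookup(z, memory, 0, top_k=1))
    print("position 1:", space_aware_lookup(z, memory, 1, top_k=1))
```

In the demo, the same feature `z` retrieves different tokens depending on its patch position, which is the location-specific behavior the space-aware memory is meant to provide. In the paper's setting the features would come from the encoder and the memory would be learned by backpropagation; here both are hand-written placeholders.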