Content-Based Image Retrieval for Multi-Class Volumetric Radiology Images: A Benchmark Study

Farnaz Khun Jush, Steffen Vogler, Tuan Truong, Matthias Lenga

TL;DR

This work addresses content-based image retrieval for 3D volumetric radiology images by introducing a benchmark built on the TotalSegmentator dataset that supports region-based and localized multi-organ retrieval. It combines a scalable vector-indexing pipeline over 2D slice embeddings from several pre-trained models with a ColBERT-inspired late interaction re-ranking step that improves volume-level recall. Across 29 coarse and 104 fine anatomical structures, pre-trained embeddings from self-supervised models and DreamSim-based ensembles achieve high recall; region-based and localized retrieval reach near-perfect recall in many cases, and re-ranking provides notable gains. The study demonstrates the feasibility and utility of a standardized CBIR benchmark for medical imaging, highlights the value of re-ranking for context-aware search, and offers guidance on embedding choices and evaluation metrics for real-world clinical retrieval tasks.

Abstract

While content-based image retrieval (CBIR) has been extensively studied in natural image retrieval, its application to medical images presents ongoing challenges, primarily due to the 3D nature of medical images. Recent studies have shown the potential use of pre-trained vision embeddings for CBIR in the context of radiology image retrieval. However, a benchmark for the retrieval of 3D volumetric medical images is still lacking, hindering the ability to objectively evaluate and compare the efficiency of proposed CBIR approaches in medical imaging. In this study, we extend previous work and establish a benchmark for region-based and localized multi-organ retrieval using the TotalSegmentator dataset (TS) with detailed multi-organ annotations. We benchmark embeddings derived from models pre-trained with supervision on medical images against embeddings derived from models pre-trained without supervision on non-medical images for 29 coarse and 104 detailed anatomical structures at the volume and region levels. For volumetric image retrieval, we adopt a late interaction re-ranking method inspired by text matching. We compare it against the original method proposed for volume and region retrieval and achieve a retrieval recall of 1.0 for diverse anatomical regions with a wide size range. The findings and methodologies presented in this paper provide insights and benchmarks for further development and evaluation of CBIR approaches in the context of medical imaging.

Paper Structure

This paper contains 35 sections, 7 equations, 12 figures, and 12 tables.

Figures (12)

  • Figure 1: Overview of a retrieval system based on [jush2023medical]. Step 1: 2D slices are extracted from the 3D volumes. Step 2: Feature extractors compute embeddings for the database slices and the query volumes. Step 3: Database embeddings are indexed using HNSW or LSH. Step 4: Search and slice retrieval are performed, and a hit-table is saved; for each query volume or region, the hit-table records how often each volume-id occurs together with the sum of its scores. Step 5: The slice retrieval results are aggregated to retrieve the final volume.
  • Figure 2: Volume-based retrieval: For a query volume $V_q$ covering a range of anatomical regions, a volume is retrieved that should cover the same anatomical regions. The similarity search is based on all slices from the query volume.
  • Figure 3: Region-based retrieval. Anatomical regions are considered individually. A sub-volume constrained to an anatomical region of interest $r$ is generated and fed to the search system to retrieve a volume containing the anatomical region. A case is considered a True Positive (TP) if the retrieved case contains the region $r$ at some location.
  • Figure 4: Localized retrieval. Anatomical regions are considered individually. A sub-volume constrained to the anatomical region of interest $r$ is generated and fed to the search system to retrieve a volume containing the same anatomical region. A case is considered a True Positive (TP) only if at least one of the slices in the retrieved volume contains the region $r$.
  • Figure 5: Overview of re-ranking. Step 1: Candidate volumes are selected by keeping every volume with at least one similar slice. Step 2: Similarity scores are computed as dot products between the normalized embedding matrices. Step 3: Max-pooling and summation determine the top-scoring volumes for retrieval.
  • ...and 7 more figures
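
The hit-table of Figure 1 and the re-ranking steps of Figure 5 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`build_hit_table`, `late_interaction_score`, `rerank`), the toy 2-dimensional embeddings, and the volume ids are all hypothetical, and it uses plain lists instead of an HNSW or LSH index. The scoring follows the captions only in outline: embeddings are L2-normalized, every query slice is compared with every candidate slice by dot product, the maximum is taken per query slice, and the maxima are summed.

```python
import math
from collections import defaultdict


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def build_hit_table(slice_hits):
    """Aggregate slice-level hits into per-volume counts and summed scores.

    slice_hits: iterable of (volume_id, score) pairs, one per retrieved slice.
    Returns {volume_id: (hit_count, total_score)}.
    """
    table = defaultdict(lambda: [0, 0.0])
    for volume_id, score in slice_hits:
        table[volume_id][0] += 1
        table[volume_id][1] += score
    return {vid: (count, total) for vid, (count, total) in table.items()}


def late_interaction_score(query_slices, candidate_slices):
    """Late-interaction score between two sets of slice embeddings.

    For each query slice, take the maximum dot product over all normalized
    candidate slices (max-pooling), then sum these maxima over query slices.
    """
    q = [l2_normalize(v) for v in query_slices]
    c = [l2_normalize(v) for v in candidate_slices]
    return sum(
        max(sum(a * b for a, b in zip(qv, cv)) for cv in c)
        for qv in q
    )


def rerank(query_slices, candidates):
    """Order candidate volume ids by late-interaction score, best first.

    candidates: {volume_id: list of slice embeddings}; in this sketch it is
    assumed to hold only volumes that already passed the filtering step.
    """
    scores = {
        vid: late_interaction_score(query_slices, emb)
        for vid, emb in candidates.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy data (hypothetical): slice hits for one query, then re-ranking.
hit_table = build_hit_table([("vol_a", 0.9), ("vol_b", 0.8), ("vol_a", 0.7)])
query = [[1.0, 0.0], [0.0, 1.0]]
candidates = {
    "vol_a": [[1.0, 0.0], [0.0, 1.0]],  # matches both query slices
    "vol_b": [[1.0, 0.0], [1.0, 0.0]],  # matches only the first query slice
}
ranking = rerank(query, candidates)
```

In this toy setup the candidate that matches both query slices ranks above the one that matches only one, which is the behaviour late interaction is meant to capture: a volume is rewarded for covering every part of the query rather than for a few very strong slice matches.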