Advancing Histopathology with Deep Learning Under Data Scarcity: A Decade in Review

Ahmad Obeid, Said Boumaraf, Anabia Sohail, Taimur Hassan, Sajid Javed, Jorge Dias, Mohammed Bennamoun, Naoufel Werghi

TL;DR

A comprehensive review of deep learning applications in histopathology, with a focus on the challenges posed by data scarcity over the past decade and the potential for future advancements in this field.

Abstract

Recent years have witnessed remarkable progress in computational histopathology, largely fueled by deep learning. This has brought the clinical adoption of deep learning-based tools within reach, promising significant benefits to healthcare: a valuable second opinion on diagnoses, streamlined handling of complex tasks, and reduced risk of inconsistency and bias in clinical decisions. However, a well-known challenge is that deep learning models may contain up to billions of parameters; supervising their training effectively requires vast labeled datasets to achieve reliable generalization and noise resilience. In medical imaging, particularly histopathology, amassing such extensive labeled data collections places additional demands on clinicians and incurs higher costs, which hinders the field's progress. To address this challenge, researchers have devised various strategies for leveraging deep learning with limited data and annotation availability. In this paper, we present a comprehensive review of deep learning applications in histopathology, with a focus on the challenges posed by data scarcity over the past decade. We systematically categorize and compare various approaches, evaluate their distinct contributions using benchmarking tables, and highlight their respective advantages and limitations. Additionally, we address gaps in existing reviews and identify underexplored research opportunities, underscoring the potential for future advancements in this field.

Paper Structure

This paper contains 57 sections, 4 equations, 13 figures, 7 tables.

Figures (13)

  • Figure 1: A pictorial illustration of CPath. Upper: the typical CPath workflow, involving scanning, staining, ROI extraction, patch extraction, and collecting a bag of multiple instances. Lower: different cancer biomarkers where DL can be used. From left to right: stroma-poor vs stroma-rich lung squamous cell carcinoma groups; high-TMB vs low-TMB regions; low vs high tumor infiltration; mitotic cells at different mitotic phases; normal vs cancerous glandular structures highlighted; microsatellite stable vs high instability (adapted from MSI3).
  • Figure 2: Summary of categories of deep learning tools addressing data scarcity in the literature of computational histopathology. Categories are indicated with boxes. The main theme in each category is indicated with bullet points.
  • Figure 3: Search Results. (a) The hit count per search term, with Elsevier and IEEE journals given as examples. (b) Progression of the raw record count in several scientific databases. An exponential trend is observed in scarcity-oriented works, in contrast to the linear trend in general works.
  • Figure 4: Illustration of weak supervisory signals handled within the MIL setting in WSIs. In (a), the classification case involves analysis at the patch level, producing $\hat{y}_{patch}$, whereas only the slide label $Y$ is given, and the real $y_{patch}$ is inaccessible or inaccurate. In (b), the segmentation task involves pixel-level prediction $\hat{y}_{pixel}$, while the given annotation is rough or partial.
  • Figure 5: Chronological Trend Line of the Most Relevant MIL-based Techniques. The length of each arrow is proportional to the technique's popularity within its category. Most classical methods were published before 2020, whereas most recent works are attention-based.
  • ...and 8 more figures