Semantics-Aware Attention Guidance for Diagnosing Whole Slide Images

Kechun Liu, Wenjun Wu, Joann G. Elmore, Linda G. Shapiro

TL;DR

This paper introduces Semantics-Aware Attention Guidance (SAG), which combines a technique for converting diagnostically relevant entities into attention signals with a flexible attention loss that integrates semantically significant information such as tissue anatomy and cancerous regions.

Abstract

Accurate cancer diagnosis remains a critical challenge in digital pathology, largely due to the gigapixel size and complex spatial relationships present in whole slide images. Traditional multiple instance learning (MIL) methods often struggle with these intricacies, especially in preserving the necessary context for accurate diagnosis. In response, we introduce a novel framework named Semantics-Aware Attention Guidance (SAG), which includes 1) a technique for converting diagnostically relevant entities into attention signals, and 2) a flexible attention loss that efficiently integrates various semantically significant information, such as tissue anatomy and cancerous regions. Our experiments on two distinct cancer datasets demonstrate consistent improvements in accuracy, precision, and recall with two state-of-the-art baseline models. Qualitative analysis further reveals that the incorporation of heuristic guidance enables the model to focus on regions critical for diagnosis. SAG is not only effective for the models discussed here, but its adaptability extends to any attention-based diagnostic model. This opens up exciting possibilities for further improving the accuracy and efficiency of cancer diagnostics.

Paper Structure (11 sections, 6 equations, 4 figures, 1 table)

Figures (4)

  • Figure 1: Visualization of the attention of the baseline model (ScAtNet; Wu et al., 2021) on (a) skin biopsy WSIs in the melanoma dataset and (b) breast biopsy WSIs in the Camelyon16 dataset. Green boxes show examples of the baseline model mistakenly focusing on background regions. The signal and attention values are normalized for visualization purposes.
  • Figure 2: Overview of the SAG approach for improving WSI diagnosis models. First, a high-resolution histopathological image is divided into $p$ non-overlapping patches. Then, patch embeddings are obtained using an off-the-shelf feature extractor $f$. Subsequently, a diagnostic network uses the $p \times e$-dimensional feature map to classify the slide into distinct categories. During training, heuristic guidance ($\mathbf{HG}$) and tissue guidance ($\mathbf{TG}$) supervise the attention within the diagnosis model, steering its focus toward diagnostically relevant regions.
  • Figure 3: Generation of attention guidance: (a) H&E sample image. (b) Tissue segmentation mask. (c) $\mathbf{HG}$ and $\mathbf{TG}$. The values are normalized for visualization purposes. (d) Cellular entities detected (zoom in for best view). (e) Convex hull of cellular clusters. (f) A zoomed-in view of the red boxes in (d) and (e). The convex hull is rendered in red.
  • Figure 4: Comparative visualizations of $\mathbf{HG}$ and the models' attention under SAG training on the melanoma and Camelyon16 datasets. The images are sampled from the test set. The $\mathbf{HG}$ and attention values are normalized for visualization purposes.
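
The captions for Figures 2 and 3 describe turning a pixel-level mask into one guidance value per patch and then using that signal to supervise attention over the $p$ patches. The following is a minimal sketch of that idea, not the authors' implementation. The function names, the foreground-fraction pooling, and the KL-divergence loss are all assumptions made for illustration; this section does not state the paper's actual attention loss.

```python
import numpy as np


def patch_guidance(mask: np.ndarray, patch_size: int) -> np.ndarray:
    """Turn a pixel-level binary mask into one guidance value per patch.

    Each value is the fraction of foreground pixels inside a non-overlapping
    patch_size x patch_size patch; the result is flattened to length p.
    (Foreground-fraction pooling is an assumption, not taken from the paper.)
    """
    rows, cols = mask.shape[0] // patch_size, mask.shape[1] // patch_size
    # Crop so the mask tiles evenly, then view it as a grid of patches.
    cropped = mask[: rows * patch_size, : cols * patch_size].astype(np.float64)
    grid = cropped.reshape(rows, patch_size, cols, patch_size)
    return grid.mean(axis=(1, 3)).reshape(-1)


def to_distribution(signal: np.ndarray) -> np.ndarray:
    """Normalize a non-negative signal to sum to 1 (uniform if all zeros)."""
    total = signal.sum()
    if total <= 0:
        return np.full(signal.shape, 1.0 / signal.size)
    return signal / total


def guidance_loss(attention, guidance, eps: float = 1e-8) -> float:
    """Assumed stand-in loss: KL(guidance || attention) over the p patches."""
    g = to_distribution(np.asarray(guidance, dtype=np.float64))
    a = to_distribution(np.asarray(attention, dtype=np.float64))
    return float(np.sum(g * (np.log(g + eps) - np.log(a + eps))))


def attention_outside_tissue(attention, tissue_guidance) -> float:
    """Share of attention mass that falls on patches containing no tissue."""
    a = to_distribution(np.asarray(attention, dtype=np.float64))
    return float(a[np.asarray(tissue_guidance) == 0].sum())


# Toy 4x4 "slide" with 2x2 patches, so p = 4 (hypothetical data).
toy_mask = np.array([[1, 1, 0, 0],
                     [1, 1, 0, 0],
                     [0, 0, 0, 0],
                     [0, 0, 1, 0]])
hg = patch_guidance(toy_mask, patch_size=2)
uniform_attention = np.full(4, 0.25)

print("per-patch guidance:", hg)
print("loss for uniform attention:", guidance_loss(uniform_attention, hg))
print("attention on empty patches:", attention_outside_tissue(uniform_attention, hg))
```

In this toy setup, attention that matches the guidance gives a loss of zero, while uniform attention is penalized because half of its mass sits on patches with no foreground. In a real training loop a term like this would be added to the classification loss with a weighting factor, which is likewise an assumption here.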