Efficient Classification of Histopathology Images

Mohammad Iqbal Nouyed, Mary-Anne Hartley, Gianfranco Doretto, Donald A. Adjeroh

TL;DR

This work addresses how to efficiently classify challenging histopathology images, such as gigapixel whole-slide images for cancer diagnostics with image-level annotation, by taking a divide-and-conquer approach.

Abstract

This work addresses how to efficiently classify challenging histopathology images, such as gigapixel whole-slide images for cancer diagnostics with image-level annotation. We use images with annotated tumor regions to identify a set of tumor patches and a set of benign patches in a cancerous slide. Because the size of the region of interest varies, the tumor-positive regions may account for an extreme minority of the pixels. This creates an important problem during patch-level classification, where the majority of patches from an image labeled as 'cancerous' are actually tumor-free. This problem differs from semantic segmentation, which associates a label with every pixel in an image, because after patch extraction we deal only with patch-level labels. Most existing approaches address the data imbalance issue by mitigating the data shortage in minority classes in order to prevent the model from being dominated by the majority classes. These methods include data re-sampling, loss re-weighting, margin modification, and data augmentation. In this work, we mitigate the patch-level class imbalance problem by taking a divide-and-conquer approach. First, we partition the data into sub-groups and define three separate classification sub-problems based on these data partitions. Then, using an information-theoretic cluster-based sampling of deep image patch features, we sample discriminative patches from the sub-groups. Using these sampled patches, we build corresponding deep models to solve the new classification sub-problems. Finally, we integrate the information learned from the respective models to make a final decision on the patches. Our results show that the proposed approach can perform competitively using a very low percentage of the available patches in a given whole-slide image.

Paper Structure (17 sections, 7 equations, 2 figures, 5 tables, 2 algorithms)

Figures (2)

  • Figure 1: Overview of the proposed framework. In the first stage, features of all WSI patches are extracted using a pre-trained model $E$. Then, based on the available annotations, the training data is divided into three subsets, and the feature sets $X_A$, $X_B$, $X_C$ are obtained for the corresponding subsets. Clustering $K$ is performed on each set, followed by a z-score-based cluster sampling strategy. The sampled patches $\{p_1', p_2', \dots, p_N'\}$ are then used to fine-tune three binary classification models, $E_{AvB}$, $E_{AvC}$, and $E_{Av(B+C)}$. The feature or aggregation information from these models is passed to the aggregation function $\rho(\cdot)$ for patch-level aggregation. Finally, this aggregated information is used for patch-level decision fusion by the final classifier $R$.
  • Figure 2: Sample WSI with annotation. The zoomed-in section shows annotated regions in different colors; the '+' signs indicate the boundaries of the extracted $256\times256$ patches.
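
The $256\times256$ patch boundaries mentioned in the Figure 2 caption can be illustrated with a minimal sketch. Only the patch size comes from the caption; the function name `patch_grid`, the non-overlapping stride, and the choice to skip partial patches at the image border are assumptions made for illustration and may differ from the authors' extraction procedure.

```python
from typing import Iterator, Tuple

PATCH_SIZE = 256  # patch side length in pixels, as stated in the Figure 2 caption


def patch_grid(width: int, height: int,
               patch: int = PATCH_SIZE) -> Iterator[Tuple[int, int]]:
    """Yield the (x, y) top-left corner of each full patch.

    Assumption (not from the paper): patches do not overlap, and
    partial patches at the right and bottom borders are skipped.
    """
    for y in range(0, height - patch + 1, patch):
        for x in range(0, width - patch + 1, patch):
            yield (x, y)


# Hypothetical image region of 1024 x 600 pixels: 4 columns x 2 rows of patches.
corners = list(patch_grid(1024, 600))
```

Each corner corresponds to one '+' grid position in a layout like the one shown in the figure; on a real gigapixel slide the same loop would run over the full slide dimensions, typically after filtering out background regions.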