Classification of Breast Cancer Histopathology Images using a Modified Supervised Contrastive Learning Method

Matina Mahdizadeh Sani, Ali Royat, Mahdieh Soleymani Baghshah

TL;DR

The paper addresses the scarcity of labeled data in breast cancer histopathology image classification with a two-stage learning framework that combines self-supervised pre-training, a modified supervised contrastive loss, and a relaxing mechanism for positive and negative pairs, supported by histopathology-specific augmentations and an EfficientNet-B2 backbone. An auxiliary stain-robust task in the fine-tuning stage encourages robust, color-invariant representations. The method reaches 93.63% image-level accuracy on BreakHis, a 1.45% improvement over the state-of-the-art method. Ablation studies indicate that the supervised contrastive component is the primary performance driver, with the relaxation and self-supervised terms providing additional gains, while domain-specific augmentations improve generalization across magnifications. The method also generalizes to the BACH dataset and shows potential for clinical decision support by stabilizing performance across staining variations and image magnifications.

Abstract

Deep neural networks have achieved remarkable results in medical image processing tasks, particularly in classifying and detecting various diseases. However, when confronted with limited data, these networks are critically vulnerable to overfitting, memorizing the limited information available. This work addresses that challenge by improving the supervised contrastive learning method, leveraging both image-level labels and domain-specific augmentations to enhance model robustness. The approach integrates self-supervised pre-training with a two-stage supervised contrastive learning strategy. In the first stage, we employ a modified supervised contrastive loss that not only focuses on reducing false negatives but also introduces an elimination effect to address false positives. In the second stage, a relaxing mechanism refines positive and negative pairs based on similarity, ensuring that only relevant image representations are aligned. We evaluate our method on the BreakHis dataset, which consists of breast cancer histopathology images, and demonstrate an increase in classification accuracy of 1.45% at the image level compared to the state-of-the-art method. This improvement corresponds to 93.63% absolute accuracy, highlighting the effectiveness of our approach in leveraging properties of the data to learn a more appropriate representation space.
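
As a point of reference for the first stage, the following is a minimal sketch of the standard supervised contrastive loss (Khosla et al., 2020) that the abstract says is modified. It is not the authors' modified loss: the elimination effect for false positives is not reproduced here, and the function names, the temperature value, and the toy embeddings are assumptions made for illustration.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def supcon_loss(embeddings, labels, temperature=0.1):
    """Standard (unmodified) supervised contrastive loss.

    For each anchor i, every other sample with the same label is a positive
    and all other samples form the denominator. The loss is averaged over
    anchors that have at least one positive. The temperature default is an
    illustrative assumption, not a value taken from the paper.
    """
    z = [l2_normalize(v) for v in embeddings]
    n = len(z)
    total, counted = 0.0, 0
    for i in range(n):
        sims = [sum(a * b for a, b in zip(z[i], z[j])) / temperature
                for j in range(n)]
        others = [j for j in range(n) if j != i]
        positives = [j for j in others if labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        # Numerically stable log-sum-exp over all non-anchor samples.
        m = max(sims[j] for j in others)
        log_denom = m + math.log(sum(math.exp(sims[j] - m) for j in others))
        total += -sum(sims[p] - log_denom for p in positives) / len(positives)
        counted += 1
    return total / counted if counted else 0.0


# Toy embeddings: two tight clusters in 2-D.
toy = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
aligned = supcon_loss(toy, [0, 0, 1, 1])    # labels agree with the clusters
misaligned = supcon_loss(toy, [0, 1, 0, 1])  # labels cut across the clusters
```

On the toy data, the loss is small when labels agree with the geometric clusters and large when they cut across them, which is the behavior that pulls same-class representations together during pre-training.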

Paper Structure (23 sections, 10 equations, 3 figures, 4 tables, 1 algorithm)

Figures (3)

  • Figure 1: Data augmentation strategies. An illustration of the data augmentation methods used in this work.
  • Figure 2: Positive pairs. This figure depicts an anchor image alongside one of its corresponding positive pairs. The pair is created through a combination of the data augmentations introduced in Figure 1, each applied with a specific probability. The first row shows the anchor image, while the second row shows the corresponding positive pair.
  • Figure 3: Network architecture. The architecture consists of three training stages, starting from the weights of EfficientNet pre-trained on histopathology images. The pre-training phase involves training on the BreakHis dataset using the modified supervised contrastive learning method. Subsequently, in the relaxing phase, the similarity between the image representations is calculated to correct the positive pairs, followed by retraining with the updated set of pairs. The final stage is fine-tuning, which involves an auxiliary task and a classification task trained with a supervised approach. Here, the representation of the anchor image is denoted as $z_i$, while $z_p$ and $z_q$ are the representations of a positive pair and a negative pair, respectively. $\hat{y}$ and $\hat{W} = (\hat{w}_1, \hat{w}_2, \hat{w}_3, \hat{w}_4, \hat{w}_5, \hat{w}_6)$ are the predicted label and the conversion matrix for the input image.
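
The relaxing phase described in the Figure 3 caption can be sketched roughly as follows. This is a hedged illustration only: the caption says that similarity between representations is used to correct the positive pairs, but it does not give the rule, so the cosine-similarity threshold, the function names, and the toy vectors below are all assumptions rather than the authors' procedure.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def relax_positive_pairs(embeddings, labels, threshold=0.5):
    """Keep a same-label pair as positive only if its representations are similar.

    The threshold rule and its default value are hypothetical; the source
    only states that pairs are refined based on similarity.
    Returns a set of index pairs (i, j) with i < j.
    """
    kept = set()
    n = len(embeddings)
    for i in range(n):
        for j in range(i + 1, n):
            same_label = labels[i] == labels[j]
            if same_label and cosine(embeddings[i], embeddings[j]) >= threshold:
                kept.add((i, j))
    return kept


# Three samples that share a label; the third lies far from the other two.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
toy_labels = [0, 0, 0]
pairs = relax_positive_pairs(toy_embeddings, toy_labels)
```

In this toy case only the pair of the first two samples survives, so a retraining pass would align those two representations and stop forcing the dissimilar third sample toward them.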