ContriMix: Scalable stain color augmentation for domain generalization without domain labels in digital pathology

Tan H. Nguyen; Dinkar Juyal; Jin Li; Aaditya Prakash; Shima Nofallah; Chintan Shah; Sai Chowdary Gullapally; Limin Yu; Michael Griffin; Anand Sampat; John Abel; Justin Lee; Amaro Taylor-Weiner

TL;DR

ContriMix is a domain-label-free stain color augmentation method based on DRIT++, a style-transfer method. It outperforms competing methods on the Camelyon17-WILDS dataset, and a trained ContriMix model can create synthetic images that improve the performance of existing classifiers.

Abstract

Differences in staining and imaging procedures can cause significant color variations in histopathology images, leading to poor generalization when deploying deep-learning models trained on data from a different source. Various color augmentation methods have been proposed to generate synthetic images during training to make models more robust, eliminating the need for stain normalization at test time. Many color augmentation methods leverage domain labels to generate synthetic images. This approach poses three significant challenges to scaling such a model. First, incorporating data from a new domain into deep-learning models trained on existing domain labels is not straightforward. Second, dependency on domain labels prevents the use of pathology images without domain labels to improve model performance. Finally, implementing these methods becomes complicated when multiple domain labels (e.g., patient identification, medical center, etc.) are associated with a single image. We introduce ContriMix, a novel domain-label-free stain color augmentation method based on DRIT++, a style-transfer method. ContriMix leverages sample stain color variation within a training minibatch and random mixing to extract content and attribute information from pathology images. A trained ContriMix model can use this information to create synthetic images that improve the performance of existing classifiers. ContriMix outperforms competing methods on the Camelyon17-WILDS dataset. Its performance is consistent across different slides in the test set while being robust to the color variation from rare substances in pathology images. We make our code and trained ContriMix models available for research use. The code for ContriMix can be found at https://gitlab.com/huutan86/contrimix
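
The random mixing within a minibatch described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `mix_minibatch`, the `combine` callback, the `num_mixes` parameter, and the toy encodings are all hypothetical, and the real method uses learned content and attribute encoders, which are not reproduced here. The sketch only shows the pairing step: each sample's content is combined with attributes drawn at random from the same minibatch, and no domain label is consulted.

```python
import random
from typing import Callable, List, Sequence, Tuple, TypeVar

C = TypeVar("C")    # content encoding type (hypothetical)
A = TypeVar("A")    # attribute encoding type (hypothetical)
Img = TypeVar("Img")  # synthetic image type (hypothetical)


def mix_minibatch(
    contents: Sequence[C],
    attributes: Sequence[A],
    combine: Callable[[C, A], Img],
    num_mixes: int = 2,
    seed: int = 0,
) -> List[List[Img]]:
    """Pair each sample's content with attributes drawn at random from the
    same minibatch and build one synthetic image per pairing.

    Attribute donors are chosen uniformly by index, so no domain labels
    are needed.
    """
    if len(contents) != len(attributes):
        raise ValueError("contents and attributes must have the same length")
    rng = random.Random(seed)
    batch_size = len(contents)
    synthetic = []
    for i in range(batch_size):
        # Pick num_mixes attribute donors from the minibatch (with replacement).
        donors = [rng.randrange(batch_size) for _ in range(num_mixes)]
        synthetic.append([combine(contents[i], attributes[j]) for j in donors])
    return synthetic


# Toy stand-ins (assumptions for illustration only): "content" is a list of
# per-pixel structure values and "attribute" is an RGB tint.
def toy_combine(content: List[float], tint: Tuple[float, float, float]):
    # Scale the tint by the structure value at each pixel.
    return [tuple(c * t for t in tint) for c in content]


contents = [[0.0, 0.5, 1.0], [1.0, 1.0, 0.0]]
attributes = [(0.8, 0.2, 0.6), (0.5, 0.1, 0.9)]
mixed = mix_minibatch(contents, attributes, toy_combine, num_mixes=3, seed=42)
```

In this toy setup, every synthetic image keeps the structure of its own content (zero-valued pixels stay zero) while its color comes from a randomly chosen sample in the batch, which is the property the augmentation relies on.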

Paper Structure (21 sections, 2 equations, 9 figures, 7 tables)

Introduction
Method
Model architecture
Competing methods for color augmentation
Results and Discussion
Dataset
ContriMix training
Benchmarking results
Ablation study - Diversity of training domains
Qualitative evaluation by a board-certified pathologist
ContriMix for Multiple Instance Learning
Limitations
Conclusion
Additional ablation studies
Number of mixes
...and 6 more sections

Figures (9)

Figure 1: A) Overview of ContriMix - Content and attribute encodings are extracted, randomly mixed, and then combined to generate synthetic images without any domain labels. B) Three example content channels from input images. Different channels highlight different features in the tissue images.
Figure 1: Content channels of ContriMix for three different input images. The leftmost column contains the original images. The next three columns show three different content channels. ContriMix learns to encode biological information in different channels. Table \ref{content-channels-table} outlines the biological details in the channels.
Figure 2: A) Histopathology images from different hospitals in Camelyon17-WILDS exhibit significant color variation. B) Performance comparison of DenseNet121 backbones trained with ContriMix augmentation, HistAuGAN 3-domains and 5-domains augmentation on 10 different test slides.
Figure 3: Six false positive patches (left) and six false negative patches (right) from HistAuGAN 3-domains correctly classified by ContriMix.
Figure 3: Different examples of synthetic images generated by ContriMix. ContriMix attribute tensors learn to ignore artifacts (e.g., marker ink, black spots) while the content tensors preserve relevant information without introducing hallucinations. Apart from artifacts, ContriMix is able to account for the presence of background pixels in the input images.
...and 4 more figures
