Contrasting Low and High-Resolution Features for HER2 Scoring using Deep Learning

Ekansh Chauhan; Anila Sharma; Amit Sharma; Vikas Nishadham; Asha Ghughtyal; Ankur Kumar; Gurudutt Gupta; Anurag Mehta; C. V. Jawahar; P. K. Vinod

TL;DR

To address inter-observer variability in HER2 IHC scoring and the scarcity of data, the study introduces the IPD-Breast dataset and evaluates three modeling paradigms for HER2 3-way classification: MIL-based patch aggregation, an end-to-end slide-level ConvNeXt, and patch-level classification pipelines. The end-to-end ConvNeXt on low-resolution slides delivers the strongest 3-way performance (AUC 91.79%, F1 83.52%, accuracy 83.56%), while moderate-resolution patch-level methods offer a competitive AUC of around 94% on four-way tasks, highlighting a trade-off between granularity and efficiency. Importantly, increasing resolution did not reliably improve performance, suggesting that aggregate context and patch-level decision fusion are key to accurate slide-level HER2 scoring. The findings support integrating AI-driven HER2 scoring into clinical workflows; the authors plan hospital validation via an API and note cross-scanner variability and labeling ambiguity as open considerations.

Abstract

Breast cancer, the most common malignancy among women, requires precise detection and classification for effective treatment. Immunohistochemistry (IHC) biomarkers such as HER2, ER, and PR are critical for identifying breast cancer subtypes. However, traditional IHC classification relies on pathologists' expertise, making it labor-intensive and subject to significant inter-observer variability. To address these challenges, this study introduces the India Pathology Breast Cancer Dataset (IPD-Breast), comprising 1,272 IHC slides (HER2, ER, and PR) and aimed at automating receptor status classification. The primary focus is on developing predictive models for HER2 3-way classification (0, Low, High) to enhance prognosis. Evaluation of multiple deep learning models revealed that an end-to-end ConvNeXt network using low-resolution IHC images achieved an AUC, F1, and accuracy of 91.79%, 83.52%, and 83.56%, respectively, for 3-way classification, outperforming patch-based methods by over 5.35% in F1 score. This study highlights the potential of simple yet effective deep learning techniques to significantly improve accuracy and reproducibility in breast cancer classification, supporting their integration into clinical workflows for better patient outcomes.
