Segmentation, Classification and Interpretation of Breast Cancer Medical Images using Human-in-the-Loop Machine Learning

David Vázquez-Lema, Eduardo Mosqueira-Rey, Elena Hernández-Pereira, Carlos Fernández-Lozano, Fernando Seara-Romera, Jorge Pombo-Otero

TL;DR

The paper investigates how Human-in-the-Loop (HITL) strategies can enhance machine learning for breast cancer analysis by integrating genomic data with Whole Slide Imaging (WSI). It develops three tasks (segmentation, genomic-subtype classification, and interpretation) and shows that pathologist input improves segmentation and explainability but not necessarily classification performance. The study employs a Deep Multi-Magnification Network for segmentation, pretrained CNNs for classification, and LIME, SHAP, and Grad-CAM for interpretation, coupled with Bayesian optimization driven by expert feedback. The findings indicate that HITL can improve transparency and debugging in complex medical imaging tasks, but they also reveal limitations due to data scarcity and the intrinsic complexity of genomic-WSI signals, calling for larger datasets and further methodological development.

Abstract

This paper explores the application of Human-in-the-Loop (HITL) strategies in training machine learning models in the medical domain. In this case, a doctor-in-the-loop approach is proposed to leverage human expertise in dealing with large and complex data. Specifically, the paper deals with the integration of genomic data and Whole Slide Imaging (WSI) analysis of breast cancer. Three different tasks were developed: segmentation of histopathological images, classification of these images according to the genomic subtype of the cancer and, finally, interpretation of the machine learning results. The involvement of a pathologist helped us to develop a better segmentation model and to enhance the explanatory capabilities of the models, but the classification results were suboptimal, highlighting the limitations of this approach: despite involving human experts, complex domains can still pose challenges, and a HITL approach may not always be effective.

Paper Structure (30 sections, 12 figures, 7 tables)

This paper contains 30 sections, 12 figures, 7 tables.

Figures (12)

  • Figure 1: (a) Percentage of downloaded images belonging to each cancer type (b) Percentage of the final dataset images belonging to each cancer type.
  • Figure 2: Classification of HITL methods.
  • Figure 4: (a) Segmented image obtained by the network superposed over the original one and (b) original Whole Slide Image (WSI).
  • Figure 5: Corrections proposed by the pathologist (left) and segmented image after applying the pathologist's corrections (right).
  • Figure 6: Architecture of the Xception model. The numbers in each block represent the kernel size, the number of filters in the current block and the stride.
  • ...and 7 more figures