Supporting Mitosis Detection AI Training with Inter-Observer Eye-Gaze Consistencies

Hongyan Gu, Zihan Yan, Ayesha Alvi, Brandon Day, Chunxu Yang, Zida Wu, Shino Magaki, Mohammad Haeri, Xiang 'Anthony' Chen

TL;DR

This work proposes using inter-observer eye-gaze consistency as a cost-effective source of training labels for mitosis detection in pathology. By aggregating fixations across groups of participants and extracting centroids from heatmap hotspots, the authors generate eye-gaze labels that guide CNN training (EfficientNet-b3) through a two-iteration active-learning and GradCAM++-based localization pipeline. Compared with heuristic color-based labels and ground-truth annotations, CNNs trained on eye-gaze labels closely approach ground-truth performance and significantly outperform the heuristic baseline, with a notable improvement in precision as the group size increases to $k=14$. The study demonstrates a practical, non-disruptive data collection approach that could generalize to other medical imaging tasks, although a recall gap relative to expert-labeled data remains and the approach still needs validation with pathologist participants.

Abstract

The expansion of artificial intelligence (AI) in pathology tasks has intensified the demand for doctors' annotations in AI development. However, collecting high-quality annotations from doctors is costly and time-consuming, creating a bottleneck in AI progress. This study investigates eye-tracking as a cost-effective technology to collect doctors' behavioral data for AI training, with a focus on the pathology task of mitosis detection. One major challenge in using eye-gaze data is the low signal-to-noise ratio, which hinders the extraction of meaningful information. We tackled this by leveraging the properties of inter-observer eye-gaze consistencies and creating eye-gaze labels from consistent eye fixations shared by a group of observers. Our study involved 14 non-medical participants, from whom we collected eye-gaze data and generated eye-gaze labels based on varying group sizes. We assessed the efficacy of such eye-gaze labels by training Convolutional Neural Networks (CNNs) and comparing their performance to those trained with ground-truth annotations and a heuristic-based baseline. Results indicated that CNNs trained with our eye-gaze labels closely followed the performance of ground-truth-based CNNs and significantly outperformed the baseline. Although primarily focused on mitosis, we envision that insights from this study can be generalized to other medical imaging tasks.
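
The idea of keeping only fixations shared by a group of observers can be illustrated with a minimal sketch. It is not the authors' implementation: the grid-cell binning, the `cell` size, the `min_observers` threshold, and the function name `gaze_labels` are all assumptions made for illustration, and the paper's actual heatmap processing may differ. The sketch counts how many distinct observers fixated each grid cell, keeps cells reached by at least `min_observers` of them, groups adjacent cells into hotspots, and returns one centroid per hotspot.

```python
from collections import deque


def gaze_labels(fixations_by_observer, width, height, cell=32, min_observers=3):
    """Return (x, y) centroids of regions fixated by >= min_observers observers.

    fixations_by_observer maps an observer id to a list of (x, y) fixations.
    The grid binning and the threshold are illustrative assumptions.
    """
    cols = (width + cell - 1) // cell
    rows = (height + cell - 1) // cell

    # observers[r][c] holds the ids of observers with a fixation in that cell,
    # so repeated fixations by one person do not count as agreement.
    observers = [[set() for _ in range(cols)] for _ in range(rows)]
    for obs_id, fixations in fixations_by_observer.items():
        for x, y in fixations:
            if 0 <= x < width and 0 <= y < height:
                observers[int(y) // cell][int(x) // cell].add(obs_id)

    hot = [[len(observers[r][c]) >= min_observers for c in range(cols)]
           for r in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    labels = []

    for r in range(rows):
        for c in range(cols):
            if not hot[r][c] or seen[r][c]:
                continue
            # Flood-fill one 4-connected hotspot.
            queue = deque([(r, c)])
            seen[r][c] = True
            cells = []
            while queue:
                cr, cc = queue.popleft()
                cells.append((cr, cc))
                for nr, nc in ((cr + 1, cc), (cr - 1, cc), (cr, cc + 1), (cr, cc - 1)):
                    if 0 <= nr < rows and 0 <= nc < cols and hot[nr][nc] and not seen[nr][nc]:
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            # Centroid of the hotspot, using cell centers in pixel coordinates.
            cx = sum((cc + 0.5) * cell for _, cc in cells) / len(cells)
            cy = sum((cr + 0.5) * cell for cr, _ in cells) / len(cells)
            labels.append((cx, cy))
    return labels


# Hypothetical data: three observers agree near (100, 100); one stray fixation.
example = {
    "P1": [(100, 100), (400, 300)],
    "P2": [(105, 98)],
    "P3": [(110, 110)],
}
print(gaze_labels(example, width=512, height=512))  # → [(112.0, 112.0)]
```

Raising `min_observers` trades recall for precision in this sketch: the stray fixation at (400, 300) is discarded because only one observer looked there.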

Paper Structure (15 sections, 2 figures)

Figures (2)

  • Figure 1: Methods: (a) A meningioma Whole Slide Image (WSI) used for sampling the 1,000 high-power field (HPF) images. The specimen was stained with PHH3 immunohistochemistry. (b) A positive and (c) a negative HPF image in the 1,000 HPF image collection. (d, e) Visualizations of 14 participants' eye-gaze sequences while viewing the HPF images of (b) and (c). (f, g) The corresponding eye-gaze heatmaps after processing. (h) Apparatus for the eye-gaze user study. The participant was seated on the right side, and the moderator who controlled the experiment was on the left. Screenshots of the image viewing interface used in the eye-tracking sessions: (i) the "Stand-by" page shown at the beginning of each trial; (j) in each trial, the system displayed 40 HPF images in random order and with random image transforms; (k) the "End-of-Trial" page marking the end of each trial. Pipeline enabling CNNs to predict mitosis locations with probabilities: (l) the sliding-window CNN was applied; (m) boxes with positive CNN classifications were retained to generate the (n) saliency map. The centroids of the saliency-map hotspots were used as the locations of detected mitoses. (o) The same CNN was re-applied to each location to calculate the probability.
  • Figure 2: Experiment results: (a) Box-whisker plot of the time the 14 qualified participants (i.e., P1 to P14) spent viewing images in the eye-tracking sessions. The (b) precision, (c) recall, and (d) F1 scores of the eye-gaze labels for the 800 HPF images in the eye-tracking sessions. (e) Ranges of precision-recall curves of the EfficientNet-b3 CNNs on the test WSI. The CNNs were trained on heuristic-based labels, eye-gaze labels ($k=14$), and ground truth. For each condition, a '$+$' marker is placed to represent the average performance with standard deviation. (f) Average precision, recall, and F1 scores achieved by the CNNs in the three conditions.
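
Precision, recall, and F1 scores for point labels, such as those reported for the eye-gaze labels in Figure 2 (b) to (d), could be computed along the following lines. This is a sketch under stated assumptions rather than the paper's evaluation code: the greedy nearest-neighbor matching, the pixel tolerance `tol`, and the function name `score_point_labels` are hypothetical choices, since the matching rule is not given in this summary.

```python
import math


def score_point_labels(predicted, ground_truth, tol=30.0):
    """Return (precision, recall, f1) for predicted points against ground truth.

    A predicted point counts as a true positive if it lies within `tol` pixels
    of a ground-truth point that has not been matched yet (assumed rule).
    """
    unmatched = list(ground_truth)
    tp = 0
    for px, py in predicted:
        # Find the closest still-unmatched ground-truth point within tolerance.
        best_i, best_d = None, tol
        for i, (gx, gy) in enumerate(unmatched):
            d = math.hypot(px - gx, py - gy)
            if d <= best_d:
                best_i, best_d = i, d
        if best_i is not None:
            unmatched.pop(best_i)  # each ground-truth point matches at most once
            tp += 1

    fp = len(predicted) - tp
    fn = len(unmatched)
    precision = tp / (tp + fp) if predicted else 0.0
    recall = tp / (tp + fn) if ground_truth else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical coordinates: two of three labels land near a true mitosis,
# and two of four true mitoses are missed.
labels = [(10, 10), (200, 200), (500, 500)]
truth = [(12, 11), (205, 195), (300, 300), (50, 400)]
print(score_point_labels(labels, truth))
```

In this example, two matches out of three labels give a precision of 2/3, and two matches out of four ground-truth points give a recall of 1/2.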