Decoding the visual attention of pathologists to reveal their level of expertise

Souradeep Chakraborty, Dana Perez, Paul Friedman, Natallia Sheuka, Constantin Friedman, Oksana Yaskiv, Rajarsi Gupta, Gregory J. Zelinsky, Joel H. Saltz, Dimitris Samaras

TL;DR

This work addresses variability in pathologist expertise during Gleason grading of prostate whole-slide images (WSIs) by analyzing attention patterns. It collects the largest known dataset of pathologist attention (43 pathologists, 123 WSIs) and develops two models: ProstAttFormer, a transformer-based model that predicts attention heatmaps across magnifications, and ExpertiseNet, a CNN that classifies expertise from attention patterns. Expertise classification is above chance: the abstract reports accuracies of 75.3%, 56.1%, and 77.2% for residents, generalists, and specialists, respectively. The analysis also shows that specialists have higher attention–grading concordance and that attention variability maps to grading agreement. The work offers a path toward objective expertise assessment and AI-assisted training that helps trainees read WSIs with expert-like focus.

Abstract

We present a method for classifying the expertise of a pathologist based on how they allocated their attention during a cancer reading. We approach this decoding task by developing a novel method for predicting the attention of pathologists as they read whole-slide images (WSIs) of the prostate and make cancer grade classifications. Our ground truth measure of a pathologist's attention is the x, y, and z (magnification) movement of their viewport as they navigated through WSIs during readings, and to date we have the attention behavior of 43 pathologists reading 123 WSIs. These data revealed that specialists have higher agreement in both their attention and cancer grades compared to general pathologists and residents, suggesting that sufficient information may exist in their attention behavior to classify their expertise level. To attempt this, we trained a transformer-based model to predict the visual attention heatmaps of resident, general, and specialist (GU) pathologists during Gleason grading. Based solely on a pathologist's attention during a reading, our model was able to predict their level of expertise with 75.3%, 56.1%, and 77.2% accuracy, respectively, better than chance and baseline models. Our model therefore enables a pathologist's expertise level to be easily and objectively evaluated, which is important for pathology training and competency assessment. Tools developed from our model could also be used to help pathology trainees learn how to read WSIs like an expert.
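
The viewport-based attention measure described in the abstract can be illustrated with a minimal sketch. It assumes a hypothetical log format in which each record gives the viewport's bounding box in slide pixel coordinates, its magnification, and the time spent there. The record layout, the grid resolution, and the function name `viewport_heatmap` are illustrative assumptions, not the authors' pipeline.

```python
import numpy as np


def viewport_heatmap(records, slide_w, slide_h, grid=32, min_mag=0.0):
    """Accumulate viewing time over a coarse grid laid over the slide.

    records: iterable of (x0, y0, x1, y1, mag, dt) tuples (hypothetical format):
        (x0, y0)-(x1, y1) is the viewport box in slide pixels,
        mag is the magnification (the "z" coordinate), dt is the dwell time.
    min_mag: ignore viewports below this magnification, which gives a
        magnification-specific heatmap.
    Returns a (grid, grid) array that sums to 1, or all zeros if nothing was viewed.
    """
    heat = np.zeros((grid, grid), dtype=float)
    for x0, y0, x1, y1, mag, dt in records:
        if mag < min_mag:
            continue
        # Map the viewport box from slide pixels to grid cells.
        c0 = max(int(np.floor(x0 / slide_w * grid)), 0)
        c1 = min(int(np.ceil(x1 / slide_w * grid)), grid)
        r0 = max(int(np.floor(y0 / slide_h * grid)), 0)
        r1 = min(int(np.ceil(y1 / slide_h * grid)), grid)
        if c1 <= c0 or r1 <= r0:
            continue  # viewport lies entirely outside the slide
        # Every covered cell receives the full dwell time of this viewport.
        heat[r0:r1, c0:c1] += dt
    total = heat.sum()
    return heat / total if total > 0 else heat


# Two viewports on a 1000 x 1000 slide, summarized on a 2 x 2 grid.
log = [(0, 0, 500, 500, 10, 2.0), (500, 500, 1000, 1000, 40, 1.0)]
all_mags = viewport_heatmap(log, 1000, 1000, grid=2)
high_mag = viewport_heatmap(log, 1000, 1000, grid=2, min_mag=20)
```

Filtering by `min_mag` is one simple way to obtain magnification-wise heatmaps; truncating the log at a given elapsed time would similarly give cumulative heatmaps at different viewing durations.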

Paper Structure (12 sections, 1 equation, 5 figures, 3 tables)

Figures (5)

  • Figure 1: Attention heatmaps computed for GU specialists (top-right) and general and resident pathologists (bottom). More detailed heatmaps are also shown for different levels of magnification and viewing durations. Left column, upper-right: grade-level segmentation of a WSI by a GU specialist. The attention heatmaps of GU specialists correlate more strongly with the tumor annotations than those of the non-specialists, and the specialists have the highest grading accuracy.
  • Figure 2: Grade concordance vs. attention heatmap correlation across three groups of pathologists based on their expertise level. Each point represents a WSI. $PG=3$ and $PG \geq 4$ indicate the number of instances in which a WSI was assigned primary grade (PG) $= 3$ and $PG \geq 4$, respectively.
  • Figure 3: The proposed attention prediction model, ProstAttFormer, which predicts pathologists' attention on a WSI at different magnification levels.
  • Figure 4: ExpertiseNet, our model for predicting a pathologist's expertise from their attention. The model takes three inputs: (1) frozen ViT feature descriptors from a self-supervised learning model (arranged in 2D), (2) temporal attention heatmaps, i.e., the cumulative attention heatmaps at different viewing durations, and (3) magnification-wise attention heatmaps. It outputs the predicted pathologist expertise.
  • Figure 5: Comparison of the attention heatmap prediction performance of the proposed ProstAttFormer model with baselines. ProstAttFormer predicts the heatmaps better than the baselines across all magnifications for both test WSI instances (cases 1 and 2).
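
Figure 2 plots grade concordance against attention heatmap correlation. As a rough illustration of the latter, the sketch below computes the Pearson correlation coefficient between two heatmaps of equal shape. The exact correlation measure used in the paper is not stated in this summary, so the choice of Pearson correlation and the function name `heatmap_correlation` are assumptions.

```python
import numpy as np


def heatmap_correlation(a, b):
    """Pearson correlation between two heatmaps of the same shape.

    Returns 0.0 when either map is constant, since the correlation
    is undefined in that case.
    """
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    if a.shape != b.shape:
        raise ValueError("heatmaps must have the same number of cells")
    # Center both maps, then take the normalized inner product.
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0


reader_a = np.array([[1.0, 2.0], [3.0, 4.0]])
reader_b = np.array([[4.0, 3.0], [2.0, 1.0]])
same = heatmap_correlation(reader_a, reader_a)
opposite = heatmap_correlation(reader_a, reader_b)
flat = heatmap_correlation(reader_a, np.ones((2, 2)))
```

Averaging such pairwise correlations within an expertise group would give one number per WSI, which is one plausible way to obtain the per-slide points shown in the figure.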