WsiCaption: Multiple Instance Generation of Pathology Reports for Gigapixel Whole-Slide Images

Pingyi Chen, Honglin Li, Chenglu Zhu, Sunyi Zheng, Zhongyi Shui, Lin Yang

TL;DR

The authors propose a multiple instance generative model (MI-Gen) that produces pathology reports for gigapixel WSIs. Experimental results show that the model generates reports containing multiple clinical clues and achieves competitive performance on certain slide-level tasks.

Abstract

Whole slide images (WSIs) are the foundation of digital pathology for the diagnosis and treatment of carcinomas. Writing pathology reports is laborious and error-prone for inexperienced pathologists. To reduce the workload and improve clinical automation, we investigate how to generate pathology reports from whole slide images. On the data side, we curated the largest WSI-text dataset (PathText). Specifically, we collected nearly 10,000 high-quality WSI-text pairs for visual-language models by recognizing and cleaning pathology reports that describe diagnostic slides in TCGA. On the model side, we propose the multiple instance generative model (MI-Gen), which produces pathology reports for gigapixel WSIs. We benchmark our model on the largest subset of TCGA-PathoText. Experimental results show that our model generates pathology reports that contain multiple clinical clues and achieves competitive performance on certain slide-level tasks. We observe that simple semantic extraction from the pathology reports achieves the best performance (an F1 score of 0.838) on BRCA subtyping, surpassing previous state-of-the-art approaches. Our collected dataset and related code are available.
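
As a rough illustration of what "simple semantic extraction" for subtyping could look like, the sketch below reads a subtype label off a report by keyword matching and scores the predictions with F1. The abstract does not describe the actual extraction procedure, so everything here is an assumption: the two-class IDC/ILC framing of BRCA subtyping, the keyword rules, the helper names `extract_subtype` and `f1_score`, and the toy reports, which are invented for illustration and are not taken from the dataset.

```python
import re

# Hypothetical keyword rules (an assumption, not the paper's method):
# map a subtype label to a pattern that signals it in a report.
SUBTYPE_PATTERNS = {
    "IDC": re.compile(r"\bductal\b", re.IGNORECASE),
    "ILC": re.compile(r"\blobular\b", re.IGNORECASE),
}


def extract_subtype(report, default="IDC"):
    """Return the subtype whose keyword occurs earliest in the report."""
    hits = []
    for label, pattern in SUBTYPE_PATTERNS.items():
        match = pattern.search(report)
        if match:
            hits.append((match.start(), label))
    # Fall back to a default label when no keyword is found.
    return min(hits)[1] if hits else default


def f1_score(y_true, y_pred, positive):
    """Binary F1 for the class named by `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy generated reports, invented for this example.
reports = [
    "Invasive lobular carcinoma, grade 2.",
    "Infiltrating ductal carcinoma with negative margins.",
    "Carcinoma, no further specification.",
]
labels = ["ILC", "IDC", "ILC"]
predictions = [extract_subtype(r) for r in reports]
print(predictions, f1_score(labels, predictions, positive="ILC"))
```

A rule-based reader like this depends entirely on the generated report naming the subtype, which is why it can serve as a probe of how much diagnostic content the report carries.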

Paper Structure (16 sections, 4 equations, 3 figures, 2 tables)

Figures (3)

  • Figure 1: The pipeline of extracting WSI-text pairs from TCGA.
  • Figure 2: The framework of our proposed model is comprised of a visual extractor and an encoder-decoder.
  • Figure 3: Examples of pathology reports from our model, Vanilla Transformer, and the ground truth. The first column shows thumbnails of the WSIs. Content that is consistent with the ground truth is highlighted in bold. Medical terms that contradict the ground truth are underlined. Strikethrough text marks grammatical errors.