A Survey on Medical Document Summarization

Raghav Jain, Anubhav Jangra, Sriparna Saha, Adam Jatowt

TL;DR

This survey addresses the growing need for medical document summarization (MDS) amid vast digital medical records. It presents a taxonomy of MDS tasks by document type and input/output modality, reviews datasets and challenges across Research Articles, Health Records, Reports, Dialogues, and Patient Health Questions, and analyzes evaluation metrics, including domain-specific measures. It also discusses ethical considerations and proposes future directions such as multimodal, numeracy-capable, explainable, and knowledge-base-enhanced (KB-enhanced) approaches. The work highlights fragmentation in the field and provides a structured framework to standardize research and support responsible deployment in clinical settings.

Abstract

The internet has had a dramatic effect on the healthcare industry, allowing documents to be saved, shared, and managed digitally. This has made it easier to locate and share important data, improving patient care and providing more opportunities for medical studies. As there is so much data accessible to doctors and patients alike, summarizing it has become increasingly necessary. This has been supported through the introduction of deep learning and transformer-based networks, which have boosted the sector significantly in recent years. This paper gives a comprehensive survey of the current techniques and trends in medical summarization.

Paper Structure

This paper contains 20 sections, 3 equations, 6 figures, 5 tables.

Figures (6)

  • Figure 1: Trend in medical document summarization research over the last two decades. X-axis: year; Y-axis: number of papers on medical document summarization published each year. The growth in the number of papers over the most recent three years suggests that more will follow in the coming years.
  • Figure 2: Illustration of distribution of existing work with respect to different MDS subtasks. Here, PQ: Patient Health Question, RA: Research Articles, HR: Health Records, MD: Medical Dialogue, RT: Report.
  • Figure 3: Visual representation of MDS subtasks.
  • Figure 4: Illustration of dataset distribution with respect to different MDS subtasks. Here, PQ: Patient Health Question, RA: Research Articles, HR: Health Records, MD: Medical Dialogue, RT: Report.
  • Figure 5: Visual representation of proposed taxonomy based on input type, output type and method type.
  • ...and 1 more figures