A Survey on Medical Document Summarization
Raghav Jain, Anubhav Jangra, Sriparna Saha, Adam Jatowt
TL;DR
This survey addresses the growing need for medical document summarization (MDS) given the vast volume of digital medical records. It presents a taxonomy of MDS tasks by document type and input/output modality, reviews datasets and challenges across Research Articles, Health Records, Reports, Dialogues, and Patient Health Questions, and analyzes evaluation metrics, including domain-specific measures. It also discusses ethical considerations and proposes future directions such as multimodal, numeracy-capable, explainable, and KB-enhanced approaches. The work highlights fragmentation in the field and provides a structured framework to standardize research and support responsible deployment in clinical settings.
Abstract
The internet has had a dramatic effect on the healthcare industry, allowing documents to be saved, shared, and managed digitally. This has made it easier to locate and share important data, improving patient care and providing more opportunities for medical studies. Because so much data is now accessible to doctors and patients alike, summarizing it has become increasingly necessary. This need has been supported by the introduction of deep learning and transformer-based networks, which have advanced the field significantly in recent years. This paper gives a comprehensive survey of the current techniques and trends in medical summarization.
