MedDec: A Dataset for Extracting Medical Decisions from Discharge Summaries

Mohamed Elgaar, Jiali Cheng, Nidhi Vakil, Hadi Amiri, Leo Anthony Celi

TL;DR

MedDec addresses the lack of resources for extracting medical decisions from clinical notes by introducing an annotated dataset built on MIMIC-III discharge summaries, with ten DICTUM decision categories across eleven phenotypes. The paper builds a span-detection baseline using a transformer-based sequence-labeling framework with segment-wise processing, evaluates multiple models, and proposes a span difficulty score to gauge sample complexity. The results show RoBERTa as the strongest performer among the baselines, reveal notable variability across phenotypes, and highlight the challenge of generalizing to unseen phenotypes; the paper also explores IFT-based extraction with LLMs. The dataset and code release enable further research in clinical decision extraction, bias analysis, and decision-support applications, underlining both the practical value and the limitations of current approaches.

Abstract

Medical decisions directly impact individuals' health and well-being. Extracting decision spans from clinical notes plays a crucial role in understanding medical decision-making processes. In this paper, we develop a new dataset called "MedDec", which contains clinical notes covering eleven different phenotypes (diseases) annotated with ten types of medical decisions. We introduce the task of medical decision extraction, aiming to jointly extract and classify different types of medical decisions within clinical notes. We provide a comprehensive analysis of the dataset, develop a span detection model as a baseline for this task, evaluate recent span detection approaches, and employ a few metrics to measure the complexity of data samples. Our findings shed light on the complexities inherent in clinical decision extraction and enable future work in this area of research. The dataset and code are available through https://github.com/CLU-UML/MedDec.

Paper Structure (24 sections, 2 equations, 4 figures, 6 tables)

Figures (4)

  • Figure 1: An example excerpt from a de-identified clinical note in MedDec, where text spans are annotated with the 10 medical decision categories defined by the Decision Identification and Classification Taxonomy for Use in Medicine (DICTUM) (Ofstad et al., 2016). Color-coded text represents medical decisions, and the annotated decision categories are shown in [brackets].
  • Figure 2: Architecture of the proposed framework for medical span detection. The framework is a multi-class sequence labeling approach that fine-tunes a pre-trained transformer network for span detection.
  • Figure 3: Span detection F1 score on spans with increasing difficulty for two difficulty scores. The shaded area is the 95% confidence interval for three models: ELECTRA, RoBERTa, and BioClinical-BERT.
  • Figure 4: F1 score performance of span detection at phenotype level. The orange bars show the generalizability performance of the model when the phenotype is unseen during training.
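
The Figure 2 caption describes span detection as multi-class sequence labeling: each token receives a decision category (or an "outside" label), and contiguous runs of the same category form a span. The sketch below illustrates only that decoding step and an exact-match span F1. It is a minimal illustration under stated assumptions, not the authors' implementation: the label names `CAT_1` and `CAT_2` are hypothetical placeholders, the `"O"` outside label and the run-based grouping are assumed, and the paper's actual tagging scheme and matching criterion may differ.

```python
from typing import List, Tuple

# (start_token, end_token_exclusive, category); a hypothetical span representation
Span = Tuple[int, int, str]


def labels_to_spans(labels: List[str], outside: str = "O") -> List[Span]:
    """Group consecutive tokens that share a non-outside label into spans."""
    spans: List[Span] = []
    start = None
    # A trailing sentinel closes any span that runs to the end of the sequence.
    for i, label in enumerate(labels + [outside]):
        if start is not None and label != labels[start]:
            spans.append((start, i, labels[start]))
            start = None
        if start is None and label != outside:
            start = i
    return spans


def span_f1(pred: List[Span], gold: List[Span]) -> float:
    """Exact-match span F1: boundaries and category must both agree (assumed criterion)."""
    pred_set, gold_set = set(pred), set(gold)
    tp = len(pred_set & gold_set)
    if tp == 0:
        return 0.0
    precision = tp / len(pred_set)
    recall = tp / len(gold_set)
    return 2 * precision * recall / (precision + recall)


# Hypothetical per-token predictions and gold labels for a 7-token segment.
gold_labels = ["O", "CAT_1", "CAT_1", "O", "CAT_2", "CAT_2", "CAT_2"]
pred_labels = ["O", "CAT_1", "CAT_1", "O", "CAT_2", "CAT_2", "O"]

gold_spans = labels_to_spans(gold_labels)
pred_spans = labels_to_spans(pred_labels)
score = span_f1(pred_spans, gold_spans)
```

In this toy segment the first span matches exactly while the second is predicted one token short, so only one of the two spans counts as correct under exact matching. A more lenient criterion (for example, token-level overlap) would score the same prediction higher.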