e-Health CSIRO at "Discharge Me!" 2024: Generating Discharge Summary Sections with Fine-tuned Language Models

Jinghui Liu, Aaron Nicolson, Jason Dowling, Bevan Koopman, Anthony Nguyen

TL;DR

The paper investigates automatic generation of two discharge summary sections, brief hospital course (BHC) and discharge instructions (DI), from discharge notes using fine-tuned open LMs, evaluated on MIMIC-IV data. It compares decoder-only and encoder-decoder architectures, explores input context and specialized targets, and assesses decoding and ensemble strategies. Key findings show that smaller encoder-decoder models such as PRIMERA can rival or outperform larger decoder-only models fine-tuned with LoRA, and that specializing by target (two LMs, one per section) yields better results than a single unified model; adding radiology input or extending the context provides limited benefit. The work demonstrates feasibility and offers design guidance for clinical NLP systems aimed at reducing documentation time. The team placed 3rd on the Discharge Me! leaderboard and released its checkpoints for public use.

Abstract

Clinical documentation is an important aspect of clinicians' daily work and often demands a significant amount of time. The BioNLP 2024 Shared Task on Streamlining Discharge Documentation (Discharge Me!) aims to alleviate this documentation burden by automatically generating discharge summary sections, including the brief hospital course and discharge instructions, which are often time-consuming to synthesize and write manually. We approach the generation task by fine-tuning multiple open-source language models (LMs), including both decoder-only and encoder-decoder LMs, with various configurations of input context. We also examine different setups for decoding algorithms, model ensembling or merging, and model specialization. Our results show that conditioning on the content of the discharge summary prior to the target sections is effective for the generation task. Furthermore, we find that smaller encoder-decoder LMs can perform as well as, or even slightly better than, larger decoder-based LMs fine-tuned through LoRA. The model checkpoints from our team (aehrc) are openly available.

Paper Structure (18 sections, 2 figures, 7 tables)