PromptMind Team at MEDIQA-CORR 2024: Improving Clinical Text Correction with Error Categorization and LLM Ensembles
Satya Kesav Gundabathula, Sriram R Kolar
TL;DR
The paper tackles automatic detection and correction of errors in clinical notes (MEDIQA-CORR) by proposing a unified prompt-based LLM approach with in-context learning and chain-of-thought to address detection, localization, and correction. A key contribution is error-type categorization to guide prompting, along with self-consistency and model ensembling (GPT-4 and Claude-3 Opus) to improve robustness, achieving competitive results and ranking second in subtask-3. The study demonstrates that structured prompt design and consensus strategies can enhance the reliability of AI-assisted clinical documentation while highlighting limitations and ethical considerations.
Abstract
This paper describes our approach to the MEDIQA-CORR shared task, which involves error detection and correction in clinical notes curated by medical professionals. The task comprises three subtasks: detecting the presence of an error, identifying the specific sentence containing the error, and correcting it. Through our work, we aim to assess the capabilities of Large Language Models (LLMs) trained on vast corpora of internet data that contain both factual and unreliable information. We propose to address all subtasks together using a unique prompt-based in-context learning strategy, and we evaluate its efficacy in this specialized task, which demands a combination of general reasoning and medical knowledge. In medical systems where prediction errors can have grave consequences, we propose leveraging self-consistency and ensemble methods to enhance error detection and correction performance.
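As an illustration of how self-consistency or ensembling could combine several model responses across the three subtasks, the sketch below applies simple majority voting. It is not the authors' implementation: the `Prediction` schema, the `aggregate` function, and the tie-breaking rules are all assumptions made for this example, and the predictions are assumed to come from repeated samples of one model or from different models such as GPT-4 and Claude-3 Opus.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Prediction:
    """One model response covering all three subtasks (hypothetical schema)."""
    has_error: bool               # subtask 1: does the note contain an error?
    sentence_id: Optional[int]    # subtask 2: index of the erroneous sentence
    correction: Optional[str]     # subtask 3: corrected sentence text


def aggregate(predictions: Sequence[Prediction]) -> Prediction:
    """Combine sampled or multi-model predictions by majority vote."""
    if not predictions:
        raise ValueError("need at least one prediction")

    # Subtask 1: strict majority on the error flag.
    # Assumption: a tie resolves to "no error".
    flagged = [p for p in predictions if p.has_error]
    if len(flagged) * 2 <= len(predictions):
        return Prediction(False, None, None)

    # Subtask 2: most frequent sentence index among responses that flag an error.
    sentence_id, _ = Counter(p.sentence_id for p in flagged).most_common(1)[0]

    # Subtask 3: most frequent correction among responses agreeing on that sentence.
    # Exact string matching is a simplification; free-text corrections may
    # need normalization or a similarity measure in practice.
    candidates = [p.correction for p in flagged if p.sentence_id == sentence_id]
    correction, _ = Counter(candidates).most_common(1)[0]

    return Prediction(True, sentence_id, correction)
```

Voting on the error flag first, then the sentence, then the correction keeps the three outputs mutually consistent: the returned correction always belongs to the sentence that won the vote.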
