Few shot chain-of-thought driven reasoning to prompt LLMs for open ended medical question answering
Saeel Sandeep Nachane, Ojas Gramopadhye, Prateek Chanda, Ganesh Ramakrishnan, Kshitij Sharad Jadhav, Yatin Nandwani, Dinesh Raghu, Sachindra Joshi
TL;DR
This work addresses open-ended medical question answering by proposing the MEDQA-OPEN dataset and CLINICR, a chain-of-thought prompting framework that mirrors incremental clinical reasoning. It compares two prompting strategies (ELIMINATIVE and CLINICR) and introduces a forward-backward workflow augmented by a reward-model verifier to improve answer reliability. Experiments show that CLINICR generally outperforms ELIMINATIVE in open-ended settings, especially when paired with verifier-based selection, achieving high expert agreement on MEDQA-OPEN and ClinicianCases. The approach advances clinically plausible, verifiable LLM reasoning and outlines paths toward broader generalization and integration with retrieval and knowledge-grounding techniques.
Abstract
In this paper, we propose a modified version of the MedQA-USMLE dataset, named MEDQA-OPEN, which contains open-ended medical questions without options to mimic clinical scenarios, along with clinician-approved reasoned answers. Additionally, we implement a prompt driven by Chain of Thought (CoT) reasoning, CLINICR, to mirror the prospective process of incremental reasoning, reaching a correct response to medical questions. We empirically demonstrate how CLINICR outperforms the state-of-the-art 5-shot CoT-based prompt (Liévin et al., 2022). We also present an approach that mirrors real-life clinical practice by first exploring multiple differential diagnoses through MCQ-CLINICR and subsequently narrowing down to a final diagnosis using MCQ-ELIMINATIVE. Finally, emphasizing the importance of response verification in medical settings, we utilize a reward model mechanism, replacing the elimination process performed by MCQ-ELIMINATIVE.
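The two-stage workflow described above (first propose several candidate diagnoses, then narrow to one using a verifier score) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: `select_answer`, `toy_propose`, and `toy_score` are hypothetical names, the candidate generator is a fixed list rather than an LLM prompt, and the scorer is a simple word-overlap count standing in for a trained reward model. It is not the authors' implementation.

```python
from typing import Callable, List, Tuple


def select_answer(
    question: str,
    propose: Callable[[str], List[str]],
    score: Callable[[str, str], float],
) -> Tuple[str, float]:
    """Propose candidate answers, then keep the one the verifier scores highest."""
    candidates = propose(question)  # forward step: candidate diagnoses
    if not candidates:
        raise ValueError("no candidates were proposed")
    # Backward step: score each candidate against the question.
    scored = [(c, score(question, c)) for c in candidates]
    return max(scored, key=lambda pair: pair[1])


def toy_propose(question: str) -> List[str]:
    # Hypothetical stand-in for an LLM prompted to list differential diagnoses.
    return [
        "iron deficiency anemia",
        "thalassemia trait",
        "anemia of chronic disease",
    ]


def toy_score(question: str, candidate: str) -> float:
    # Hypothetical stand-in for a reward model: counts candidate words
    # that also appear in the question.
    q_words = set(question.lower().split())
    return float(sum(1 for w in candidate.lower().split() if w in q_words))


if __name__ == "__main__":
    q = "Microcytic anemia with low ferritin suggests iron deficiency in this patient"
    print(select_answer(q, toy_propose, toy_score))
```

In a real pipeline, `propose` would wrap a prompted language model and `score` would wrap a learned verifier; keeping both as injected callables keeps the selection logic independent of either choice.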
