Digital Diagnostics: The Potential Of Large Language Models In Recognizing Symptoms Of Common Illnesses
Gaurav Kumar Gupta, Aditi Singh, Sijo Valayakkad Manikandan, Abul Ehtesham
TL;DR
This study evaluates GPT-4, Gemini, and GPT-3.5 on diagnosing common illnesses from symptom descriptions, using prompts derived from CDC, WHO, Mayo Clinic, and other major sources. It employs a structured evaluation with precision, recall, and F1 metrics, revealing GPT-4 as the top performer, Gemini as a high-precision but conservative predictor, and GPT-3.5 as a reliable baseline. The authors discuss privacy, HIPAA compliance, and ethical considerations, highlighting the need for careful integration and multidisciplinary oversight. The work demonstrates the potential of AI-assisted digital diagnostics to improve speed and accuracy in initial medical assessments while outlining key avenues for future multimodal, multilingual, and real-world clinical validation.
Abstract
The rapid development of large language models (LLMs) such as GPT-4, Gemini, and GPT-3.5 offers a transformative opportunity in medicine and healthcare, especially in digital diagnostics. This study evaluates each model's diagnostic abilities by having it interpret a user's symptoms and propose diagnoses that match common illnesses, and it demonstrates how each of these models could significantly increase diagnostic accuracy and efficiency. Across a series of diagnostic prompts based on symptoms drawn from medical databases, GPT-4 achieves the highest diagnostic accuracy, which the authors attribute to its extensive training on medical data. Gemini performs with high precision, making it a valuable tool in disease triage and a potentially reliable model when physicians must make high-risk diagnoses. GPT-3.5, though slightly less advanced, remains a good tool for medical diagnostics. The study highlights the need to examine LLMs for healthcare and clinical practice with more care and attention, ensuring that any system using LLMs protects patient privacy, complies with health information privacy laws such as HIPAA, and accounts for the social consequences for the varied individuals in complex healthcare contexts. It marks the start of a larger effort to examine how addressing the ethical concerns raised by LLMs learning from human biases could open new ways to apply AI in complex medical settings.
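The evaluation summarized above relies on precision, recall, and F1. As a minimal sketch of how these metrics can be computed per illness, the Python below scores a set of predicted diagnoses against reference labels. The function name `per_class_metrics`, the labels, and the toy predictions are illustrative assumptions and are not data or code from the study. Treating `None` as an abstention is also an assumption, included to show how a conservative predictor can reach high precision with lower recall.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for single-label predictions.

    A prediction of None is treated as an abstention (assumption): it counts
    as a missed case for the true label but not as a false positive.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == truth:
            tp[truth] += 1
        else:
            fn[truth] += 1
            if pred is not None:
                fp[pred] += 1

    labels = sorted(set(y_true) | {p for p in y_pred if p is not None})
    results = {}
    for label in labels:
        predicted = tp[label] + fp[label]  # times the label was predicted
        actual = tp[label] + fn[label]     # times the label was correct
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        results[label] = (precision, recall, f1)
    return results


# Hypothetical toy data, not taken from the study.
y_true = ["flu", "flu", "cold", "cold", "flu", "cold"]
y_pred = ["flu", None, "cold", "flu", "flu", None]

metrics = per_class_metrics(y_true, y_pred)
for label, (p, r, f1) in metrics.items():
    print(f"{label}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

In this toy data the predictor never wrongly outputs "cold", so precision for that label is perfect, but it misses two of the three true cases, so recall is low. This is the pattern the summary describes as high-precision but conservative.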
