The Ethics of ChatGPT in Medicine and Healthcare: A Systematic Review on Large Language Models (LLMs)

Joschka Haltaufderheide, Robert Ranisch

TL;DR

The paper surveys the ethical landscape of deploying large language models (LLMs) like ChatGPT in medicine and healthcare, addressing a gap in systematic syntheses amid rapid adoption. Using a registered protocol and rapid-review methods, it analyzes 53 records through a meta-aggregative approach to identify four application domains: clinical, patient support, professional support, and public health. It identifies benefits in data analysis, personalized information, decision support, and access to information, but highlights pervasive ethical concerns—fairness, bias, non-maleficence, transparency, privacy—and a distinctive risk of misinformation or hallucinations, underscoring the need for human oversight and validation. The authors advocate reframing ethical guidance to define acceptable human oversight across diverse settings and call for gradual, justified experimentation with LLMs in healthcare, alongside governance and methodological rigor to ensure patient safety and trust.

Abstract

With the introduction of ChatGPT, Large Language Models (LLMs) have received enormous attention in healthcare. Despite their potential benefits, researchers have underscored various ethical implications. While individual instances have drawn much attention, the debate lacks a systematic overview of practical applications currently researched and ethical issues connected to them. Against this background, this work aims to map the ethical landscape surrounding the current stage of deployment of LLMs in medicine and healthcare. Electronic databases and preprint servers were queried using a comprehensive search strategy. Studies were screened and extracted following a modified rapid review approach. Methodological quality was assessed using a hybrid approach. For 53 records, a meta-aggregative synthesis was performed. Four fields of applications emerged and testify to a vivid exploration phase. Advantages of using LLMs are attributed to their capacity in data analysis, personalized information provisioning, support in decision-making, mitigating information loss and enhancing information accessibility. However, we also identify recurrent ethical concerns connected to fairness, bias, non-maleficence, transparency, and privacy. A distinctive concern is the tendency to produce harmful misinformation or convincing but inaccurate content. A recurrent plea for ethical guidance and human oversight is evident. Given the variety of use cases, it is suggested that the ethical guidance debate be reframed to focus on defining what constitutes acceptable human oversight across the spectrum of applications. This involves considering diverse settings, varying potentials for harm, and different acceptable thresholds for performance and certainty in healthcare. In addition, a critical inquiry is necessary to determine the extent to which the current experimental use of LLMs is necessary and justified.

Paper Structure (19 sections, 2 figures, 6 tables)