Guardrails for avoiding harmful medical product recommendations and off-label promotion in generative AI models
Daniel Lopez-Martinez
TL;DR
The paper addresses the risk that generalist GenAI systems promote unapproved uses of medical products. It proposes a four-stage detection pipeline (input standardization, NER for drugs and indications, FDA-label mapping via FDALabel embeddings, and off-label identification using a T5-large zero-shot check) to identify off-label promotion in the outputs of multimodal models, demonstrated on Claude 3. A synthetic red-teaming approach yielded 14,300 queries across 35 drugs and revealed off-label promotion in 15.4% of responses. On a 2,000-sample human-annotated subset, the pipeline achieved a precision of 85.75%, a recall of 80.47%, and an F1 score of 83.02%. The work underscores the need for guardrails that keep GenAI medical applications within FDA-approved labeling and offers a concrete, scalable method for post-hoc monitoring, while noting limitations such as scope and model generalizability.
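To make the four stages concrete, the following is a minimal sketch of how such a pipeline could be wired together. It is not the paper's implementation: the drug and indication names are invented placeholders, the NER stage is approximated by whole-word lexicon matching, the FDALabel embedding lookup by an exact dictionary lookup, and the T5-large zero-shot check by a set-membership test. All function and variable names are hypothetical.

```python
import re
from dataclasses import dataclass

# Hypothetical toy lexicons: placeholder names, not real drugs or FDA label data.
DRUG_LEXICON = {"drugalpha", "drugbeta"}
INDICATION_LEXICON = {"condition one", "condition two", "condition three"}
APPROVED_INDICATIONS = {
    "drugalpha": {"condition one"},
    "drugbeta": {"condition two", "condition three"},
}


@dataclass(frozen=True)
class Finding:
    drug: str
    indication: str
    off_label: bool


def standardize(text: str) -> str:
    """Stage 1: lowercase the text and collapse runs of whitespace."""
    return re.sub(r"\s+", " ", text.lower()).strip()


def extract_entities(text: str) -> tuple[list[str], list[str]]:
    """Stage 2: stand-in for NER, using whole-word lexicon matching."""
    def found(terms: set[str]) -> list[str]:
        return sorted(t for t in terms if re.search(rf"\b{re.escape(t)}\b", text))
    return found(DRUG_LEXICON), found(INDICATION_LEXICON)


def lookup_label(drug: str) -> set[str]:
    """Stage 3: stand-in for FDA-label mapping (exact key lookup, no embeddings)."""
    return APPROVED_INDICATIONS.get(drug, set())


def detect_off_label(response: str) -> list[Finding]:
    """Stage 4: flag each drug-indication pair that is absent from the drug's label."""
    text = standardize(response)
    drugs, indications = extract_entities(text)
    return [
        Finding(drug, indication, indication not in lookup_label(drug))
        for drug in drugs
        for indication in indications
    ]
```

Pairing every detected drug with every detected indication is a deliberate simplification; a real system would need to decide which indication a response actually attributes to which drug, which is presumably where a learned zero-shot check earns its place.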
Abstract
Generative AI (GenAI) models have demonstrated remarkable capabilities in a wide variety of medical tasks. However, as these models are trained using generalist datasets with very limited human oversight, they can learn uses of medical products that have not been adequately evaluated for safety and efficacy, nor approved by regulatory agencies. Given the scale at which GenAI may reach users, unvetted recommendations pose a public health risk. In this work, we propose an approach to identify potentially harmful product recommendations, and demonstrate it using a recent multimodal large language model.
