
Mitigating Hallucinations of Large Language Models in Medical Information Extraction via Contrastive Decoding

Derong Xu, Ziheng Zhang, Zhihong Zhu, Zhenxi Lin, Qidong Liu, Xian Wu, Tong Xu, Xiangyu Zhao, Yefeng Zheng, Enhong Chen

TL;DR

ALCD demonstrates significant improvements in resolving hallucination issues compared to conventional decoding methods and proposes an alternate adaptive constraint strategy to more effectively adjust the scale and scope of contrastive tokens.

Abstract

The impressive capabilities of large language models (LLMs) have attracted extensive interest in applying them to the medical field. However, the complex nature of clinical environments presents significant hallucination challenges for LLMs, hindering their widespread adoption. In this paper, we address these hallucination issues in the context of Medical Information Extraction (MIE) tasks by introducing ALternate Contrastive Decoding (ALCD). We begin by redefining MIE tasks as an identify-and-classify process. We then separate the identification and classification functions of LLMs by selectively masking the optimization of tokens during fine-tuning. During the inference stage, we alternately contrast output distributions derived from sub-task models. This approach aims to selectively enhance the identification and classification capabilities while minimizing the influence of other inherent abilities in LLMs. Additionally, we propose an alternate adaptive constraint strategy to more effectively adjust the scale and scope of contrastive tokens. Through comprehensive experiments on two different backbones and six diverse medical information extraction tasks, ALCD demonstrates significant improvements in resolving hallucination issues compared to conventional decoding methods.
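The idea of "selectively masking the optimization of tokens during fine-tuning" can be illustrated with a small sketch. It assumes one possible reading of the abstract: each sub-task model is trained with a loss computed only over tokens of its own role. The function name `masked_nll`, the role labels, and the probabilities are hypothetical and are not taken from the paper.

```python
import math

def masked_nll(token_log_probs, optimize_mask):
    """Mean negative log-likelihood over the tokens selected by optimize_mask.

    Tokens whose mask entry is False contribute nothing to the loss,
    so they would not drive any parameter update.
    """
    selected = [-lp for lp, keep in zip(token_log_probs, optimize_mask) if keep]
    if not selected:
        return 0.0
    return sum(selected) / len(selected)

# Hypothetical log-probabilities that a model assigns to the gold tokens.
token_log_probs = [math.log(0.5), math.log(0.25), math.log(0.8), math.log(0.1)]
# Hypothetical role of each target token: identification, classification, other.
roles = ["ide", "ide", "cls", "other"]

# Assumption: the identification model optimizes only "ide" tokens,
# and the classification model optimizes only "cls" tokens.
ide_mask = [role == "ide" for role in roles]
cls_mask = [role == "cls" for role in roles]

ide_loss = masked_nll(token_log_probs, ide_mask)
cls_loss = masked_nll(token_log_probs, cls_mask)
print(f"identification loss: {ide_loss:.4f}")
print(f"classification loss: {cls_loss:.4f}")
```

In a typical deep learning framework the same effect is usually obtained by marking the masked target positions with an ignore label, so that the standard cross-entropy loss skips them.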

Paper Structure

This paper contains 25 sections, 8 equations, 5 figures, 8 tables.

Figures (5)

  • Figure 1: An example demonstrating the hallucinations generated by LLMs in MIE tasks. The green font in the medical dialogue indicates a high correlation with the ground truth. The blue font in the output represents correct tokens, while the red font represents tokens with hallucination problems. These problems mainly include the presence of nonexistent entities and reasoning errors.
  • Figure 2: The overall pipeline of our proposed ALCD consists of two main steps. In Step #1, we aim to fine-tune sub-models individually to decouple the abilities of identification and classification. In Step #2, we adaptively contrast the predictions at each time step by applying scale and scope constraints on tokens. The figure shows how LLMs generate token $y_t$ at time step $t$ based on previous tokens $\boldsymbol{y}_{<t}$. The terms $cls, ide, other$ represent classification, identification, and other tokens, respectively. The output logits of normal, classification and identification models are represented as $l^{\theta}_{nl}$, $l^{\theta}_{cl}$, and $l^{\theta}_{id}$.
  • Figure 3: Ablation study on six medical datasets using ChatGLM-6B.
  • Figure 4: (a) Analysis of the scale of contrasting prediction $\alpha$ (in the formula for contrasting the predictions); (b) Analysis of the max rate of constraint $\beta$ (in the constraints formula).
  • Figure 5: Analysis of varying decoupling steps during fine-tuning on the IMCS-V2-SR dataset. 'Vanilla' refers to the performance of the normal model using greedy search.
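The captions for Figures 2 and 4 refer to a contrast scale $\alpha$ and a constraint rate $\beta$ without stating the formulas. The sketch below is not the paper's equations. It uses the generic contrastive decoding recipe as a stand-in: keep only tokens whose probability under the enhanced model is at least $\beta$ times the maximum probability (scope), then rank the survivors by $(1+\alpha)\,l_{\text{enh}} - \alpha\,l_{\text{con}}$ (scale). The function names, the logit values, and the choice of which sub-task model is enhanced at a given step are assumptions for illustration only.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def contrastive_step(enhanced_logits, contrast_logits, alpha=0.5, beta=0.1):
    """Pick the next token index by contrasting two sets of logits.

    Scope: only tokens with probability >= beta * max probability under
    the enhanced model are candidates.
    Scale: candidates are scored by (1 + alpha) * l_enh - alpha * l_con.
    Both rules are assumed stand-ins, not the paper's exact formulas.
    """
    probs = softmax(enhanced_logits)
    threshold = beta * max(probs)
    best_index, best_score = None, -math.inf
    for i, (l_enh, l_con) in enumerate(zip(enhanced_logits, contrast_logits)):
        if probs[i] < threshold:
            continue  # outside the allowed scope
        score = (1 + alpha) * l_enh - alpha * l_con
        if score > best_score:
            best_index, best_score = i, score
    return best_index

# Hypothetical logits over a three-token vocabulary at one time step.
enhanced = [2.0, 1.8, -5.0]   # model whose ability should be amplified
contrast = [2.0, 0.5, -20.0]  # model whose influence should be reduced

greedy_choice = max(range(len(enhanced)), key=lambda i: enhanced[i])
constrained_choice = contrastive_step(enhanced, contrast, alpha=0.5, beta=0.1)
unconstrained_choice = contrastive_step(enhanced, contrast, alpha=0.5, beta=0.0)
print(greedy_choice, constrained_choice, unconstrained_choice)
```

With these made-up numbers, plain greedy search picks token 0, the constrained contrast picks token 1, and dropping the scope constraint (`beta=0.0`) lets the very unlikely token 2 win. This is the kind of failure a scope constraint is meant to prevent. An alternating scheme would swap the roles of the two sub-task models depending on whether the current token is an identification or a classification token; that switching logic is left out of this sketch.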