HILL: A Hallucination Identifier for Large Language Models

Florian Leiser, Sven Eckhardt, Valentin Leuthe, Merlin Knaeble, Alexander Maedche, Gerhard Schwabe, Ali Sunyaev

TL;DR

The paper addresses hallucinations in large language models (LLMs) and users' overreliance on them by introducing HILL, a Hallucination Identifier designed through a user-centered Wizard of Oz (WOz) process and implemented as a web-based interface that connects to the OpenAI API. It combines structured feature prioritization (nine WOz sessions and best-worst scaling) with a functional artifact that presents a confidence score, source links, and a drill-down dashboard to help users identify and question potentially hallucinated content. An evaluation with 17 online participants, a 128-question SQuAD 2.0 test, and five user interviews suggests that HILL can highlight hallucinations in wrong answers and support safer information consumption, while also revealing a risk of overreliance when hallucinations are missed. The work demonstrates the feasibility and value of user-centered AI artifacts that empower users to detect errors, and it offers a practical path for integrating such designs alongside technical mitigation approaches in real-world LLM deployments.

Abstract

Large language models (LLMs) are prone to hallucinations, i.e., nonsensical, unfaithful, and undesirable text. Users tend to overrely on LLMs and corresponding hallucinations, which can lead to misinterpretations and errors. To tackle the problem of overreliance, we propose HILL, the "Hallucination Identifier for Large Language Models". First, we identified design features for HILL with a Wizard of Oz approach with nine participants. Subsequently, we implemented HILL based on the identified design features and evaluated HILL's interface design by surveying 17 participants. Further, we investigated HILL's functionality to identify hallucinations based on an existing question-answering dataset and five user interviews. We find that HILL can correctly identify and highlight hallucinations in LLM responses, which enables users to handle LLM responses with more caution. With that, we propose an easy-to-implement adaptation to existing LLMs and demonstrate the relevance of user-centered designs of AI artifacts.

Paper Structure

This paper contains 25 sections, 1 equation, 2 figures, and 5 tables.

Figures (2)

  • Figure 1: The positioning and delimitation of this study. While current approaches focus on correcting LLM hallucinations, we aim to empower users not to rely on hallucinations (bold). These approaches do not compete with each other but complement one another, and both are important for achieving the shared goal of users not blindly following incorrect LLM output.
  • Figure 2: Screenshot of the prototypical interface including the features of Group 1.