Zero-Shot Multi-task Hallucination Detection

Patanjali Bhamidipati, Advaith Malladi, Manish Shrivastava, Radhika Mamidi

TL;DR

This work formalizes hallucination in natural language generation and proposes a zero-shot, task-aware detection framework that reframes detection as Natural Language Inference (NLI) across three tasks: definition modelling (DM), machine translation (MT), and paraphrase generation (PG). Using the SHROOM dataset in model-aware (MAw) and model-agnostic (MAg) settings, the authors show that entailment checks between model outputs and inputs or targets identify hallucinations with competitive accuracy (0.78 MAw, 0.61 MAg) while remaining computationally lightweight. They demonstrate that definition modelling is well served by a single entailment check, whereas paraphrase generation and machine translation require bidirectional semantic equivalence to flag hallucinations. The study provides concrete definitions, a practical methodology, and a framework adaptable to additional NLG tasks, supporting more reliable evaluation and safety in LLM-based generation.

Abstract

The extensive use of large language models in recent studies has underscored the importance of robust evaluation methodologies for assessing text generation quality and relevance to specific tasks. This has revealed a prevalent issue known as hallucination, an emergent condition in the model where generated text lacks faithfulness to the source and deviates from the evaluation criteria. In this study, we formally define hallucination and propose a framework for its quantitative detection in a zero-shot setting, leveraging our definition and the assumption that model outputs entail task- and sample-specific inputs. In detecting hallucinations, our solution achieves an accuracy of 0.78 in a model-aware setting and 0.61 in a model-agnostic setting. Notably, our solution maintains computational efficiency, requiring far fewer computational resources than other SOTA approaches, in line with the trend towards lightweight and compressed models.
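
The task-aware entailment check described above can be illustrated with a minimal sketch. It is not the authors' implementation. The names `is_hallucination` and `toy_overlap_scorer`, the threshold value, and the choice of which text acts as premise in the one-way check are all assumptions made for illustration. In practice the scorer would be an NLI model; here a token-overlap stand-in keeps the example self-contained.

```python
from typing import Callable

# Hypothetical scorer type: returns a score in [0, 1] that `premise` entails `hypothesis`.
EntailmentScorer = Callable[[str, str], float]

ONE_WAY_TASKS = {"DM"}        # definition modelling: single entailment check
BIDIRECTIONAL_TASKS = {"PG", "MT"}  # paraphrase generation, machine translation


def toy_overlap_scorer(premise: str, hypothesis: str) -> float:
    """Stand-in for an NLI model (assumption): fraction of hypothesis
    tokens that also appear in the premise."""
    premise_tokens = set(premise.lower().split())
    hypothesis_tokens = hypothesis.lower().split()
    if not hypothesis_tokens:
        return 1.0
    hits = sum(token in premise_tokens for token in hypothesis_tokens)
    return hits / len(hypothesis_tokens)


def is_hallucination(
    task: str,
    reference: str,
    output: str,
    scorer: EntailmentScorer,
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> bool:
    """Flag `output` as a hallucination when the required entailment fails.

    `reference` is the task-specific input or target (which one is used
    per task is left to the caller in this sketch).
    """
    # Output as premise, reference as hypothesis: follows the stated
    # assumption that model outputs entail the task-specific inputs.
    output_entails_reference = scorer(output, reference)
    if task in ONE_WAY_TASKS:
        return output_entails_reference < threshold
    if task in BIDIRECTIONAL_TASKS:
        # Semantic equivalence: entailment must hold in both directions.
        reference_entails_output = scorer(reference, output)
        return min(output_entails_reference, reference_entails_output) < threshold
    raise ValueError(f"unsupported task: {task!r}")


if __name__ == "__main__":
    reference = "a small cat"
    output = "a small domestic cat"
    for task in ("DM", "PG"):
        flagged = is_hallucination(task, reference, output, toy_overlap_scorer, 0.8)
        print(task, flagged)
```

With the same pair of texts, the one-way check accepts the output while the bidirectional check flags it, because the extra word in the output is not supported by the reference. This mirrors the distinction the summary draws between DM and the PG and MT tasks.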

Paper Structure (11 sections, 8 tables)