SHROOM-INDElab at SemEval-2024 Task 6: Zero- and Few-Shot LLM-Based Classification for Hallucination Detection

Bradley P. Allen, Fina Polat, Paul Groth

TL;DR

SHROOM-INDElab is a zero- and few-shot LLM-based classifier for hallucination detection in SemEval-2024 Task 6, built on prompts that define the task, role, and target concept, with Self-Adaptive Prompting used to generate few-shot examples. It uses temperature sampling to estimate hallucination probabilities and an embedding-based diversity–consistency trade-off to select those examples, and it reaches substantial agreement with human labellers on the validation sets. The system ranks 4th in the model-agnostic track and 6th in the model-aware track, suggesting that carefully structured prompts can yield hallucination detectors that align with crowd judgments. The findings underscore the value of explicit concept definitions for LLM-based hallucination detection, with room for further improvement in example selection and rationale generation.
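The idea of estimating a hallucination probability by temperature sampling can be illustrated with a minimal sketch. It assumes the classifier is queried several times at a non-zero temperature and that the probability is taken as the fraction of samples labelled "Hallucination". The function names, the label strings, the 0.5 decision threshold, and the stub classifier are all hypothetical choices made for illustration; they are not taken from the authors' code.

```python
from collections import Counter
from typing import Callable, Iterable, Tuple

# Assumed label strings; the real system's output format may differ.
HALLUCINATION = "Hallucination"
NOT_HALLUCINATION = "Not Hallucination"


def estimate_hallucination_probability(
    classify_once: Callable[[str], str],
    prompt: str,
    n_samples: int = 10,
) -> Tuple[str, float]:
    """Query the classifier n_samples times and aggregate the sampled labels.

    classify_once stands in for one LLM call at temperature > 0, so repeated
    calls on the same prompt may return different labels.
    Returns (predicted label, estimated probability of hallucination).
    """
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    counts = Counter(classify_once(prompt) for _ in range(n_samples))
    p_hallucination = counts[HALLUCINATION] / n_samples
    # Assumed decision rule: predict "Hallucination" on a strict majority.
    label = HALLUCINATION if p_hallucination > 0.5 else NOT_HALLUCINATION
    return label, p_hallucination


def make_stub_classifier(labels: Iterable[str]) -> Callable[[str], str]:
    """Hypothetical stand-in for an LLM call: replays a fixed label sequence."""
    iterator = iter(labels)
    return lambda _prompt: next(iterator)
```

With a stub that returns "Hallucination" for 7 of 10 samples, the estimate is 0.7 and the predicted label is "Hallucination". In a real setting, `classify_once` would wrap a call to an LLM using the task, role, and concept-definition prompt.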

Abstract

We describe the University of Amsterdam Intelligent Data Engineering Lab team's entry for the SemEval-2024 Task 6 competition. The SHROOM-INDElab system builds on previous work on using prompt programming and in-context learning with large language models (LLMs) to build classifiers for hallucination detection. It extends that work by incorporating context-specific definitions of the task, role, and target concept, and by automatically generating examples for use in a few-shot prompting approach. The resulting system achieved fourth-best and sixth-best performance in the model-agnostic and model-aware tracks for Task 6, respectively, and evaluation using the validation sets showed that the system's classification decisions were consistent with those of the crowd-sourced human labellers. We further found that a zero-shot approach provided better accuracy than a few-shot approach using automatically generated examples. Code for the system described in this paper is available on GitHub.

Paper Structure

This paper contains 12 sections, 2 equations, 6 figures, 4 tables, and 1 algorithm.

Figures (6)

  • Figure 1: SHROOM-INDElab system workflow.
  • Figure 2: Example prompt for a Stage 2 classifier, given a Definition Modeling task data point from one of the SHROOM datasets, and using 1 example per label.
  • Figure 3: Classifier performance by temperature.
  • Figure 4: Classifier performance by examples per label.
  • Figure 5: Classifier performance by samples per query.
  • ...and 1 more figure