HerO at AVeriTeC: The Herd of Open Large Language Models for Verifying Real-World Claims

Yejun Yoon, Jaeyoon Jung, Seunghyun Yoon, Kunwoo Park

TL;DR

HerO (the Herd of Open LLMs for verifying real-world claims) is a system that employs only publicly available large language models (LLMs) for each step of automated fact-checking. It achieved 2nd place on the AVeriTeC shared task leaderboard, suggesting the potential of open LLMs for verifying real-world claims.

Abstract

To tackle the AVeriTeC shared task hosted by FEVER-24, we introduce a system that employs only publicly available large language models (LLMs) for each step of automated fact-checking, dubbed the Herd of Open LLMs for verifying real-world claims (HerO). For evidence retrieval, a language model enhances the query by generating hypothetical fact-checking documents. For question generation and veracity prediction, we prompt pretrained and fine-tuned LLMs with prompts that include retrieved in-context samples. HerO achieved 2nd place on the leaderboard with an AVeriTeC score of 0.57, suggesting the potential of open LLMs for verifying real-world claims. To support future research, we make our code publicly available at https://github.com/ssu-humane/HerO.
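The query-enhancement idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `generate_hypothetical_docs` is a hypothetical stand-in that returns canned text instead of calling an LLM, concatenating the claim with the generated documents is an assumption, and the bag-of-words cosine similarity is a deliberately simple substitute for whatever retriever the actual system uses.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def generate_hypothetical_docs(claim):
    """Hypothetical stand-in for an LLM call (assumption, not the paper's prompt)."""
    return [
        f"Fact check: {claim} Official records and statistics "
        "were reviewed to verify this."
    ]


def retrieve(claim, corpus, top_k=2):
    """Rank corpus sentences against the claim plus its hypothetical documents."""
    # Assumption: the enhanced query is the claim concatenated with the generated text.
    enhanced_query = " ".join([claim] + generate_hypothetical_docs(claim))
    query_vec = Counter(tokenize(enhanced_query))
    scored = [(cosine(query_vec, Counter(tokenize(s))), s) for s in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [sentence for _, sentence in scored[:top_k]]


corpus = [
    "Official statistics show unemployment fell in 2020.",
    "The recipe calls for two cups of flour.",
    "Records reviewed by the agency verify the unemployment figures.",
]
results = retrieve("Unemployment fell in 2020.", corpus)
```

In this toy corpus, the third sentence shares only the word "unemployment" with the claim, but the hypothetical document adds terms such as "records", "reviewed", and "verify" to the query. That vocabulary is what the expansion is meant to supply: words typical of fact-checking text that the bare claim lacks.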

Paper Structure

This paper contains 15 sections, 1 equation, 4 figures, 5 tables.

Figures (4)

  • Figure 1: Inference pipeline of our system
  • Figure 2: An example of the instruction prompt used for HyDE-FC and its output. The bold text is the instruction, the italic text is a claim, and the blue text indicates the model output.
  • Figure 3: An example of the instruction prompt and its output for question generation. The bold text indicates the instruction, the italic text is a claim, the gray text is retrieved in-context samples, and the blue text indicates the model output.
  • Figure 4: An example of the instruction prompt and its output for veracity prediction. The bold text indicates the instruction, the italic text is a claim, the gray text is retrieved QA pairs, and the blue text indicates the model output.