Zero-Shot Belief: A Hard Problem for LLMs

John Murzaku, Owen Rambow

TL;DR

Belief detection in text, covering both author beliefs and nested source beliefs, is shown to be hard for LLMs in zero-shot settings. The authors propose unified and hybrid zero-shot frameworks; the hybrid approach achieves a new state of the art on FactBank and reveals that nested belief remains a major challenge. They also demonstrate transferability to ModaFact and provide detailed error analyses and ablations, highlighting both the potential and the limitations of current LLMs for structured belief reasoning. The work informs future directions for multilingual belief understanding and cost-conscious prompting strategies.

Abstract

We present two LLM-based approaches to zero-shot source-and-target belief prediction on FactBank: a unified system that identifies events, sources, and belief labels in a single pass, and a hybrid approach that uses a fine-tuned DeBERTa tagger for event detection. We show that multiple open-source, closed-source, and reasoning-based LLMs struggle with the task. Using the hybrid approach, we achieve new state-of-the-art results on FactBank and offer a detailed error analysis. We then test our approach on the Italian belief corpus ModaFact.

Paper Structure

This paper contains 28 sections, 6 figures, and 6 tables.

Figures (6)

  • Figure 1: Instruction for our zero-shot Unified Belief Annotation. The instruction for FactBank-style event factuality annotation consists of three parts: a brief task description, detailed step-by-step instructions, and the formatting structure. Our CoT instructions are shown at the end of the prompt (Step-by-Step Output).
  • Figure 2: Instruction for our Hybrid Belief Annotation. The instruction for FactBank-style event factuality annotation consists of three parts: a brief task description, detailed step-by-step instructions, and the formatting structure. Our CoT instructions are shown at the end of the prompt (Step-by-Step Output).
  • Figure 3: Oracle Source Normalization Prompt
  • Figure 4: Few Shot Source Normalization Prompt
  • Figure 5: FactBank Single-Token Event Identification Prompt
  • ...and 1 more figure