MAPLE: Micro Analysis of Pairwise Language Evolution for Few-Shot Claim Verification

Xia Zeng; Arkaitz Zubiaga

TL;DR

MAPLE tackles few-shot claim verification by exploiting the language-transition signals that emerge while a small seq2seq model is trained on unlabeled claim–evidence pairs. It introduces SemSim, a semantic-similarity measure of pairwise language evolution, and trains a logistic classifier on SemSim features derived from $2*d*e$ generated mutations, where $d$ is the number of unlabeled pairs and $e$ is the number of training epochs, across two training directions (claim-to-evidence and evidence-to-claim). Across FEVER, Climate FEVER, and SciFact, MAPLE outperforms the SEED, PET, and LLaMA 2 baselines, showing strong few-shot performance with minimal labeled data and computational resources. The approach offers practical benefits for real-world fact-checking through efficient deployment, interpretability, and robustness to noisy evidence, and it leaves clear avenues for self-supervised extension and for broader use of SemSim as an NLG metric.

Abstract

Claim verification is an essential step in the automated fact-checking pipeline which assesses the veracity of a claim against a piece of evidence. In this work, we explore the potential of few-shot claim verification, where only very limited data is available for supervision. We propose MAPLE (Micro Analysis of Pairwise Language Evolution), a pioneering approach that explores the alignment between a claim and its evidence with a small seq2seq model and a novel semantic measure. Its innovative utilization of micro language evolution path leverages unlabelled pairwise data to facilitate claim verification while imposing low demand on data annotations and computing resources. MAPLE demonstrates significant performance improvements over SOTA baselines SEED, PET and LLaMA 2 across three fact-checking datasets: FEVER, Climate FEVER, and SciFact. Data and code are available here: https://github.com/XiaZeng0223/MAPLE

Paper Structure (38 sections, 6 figures, 9 tables)

Introduction
Related Work
Few-Shot Learning for Claim Verification
Natural Language Generation (NLG) Metrics
Understanding Language Evolution
Methodology
(1) In-domain seq2seq training.
(2) SemSim transformation.
(3) Logistic classifier training with few-shot labeled data.
Experiments
Datasets
FEVER
cFEVER
SciFact
Baselines
...and 23 more sections

Figures (6)

Figure 1: MAPLE for claim verification. (1) In-domain seq2seq training. With LoRA, a T5-small model is trained on the claim-to-evidence task for $e$ epochs using the $d$ unlabelled claim-evidence pairs from the data pool. At the end of each training epoch $j$, model inference is performed on each instance $i$ to generate a mutation $mutation\_c2e\_i$. This process is repeated in the evidence-to-claim setting. In total, this step produces $2*d*e$ triples, each consisting of a claim $c$, an associated piece of evidence $e$ and a generated mutation $m$. (2) SemSim transformation. Each triple is split into three pairs: the claim-evidence pair $c-e$, the claim-mutation pair $c-m$ and the evidence-mutation pair $e-m$. 'SemSim' scores are obtained for each pair by calculating the cosine similarity between the corresponding sentence embeddings. (3) Logistic classifier training with few-shot labelled data. A logistic classifier is trained on labelled data, where the transformed 'SemSim' scores are used as input features to predict veracity labels.
Figure 2: F1 performance within 5 shots.
Figure 3: Comparison of MAPLE performance using different training algorithms for in-domain seq2seq training. The label "LoRA" represents parameter-efficient training method Low-Rank Adaptation, "SFT" indicates supervised fine-tuning and "NLPO" refers to reinforcement learning with the NLPO policy.
Figure 4: Comparison of MAPLE performance using the proposed 'SemSim' metric and alternative metrics to measure micro pairwise language evolution.
Figure 5: Example signals captured for classification, using the 'SemSim' score for target-mutation pairs on the test set.
...and 1 more figure
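
The SemSim transformation described in the Figure 1 caption can be illustrated with a minimal sketch. This is not the authors' implementation: `toy_embed` is a hypothetical bag-of-words stand-in for a real sentence encoder, the placeholder mutations simply echo the source text instead of coming from a trained T5-small model, and the per-instance feature layout (three scores per direction and epoch, concatenated) is an assumption made for illustration.

```python
import math
from collections import Counter
from typing import Callable, Dict, List, Tuple

Vector = Dict[str, float]


def toy_embed(text: str) -> Vector:
    """Hypothetical stand-in for a sentence encoder: bag-of-words counts."""
    return {tok: float(n) for tok, n in Counter(text.lower().split()).items()}


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def semsim_features(
    claim: str,
    evidence: str,
    mutation: str,
    embed: Callable[[str], Vector] = toy_embed,
) -> Tuple[float, float, float]:
    """Scores for the c-e, c-m and e-m pairs of one (claim, evidence, mutation) triple."""
    c, e, m = embed(claim), embed(evidence), embed(mutation)
    return cosine(c, e), cosine(c, m), cosine(e, m)


def build_feature_matrix(
    pairs: List[Tuple[str, str]],
    mutations: Dict[str, List[List[str]]],
    embed: Callable[[str], Vector] = toy_embed,
) -> List[List[float]]:
    """One row per claim-evidence pair.

    mutations[direction][epoch][i] is the text generated for pair i.
    The row layout (direction-major, then epoch) is an assumption.
    """
    rows = []
    for i, (claim, evidence) in enumerate(pairs):
        row: List[float] = []
        for direction in ("c2e", "e2c"):
            for epoch_outputs in mutations[direction]:
                row.extend(semsim_features(claim, evidence, epoch_outputs[i], embed))
        rows.append(row)
    return rows


# Toy data: d = 2 unlabelled pairs, e = 3 epochs.
pairs = [
    ("the sky is blue", "observations show the sky is blue"),
    ("cats can fly", "cats are mammals that cannot fly"),
]
epochs = 3
# Placeholder mutations that echo the source text of each direction.
mutations = {
    "c2e": [[claim for claim, _ in pairs] for _ in range(epochs)],
    "e2c": [[evidence for _, evidence in pairs] for _ in range(epochs)],
}

X = build_feature_matrix(pairs, mutations)
n_triples = sum(len(outputs) for runs in mutations.values() for outputs in runs)
print(n_triples, len(X), len(X[0]))
```

With $d = 2$ and $e = 3$, the toy data yields $2*d*e = 12$ triples. The rows of `X` could then be passed, together with the few available veracity labels, to any logistic classifier, for example scikit-learn's `LogisticRegression`.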
