A Paragraph-level Multi-task Learning Model for Scientific Fact-Verification
Xiangci Li, Gully Burns, Nanyun Peng
TL;DR
The paper addresses automatic verification of scientific claims, motivated by the spread of misinformation, by proposing a paragraph-level, multi-task model that jointly performs rationale selection and stance prediction. Its compact paragraph encoding feeds the entire claim–paragraph sequence into a single BERT model to produce contextualized sentence representations, which are combined with attention-based and KGAT (Kernel Graph Attention Network) variants for stance prediction. Training uses a multi-task objective with scheduled sampling and negative sampling, plus FEVER pre-training and domain adaptation to cope with the limited SciFact data. The approach achieves state-of-the-art performance on the SciFact leaderboard, and its ablations demonstrate the value of joint optimization and data augmentation for scalable, domain-specific fact verification in scientific literature.
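The compact paragraph encoding described above can be illustrated with a minimal sketch. It only shows the bookkeeping: the claim and all sentences of a paragraph are packed into one token sequence, and each sentence's token span is recorded so that a sentence vector can later be pooled from the encoder's token-level outputs. The function names `build_compact_input` and `pool_sentences`, the use of `[SEP]` between sentences, and mean pooling are assumptions made for illustration. The summary does not specify the separator scheme or the pooling method, and the encoder itself is replaced here by dummy vectors.

```python
from typing import List, Sequence, Tuple


def build_compact_input(
    claim_tokens: Sequence[str],
    sentences: Sequence[Sequence[str]],
    cls: str = "[CLS]",
    sep: str = "[SEP]",
) -> Tuple[List[str], List[Tuple[int, int]]]:
    """Pack claim and paragraph into ONE sequence and record sentence spans.

    Hypothetical layout: [CLS] claim [SEP] sent_1 [SEP] sent_2 [SEP] ...
    Each span is (start, end) with end exclusive and excludes the separator.
    """
    tokens: List[str] = [cls, *claim_tokens, sep]
    spans: List[Tuple[int, int]] = []
    for sent in sentences:
        start = len(tokens)
        tokens.extend(sent)
        spans.append((start, start + len(sent)))
        tokens.append(sep)
    return tokens, spans


def pool_sentences(
    hidden: Sequence[Sequence[float]],
    spans: Sequence[Tuple[int, int]],
) -> List[List[float]]:
    """Mean-pool token vectors inside each span (assumes non-empty sentences).

    `hidden` stands in for the encoder output, one vector per token.
    Mean pooling is an illustrative choice, not necessarily the paper's.
    """
    pooled: List[List[float]] = []
    for start, end in spans:
        rows = hidden[start:end]
        dim = len(rows[0])
        pooled.append([sum(r[d] for r in rows) / len(rows) for d in range(dim)])
    return pooled


if __name__ == "__main__":
    claim = ["aspirin", "reduces", "risk"]
    paragraph = [["we", "study", "aspirin"], ["risk", "fell"]]
    tokens, spans = build_compact_input(claim, paragraph)
    # Dummy "encoder output": a 2-dimensional vector per token.
    hidden = [[float(i), 1.0] for i in range(len(tokens))]
    print(tokens)
    print(spans)
    print(pool_sentences(hidden, spans))
```

Because every sentence is encoded in the same pass, each pooled vector can reflect the claim and the neighboring sentences, which is what distinguishes this from encoding each claim–sentence pair separately.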
Abstract
Even for domain experts, it is a non-trivial task to verify a scientific claim by providing supporting or refuting evidence rationales. The situation worsens as misinformation proliferates on social media and news websites, spread manually or programmatically, at every moment. As a result, an automatic fact-verification tool becomes crucial for combating the spread of misinformation. In this work, we propose a novel, paragraph-level, multi-task learning model for the SciFact task by directly computing a sequence of contextualized sentence embeddings from a BERT model and jointly training the model on rationale selection and stance prediction.
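Joint training on rationale selection and stance prediction can be sketched as a single scalar loss that sums the two task losses. The sketch below is an assumption-laden illustration, not the paper's objective: it assumes a cross-entropy loss over stance classes, a per-sentence binary cross-entropy for rationale selection, and a hypothetical weight `rationale_weight` balancing the two. None of these specifics are given in the abstract.

```python
import math
from typing import Sequence


def cross_entropy(logits: Sequence[float], target: int) -> float:
    """Softmax cross-entropy for one example, computed stably via log-sum-exp."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


def binary_cross_entropy(prob: float, label: int) -> float:
    """Binary cross-entropy for one sentence-level rationale decision."""
    eps = 1e-12  # clamp to avoid log(0)
    p = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def joint_loss(
    stance_logits: Sequence[float],
    stance_label: int,
    rationale_probs: Sequence[float],
    rationale_labels: Sequence[int],
    rationale_weight: float = 1.0,  # hypothetical balancing weight
) -> float:
    """Stance loss plus weighted mean rationale loss for one claim-paragraph pair."""
    stance = cross_entropy(stance_logits, stance_label)
    rationale = sum(
        binary_cross_entropy(p, y) for p, y in zip(rationale_probs, rationale_labels)
    ) / len(rationale_probs)
    return stance + rationale_weight * rationale


if __name__ == "__main__":
    # Three stance classes and two paragraph sentences, all values made up.
    print(joint_loss([2.0, 0.1, -1.0], 0, [0.9, 0.2], [1, 0]))
```

Minimizing one combined loss lets gradients from both tasks update the shared encoder, which is the general motivation for multi-task training.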
