SC4ANM: Identifying Optimal Section Combinations for Automated Novelty Prediction in Academic Papers
Wenqing Wu, Chengzhi Zhang, Tong Bao, Yi Zhao
TL;DR
This study tackles automatic novelty assessment by analyzing how different paper sections signal novelty. Using peer-review novelty scores from ICLR 2022–2023 as ground truth, it develops SC4ANM to predict novelty from various section combinations and evaluates both pretrained language models (PLMs) and large language models (LLMs). The results show that three-section inputs that include the Introduction, especially Introduction+Results+Discussion, provide the strongest signals for PLMs (with SciBERT achieving up to about 0.68 accuracy), while LLMs under zero-shot conditions underperform in accuracy but can exhibit meaningful correlations, particularly GPT-4o. Compared to a traditional citation-based baseline, the proposed approach with summary and content from multiple sections yields superior accuracy, F1, and correlations, highlighting the value of leveraging broader textual information for novelty assessment. The work provides practical guidance for automated review systems and suggests avenues for cross-domain extension and model integration to improve automated novelty scoring.
Abstract
Novelty is a core component of academic papers, and there are multiple perspectives on the assessment of novelty. Existing methods often focus on word or entity combinations, which provide limited insights. The content related to a paper's novelty is typically distributed across different core sections, e.g., Introduction, Methodology and Results. Therefore, exploring the optimal combination of sections for evaluating the novelty of a paper is important for advancing automated novelty assessment. In this paper, we utilize different combinations of sections from academic papers as inputs to drive language models to predict novelty scores. We then analyze the results to determine the optimal section combinations for novelty score prediction. We first employ natural language processing techniques to identify the sectional structure of academic papers, categorizing the sections into introduction, methods, results, and discussion (IMRaD). Subsequently, we use different combinations of these sections (e.g., introduction and methods) as inputs for pretrained language models (PLMs) and large language models (LLMs), employing novelty scores provided by human expert reviewers as ground truth labels to obtain prediction results. The results indicate that using introduction, results and discussion is most appropriate for assessing the novelty of a paper, while the use of the entire text does not yield significant results. Furthermore, based on the results of the PLMs and LLMs, the introduction and results appear to be the most important sections for the task of novelty score prediction. The code and dataset for this paper can be accessed at https://github.com/njust-winchy/SC4ANM.
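To make the idea of "section combinations" concrete, the following is a minimal sketch of how the candidate inputs could be enumerated and assembled once a paper has been split into IMRaD sections. It is not taken from the SC4ANM repository: the function names `section_combinations` and `build_input`, the dictionary layout of `paper`, and the separator are all assumptions made for illustration. Note also that joining all four sections is not necessarily the same as the "entire text" setting mentioned above, since a full paper may contain material outside the IMRaD categories.

```python
from itertools import combinations

# IMRaD section labels as named in the abstract.
SECTIONS = ("introduction", "methods", "results", "discussion")


def section_combinations(sections=SECTIONS):
    """Return every non-empty subset of sections, keeping document order."""
    combos = []
    for size in range(1, len(sections) + 1):
        combos.extend(combinations(sections, size))
    return combos


def build_input(paper, combo, sep="\n\n"):
    """Join the text of the chosen sections; absent or empty sections are skipped.

    `paper` is a hypothetical dict mapping a section label to its text.
    """
    return sep.join(paper[name] for name in combo if paper.get(name))


# Placeholder paper used only to illustrate the data layout.
paper = {
    "introduction": "INTRO TEXT",
    "methods": "METHODS TEXT",
    "results": "RESULTS TEXT",
    "discussion": "DISCUSSION TEXT",
}

all_combos = section_combinations()
model_input = build_input(paper, ("introduction", "results", "discussion"))
```

Each assembled string would then be passed to a model together with the reviewer-assigned novelty score as its label; the four IMRaD sections give 15 non-empty combinations to compare.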
