Enabling AI Scientists to Recognize Innovation: A Domain-Agnostic Algorithm for Assessing Novelty
Yao Wang, Mingxuan Cui, Arthur Jiang, Jun Yan
TL;DR
The paper addresses automated novelty assessment for research ideas in AI-driven scientific discovery and introduces Relative Neighbor Density (RND), a domain-agnostic metric that quantifies novelty by comparing an idea's local neighbor density with the densities of its neighbors in the adjacent literature. It leverages large-scale semantic embeddings of PubMed and ArXiv papers and a label-free validation framework to demonstrate state-of-the-art AUROC in the computer science and biomedical domains (0.820 and 0.765, respectively), with robust cross-domain accuracy (0.795). RND outperforms both LLM-based judges and existing density metrics, particularly in cross-domain scenarios, because it considers the distribution of neighbor densities rather than absolute density alone. The work discusses its limitations (embedding and database quality, and test-set approximations) and outlines future directions, including integration into AI research workflows and use as a reinforcement learning reward signal to guide novel idea generation.
Abstract
In the pursuit of Artificial General Intelligence (AGI), automating the generation and evaluation of novel research ideas is a key challenge in AI-driven scientific discovery. This paper presents Relative Neighbor Density (RND), a domain-agnostic algorithm for assessing the novelty of research ideas that overcomes the limitations of existing approaches by comparing an idea's local density with the densities of its adjacent neighbors. We first develop a scalable methodology for creating test sets without expert labeling, addressing a fundamental challenge in novelty assessment. Using these test sets, we demonstrate that RND achieves state-of-the-art (SOTA) performance in the computer science (AUROC=0.820) and biomedical research (AUROC=0.765) domains. Most significantly, while SOTA models such as Sonnet-3.7 and existing metrics show domain-specific performance degradation, RND maintains consistent accuracy across domains owing to its domain-invariant property, outperforming all benchmarks by a substantial margin (0.795 vs. 0.597) in cross-domain evaluation. These results validate RND as a generalizable solution for automated novelty assessment in scientific research.
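The core idea of comparing an idea's local density with the densities of its neighbors can be illustrated with a small sketch. The abstract does not give the exact RND formula, so everything below is an assumption made for illustration: density is approximated by the mean cosine distance to the k nearest neighbors, the score is the fraction of neighbors that sit in a denser region than the query, and the names `mean_knn_distance` and `relative_neighbor_density` are hypothetical. The toy 2-D unit vectors stand in for real semantic embeddings.

```python
import numpy as np

def mean_knn_distance(point, corpus, k, exclude=None):
    """Mean cosine distance from `point` to its k nearest corpus rows.

    Assumes all vectors are unit-normalised. `exclude` is the row index of
    `point` itself when it belongs to the corpus, so it is not its own neighbor.
    """
    dists = 1.0 - corpus @ point
    if exclude is not None:
        dists[exclude] = np.inf
    nearest = np.argsort(dists)[:k]
    return float(dists[nearest].mean()), nearest

def relative_neighbor_density(query, corpus, k=2):
    """Illustrative score in [0, 1]; higher means more novel (an assumption).

    Returns the fraction of the query's k neighbors whose own neighborhood is
    denser (smaller mean k-NN distance) than the query's neighborhood.
    """
    q_dist, neighbors = mean_knn_distance(query, corpus, k)
    nb_dists = np.array(
        [mean_knn_distance(corpus[i], corpus, k, exclude=i)[0] for i in neighbors]
    )
    return float((nb_dists < q_dist).mean())

def unit(angle_deg):
    """2-D unit vector at the given angle, used as a toy embedding."""
    a = np.deg2rad(angle_deg)
    return np.array([np.cos(a), np.sin(a)])

# Toy "literature": 11 embeddings packed between 0 and 10 degrees.
corpus = np.stack([unit(a) for a in range(11)])

novel_score = relative_neighbor_density(unit(90.0), corpus)    # far from the cluster
familiar_score = relative_neighbor_density(unit(5.5), corpus)  # inside the cluster

print(novel_score, familiar_score)  # prints: 1.0 0.0
```

Because the score is a rank among neighbors rather than a raw distance, it does not depend on how densely a particular field is populated in the embedding space, which is one plausible reading of the domain-invariant property the abstract describes.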
