Bridging AI and Science: Implications from a Large-Scale Literature Analysis of AI4Science
Yutong Xie, Yijun Pan, Hua Xu, Qiaozhu Mei
TL;DR
The paper addresses the gap between the AI and scientific communities by constructing a large-scale, data-driven map of the AI4Science landscape. It uses LLMs to extract scientific problems and AI methods from publications in top science and AI venues (2014–2024), clusters them semantically, and represents their connections as a bipartite graph. Link prediction experiments show that AI4Science connections can be predicted and surface previously underexplored links, highlighting opportunities to broaden AI integration in science. The work releases a public dataset, code, and tools to foster interdisciplinary collaboration and accelerate discovery through deeper AI adoption across scientific domains.
Abstract
Artificial Intelligence has proven to be a transformative tool for advancing scientific research across a wide range of disciplines. However, a significant gap still exists between AI and scientific communities, limiting the full potential of AI methods in driving broad scientific discovery. Existing efforts in identifying and bridging this gap have often relied on qualitative examination of small samples of literature, offering a limited perspective on the broader AI4Science landscape. In this work, we present a large-scale analysis of the AI4Science literature, starting by using large language models to identify scientific problems and AI methods in publications from top science and AI venues. Leveraging this new dataset, we quantitatively highlight key disparities between AI methods and scientific problems, revealing substantial opportunities for deeper AI integration across scientific disciplines. Furthermore, we explore the potential and challenges of facilitating collaboration between AI and scientific communities through the lens of link prediction. Our findings and tools aim to promote more impactful interdisciplinary collaborations and accelerate scientific discovery through deeper and broader AI integration. Our code and dataset are available at: https://github.com/charles-pyj/Bridging-AI-and-Science.
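To make the link prediction framing concrete, the following is a minimal sketch of the idea, not the authors' implementation. It assumes the extracted data can be reduced to (scientific problem, AI method) pairs, builds a bipartite graph from them, and scores each unobserved pair by counting three-step paths (problem, shared method, neighboring problem, candidate method). This path-counting heuristic is swapped in for illustration only; the toy pairs and all function names are hypothetical and do not come from the released dataset or code.

```python
from collections import defaultdict


def build_bipartite(pairs):
    """Index the bipartite graph in both directions."""
    problem_to_methods = defaultdict(set)
    method_to_problems = defaultdict(set)
    for problem, method in pairs:
        problem_to_methods[problem].add(method)
        method_to_problems[method].add(problem)
    return problem_to_methods, method_to_problems


def score_link(problem, method, problem_to_methods, method_to_problems):
    """Count paths problem -> shared method -> other problem -> method."""
    score = 0
    for shared_method in problem_to_methods.get(problem, ()):
        for other_problem in method_to_problems.get(shared_method, ()):
            if other_problem == problem:
                continue
            if method in problem_to_methods[other_problem]:
                score += 1
    return score


def rank_missing_links(pairs):
    """Score every unobserved (problem, method) pair, highest score first."""
    p2m, m2p = build_bipartite(pairs)
    candidates = []
    for problem in p2m:
        for method in m2p:
            if method not in p2m[problem]:
                s = score_link(problem, method, p2m, m2p)
                candidates.append((s, problem, method))
    # Sort by descending score, then alphabetically for a stable order.
    candidates.sort(key=lambda c: (-c[0], c[1], c[2]))
    return candidates


# Hypothetical toy data, invented for illustration only.
pairs = [
    ("protein structure prediction", "transformers"),
    ("protein structure prediction", "graph neural networks"),
    ("molecular property prediction", "transformers"),
    ("molecular property prediction", "graph neural networks"),
    ("molecular property prediction", "diffusion models"),
    ("climate modeling", "convolutional networks"),
]

ranked = rank_missing_links(pairs)
for score, problem, method in ranked[:3]:
    print(score, problem, "<->", method)
```

On this toy graph, the highest-ranked missing link pairs "protein structure prediction" with "diffusion models", because the protein problem shares two methods with a problem that already uses diffusion models. "climate modeling" shares no methods with the other problems, so all of its candidate links score zero, which illustrates why sparsely connected regions of the graph are hard for purely structural heuristics.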
