Deciphering Scientific Collaboration in Biomedical LLM Research: Dynamics, Institutional Participation, and Resource Disparities
Lingyao Li, Zhijie Duan, Xuexin Li, Xiaoran Xu, Zhaoqian Xue, Siyuan Ma, Jin Jin
TL;DR
This paper investigates how LLMs reshape collaboration in biomedical research by analyzing 5,674 PubMed-indexed LLM-related papers alongside ML and general biomedical controls. It employs Shannon entropy to quantify collaboration diversity across institutions, disciplines, and countries, and uses network analysis to identify hub and bridging entities, while NIH FY2024 funding serves as a proxy for institutional resources. The findings show increasing collaboration diversity and a declining share of CS/AI authors in LLM work, but a centralized structure anchored by a core set of institutions and disciplines; resource levels strongly relate to output and influence, with strategic collaborations enabling resource-constrained institutions to achieve greater visibility. These results highlight both democratizing trends and persistent resource-based disparities, underscoring the importance of targeted collaboration strategies to promote equitable advancement in LLM-driven biomedicine.
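As a rough illustration of the diversity measure named above, the following is a minimal sketch of Shannon entropy computed over a list of category labels (for example, the institutions, disciplines, or countries attached to a set of authors). The function name, the base-2 logarithm, and the choice of what counts as one label are assumptions made for illustration; the summary does not specify the paper's exact unit of analysis or normalization.

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Shannon entropy (in bits) of the label distribution in `labels`.

    Hypothetical helper: the log base and the definition of a label
    are illustrative assumptions, not the paper's stated procedure.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # H = sum_i p_i * log2(1 / p_i), with p_i = count_i / total
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Four equally represented (hypothetical) disciplines give the maximum
# entropy for four categories; a single discipline gives zero.
even_mix = ["Medicine", "Computer Science", "Biology", "Public Health"]
single = ["Medicine", "Medicine", "Medicine"]
print(shannon_entropy(even_mix), shannon_entropy(single))
```

Higher values indicate that contributions are spread more evenly across more categories, which is the sense in which entropy captures collaboration diversity.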
Abstract
Large language models (LLMs) are increasingly transforming biomedical discovery and clinical innovation, yet their impact extends far beyond an algorithmic revolution: LLMs are restructuring how scientific collaboration occurs, who participates, and how resources shape innovation. How this rapid technological shift is reshaping the structure and equity of scientific collaboration in biomedical LLM research nonetheless remains largely unknown. By analyzing 5,674 LLM-related biomedical publications from PubMed, we examine how collaboration diversity evolves over time, identify the institutions and disciplines that anchor and bridge collaboration networks, and assess how resource disparities underpin research performance. We find that collaboration diversity has grown steadily while the share of Computer Science and Artificial Intelligence authors has decreased, suggesting that LLMs are lowering technical barriers for biomedical investigators. Network analysis reveals central institutions, including Stanford University and Harvard Medical School, and bridging disciplines, such as Medicine and Computer Science, that anchor collaborations in this field. Furthermore, biomedical research resources are strongly linked to research performance, and high-performing resource-constrained institutions collaborate more heavily with the top 1% most connected institutions in the network. Together, these findings reveal a complex landscape in which democratizing trends coexist with a persistent, resource-driven hierarchy, highlighting the critical role of strategic collaboration in this evolving field.
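To make the idea of "most connected institutions" concrete, here is a minimal sketch that builds an institution co-authorship count from per-paper affiliation lists and ranks institutions by weighted degree. The function name, the use of weighted degree as the connectivity measure, and the placeholder institution names are all assumptions for illustration; the abstract does not state which centrality measure the authors used.

```python
from collections import Counter
from itertools import combinations


def weighted_degree(papers):
    """Count co-authorship ties per institution.

    `papers` is a list of affiliation lists, one per paper. Each pair of
    distinct institutions on the same paper adds one tie to both.
    Hypothetical helper; not the paper's stated method.
    """
    degree = Counter()
    for institutions in papers:
        # Deduplicate so repeated affiliations on one paper count once.
        for a, b in combinations(sorted(set(institutions)), 2):
            degree[a] += 1
            degree[b] += 1
    return degree


# Placeholder data: three papers with hypothetical institutions.
papers = [
    ["Inst A", "Inst B"],
    ["Inst A", "Inst C"],
    ["Inst A", "Inst B", "Inst C"],
]
ranking = weighted_degree(papers).most_common()
print(ranking)
```

Under this sketch, a hub would be an institution near the top of `ranking`, and a cutoff such as the top 1% could be taken from the sorted list.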
