Transforming Role Classification in Scientific Teams Using LLMs and Advanced Predictive Analytics
Wonduk Seo, Yi Bu
TL;DR
This work addresses the classification of author roles in scientific teams, moving beyond self-reports and static clustering by combining large language models (LLMs) with predictive analytics. It uses few-shot prompting with GPT-4 and open-source LLMs to produce fine-grained role labels (Leadership, Direct Support, Indirect Support), then trains a dense neural network on ten OpenAlex-derived features to classify roles at scale. The study reports that GPT-4 is the most accurate LLM for role labeling, that the predictive model achieves an F1 score of approximately 0.76, and that, according to SHAP analysis, Probability of Leading and related features are pivotal for identifying leadership. The approach promises scalable, context-aware analysis of team dynamics and leadership distribution, with implications for monitoring collaboration patterns and informing research management, while recognizing limitations in the accessibility and data coverage of OpenAlex.
Abstract
Scientific team dynamics are critical in determining the nature and impact of research outputs. However, existing methods for classifying author roles, which rely on self-reports and clustering, lack comprehensive contextual analysis of contributions. We therefore present a transformative approach to classifying author roles in scientific teams using large language models (LLMs), which offers a more refined analysis than traditional clustering methods. Specifically, we seek to complement and enhance these traditional methods by using open-source and proprietary LLMs, such as GPT-4, Llama3 70B, Llama2 70B, and Mixtral 8x7B, for role classification. Using few-shot prompting, we categorize author roles and show that GPT-4 outperforms the other models across multiple categories and also surpasses traditional approaches such as XGBoost and BERT. Our methodology also includes a predictive deep learning model built on 10 features. Training this model on a dataset derived from the OpenAlex database, which provides detailed metadata on academic publications (author publication history, author affiliation, research topics, and citation counts), yields an F1 score of 0.76, demonstrating robust classification of author roles.
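To make the few-shot prompting step concrete, the following is a minimal sketch of how a role-labeling prompt could be assembled and how a model's reply could be mapped back to one of the three role labels. It is not the authors' prompt or pipeline: the feature names (`prior_papers`, `author_position`), the prompt wording, and the helper functions are hypothetical, and no LLM API is called.

```python
from dataclasses import dataclass

# The three role labels named in the paper.
ROLES = ("Leadership", "Direct Support", "Indirect Support")


@dataclass
class Example:
    features: dict  # hypothetical author-level metadata, not the paper's feature set
    role: str       # one of ROLES


def format_features(features):
    """Render a feature dict as a single 'key=value; key=value' line."""
    return "; ".join(f"{key}={value}" for key, value in features.items())


def build_few_shot_prompt(examples, query_features):
    """Build a prompt with labeled examples followed by one unlabeled query."""
    lines = [
        "Classify the author's role in the team as one of: "
        + ", ".join(ROLES) + ".",
        "",
    ]
    for ex in examples:
        lines.append("Author: " + format_features(ex.features))
        lines.append("Role: " + ex.role)
        lines.append("")
    lines.append("Author: " + format_features(query_features))
    lines.append("Role:")
    return "\n".join(lines)


def parse_role(completion):
    """Map a free-text model reply to a role label, or None if none matches."""
    text = completion.strip().lower()
    # Check longer labels first so "Indirect Support" is not read as "Direct Support".
    for role in sorted(ROLES, key=len, reverse=True):
        if role.lower() in text:
            return role
    return None


if __name__ == "__main__":
    shots = [
        Example({"prior_papers": 40, "author_position": "last"}, "Leadership"),
        Example({"prior_papers": 3, "author_position": "first"}, "Direct Support"),
    ]
    print(build_few_shot_prompt(shots, {"prior_papers": 5, "author_position": "middle"}))
```

The resulting string would be sent to whichever LLM is being evaluated, and `parse_role` would be applied to its reply; replies that match no label return `None` so they can be counted or retried rather than silently mislabeled.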
