Comparative Performance Evaluation of Large Language Models for Extracting Molecular Interactions and Pathway Knowledge

Gilchan Park; Byung-Jun Yoon; Xihaier Luo; Vanessa López-Marrero; Shinjae Yoo; Shantenu Jha

TL;DR

This study systematically evaluates 15 open-source LLMs for extracting molecular interactions and pathway knowledge from biomedical text, focusing on three tasks: protein-protein interactions (PPIs), KEGG pathways affected by low-dose radiation, and gene regulatory relations. Using the STRING, Negatome, KEGG, and INDRA databases, the authors compare model performance under QA-style prompts with varying shot counts, running inference distributed across GPUs. Overall, larger and instruction-tuned models perform best on complex interaction tasks, and domain-specific models excel at pathway-related predictions, though challenges remain for gene/protein groups with diverse functions and for highly correlated regulatory relations. The findings demonstrate the potential of AI-assisted knowledge discovery in systems biology and outline practical directions for improving reliability, such as retrieval-augmented prompting and parameter-efficient fine-tuning.
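To make the "QA-style prompts with varying shot counts" setup concrete, the following is a minimal sketch of how a k-shot yes/no prompt for a protein pair could be assembled and how predictions could be scored. It is not the authors' implementation: the prompt wording, the function names (`build_prompt`, `parse_answer`, `precision_recall_f1`), the placeholder protein names, and the choice of precision/recall/F1 as the metric are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# (protein_a, protein_b, interacts): a labeled demonstration pair.
# The names used below are placeholders, not real database entries.
Example = Tuple[str, str, bool]


def build_prompt(examples: Sequence[Example], query: Tuple[str, str], shots: int) -> str:
    """Assemble a k-shot yes/no QA prompt (hypothetical wording)."""
    lines: List[str] = []
    # Prepend the first `shots` labeled demonstrations.
    for a, b, label in examples[:shots]:
        lines.append(f"Question: Do the proteins {a} and {b} interact?")
        lines.append(f"Answer: {'yes' if label else 'no'}")
        lines.append("")
    # Append the unanswered query for the model to complete.
    a, b = query
    lines.append(f"Question: Do the proteins {a} and {b} interact?")
    lines.append("Answer:")
    return "\n".join(lines)


def parse_answer(text: str) -> bool:
    """Map free-form model output to a boolean; anything not starting with 'yes' is False."""
    return text.strip().lower().startswith("yes")


def precision_recall_f1(gold: Sequence[bool], pred: Sequence[bool]) -> Tuple[float, float, float]:
    """Binary precision, recall, and F1 with 'interacts' as the positive class."""
    tp = sum(1 for g, p in zip(gold, pred) if g and p)
    fp = sum(1 for g, p in zip(gold, pred) if not g and p)
    fn = sum(1 for g, p in zip(gold, pred) if g and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    demos: List[Example] = [("PROT_A", "PROT_B", True), ("PROT_C", "PROT_D", False)]
    print(build_prompt(demos, ("PROT_X", "PROT_Y"), shots=1))
```

Varying `shots` from zero upward reproduces the zero-shot versus few-shot comparison in spirit; in such a setup, positive demonstrations would plausibly come from an interaction database such as STRING and negatives from a non-interaction resource such as Negatome, though the exact sampling scheme is not described in this summary.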

Abstract

Background: Identification of the interactions and regulatory relations between biomolecules plays a pivotal role in understanding complex biological systems and the mechanisms underlying diverse biological functions. However, the collection of such molecular interactions has historically relied heavily on expert curation, making it labor-intensive and time-consuming. To mitigate these challenges, we propose leveraging the capabilities of large language models (LLMs) to automate genome-scale extraction of this crucial knowledge. Results: In this study, we investigate the efficacy of various LLMs in addressing biological tasks, such as the recognition of protein interactions, the identification of genes linked to pathways affected by low-dose radiation, and the delineation of gene regulatory relationships. Overall, the larger models exhibited superior performance, indicating their potential for specific tasks that involve the extraction of complex interactions among genes and proteins. Although these models possessed detailed information for distinct gene and protein groups, they faced challenges in identifying groups with diverse functions and in recognizing highly correlated gene regulatory relationships. Conclusions: By conducting a comprehensive assessment of state-of-the-art models using well-established molecular interaction and pathway databases, our study reveals that LLMs can identify genes/proteins associated with pathways of interest and predict their interactions to a certain extent. Furthermore, these models can provide important insights, marking a noteworthy stride toward advancing our understanding of biological systems through AI-assisted knowledge discovery.
