ProtRLSearch: A Multi-Round Multimodal Protein Search Agent with Large Language Models Trained via Reinforcement Learning

Congying Liu; Taihao Li; Ming Huang; Xingyuan Wei; Peipei Liu; Yiqing Shen; Yanxu Mao; Tiehan Cui

TL;DR

ProtRLSearch is a multi-round protein search agent trained with multi-dimensional reward-based RL. It jointly uses protein sequence and text as multimodal inputs during real-time search to produce high-quality reports.

Abstract

Protein analysis tasks in healthcare settings, such as functional interpretation of disease-related variants and protein-level analysis for clinical research, often require accurate reasoning under protein sequence constraints. To address such tasks, search agents are introduced to retrieve protein-related information, supporting disease-related variant analysis and protein function reasoning in protein-centric inference. However, such search agents are mostly limited to single-round, text-only search, which prevents the protein sequence from being incorporated as a multimodal input into search decision-making. Moreover, they rely on reinforcement learning (RL) supervision that focuses solely on the final answer, so the search process is left unconstrained and deviations in keyword selection and reasoning direction are difficult to identify and correct in time. To address these limitations, we propose ProtRLSearch, a multi-round protein search agent trained with multi-dimensional reward-based RL, which jointly leverages protein sequence and text as multimodal inputs during real-time search to produce high-quality reports. To evaluate how well models integrate protein sequence and text inputs in realistic protein query settings, we construct ProtMCQs, a benchmark of 3,000 multiple-choice questions (MCQs) organized into three difficulty levels. The benchmark covers protein query tasks ranging from sequence-constrained reasoning about protein function and phenotype changes to comprehensive protein reasoning that integrates multi-dimensional sequence features with signaling pathways and regulatory networks.

Paper Structure

This paper contains 15 sections, 2 equations, 3 figures, 3 tables.

Introduction
Related Works
Search Agent for Protein Query Tasks
Reward Design in RL for Protein Tasks with LLMs
Methods
Multimodal Protein Sequence and Text Representation on LLM Backbone
Multi-Round Search with Structured Outputs
Multi-Dimensional Reward Design
Training
Benchmark Design
Experiment
Implementation Details
Main Results
Ablation Study
Conclusion

Figures (3)

Figure 1: A comparison between single-round search methods and the proposed approach, illustrating their differences in the search process. Red annotations indicate error cases such as missing keywords, while green annotations denote correct results.
Figure 2: The overall agent encompasses multimodal encoding, multi-round search trained via RL, and the design of multi-dimensional rewards.
Figure 3: Overview of the constructed training dataset.
