Enhancing Multimodal Protein Function Prediction Through Dual-Branch Dynamic Selection with Reconstructive Pre-Training

Xiaoling Luo; Peng Chen; Chengliang Liu; Xiaopeng Jin; Jie Wen; Yumeng Liu; Junsong Wang

TL;DR

This work addresses protein function prediction by integrating multimodal data (sequence, spatial structure, and interactions) in a two-stage learning framework. The first stage applies reconstructive pre-training with the PSSI and PSeI encoder–decoder modules to mine fine-grained features at low semantic levels. The second stage uses a dual-branch architecture with a Bidirectional Interaction Module (BInM) and a Dynamic Selection Module (DSM) to enable deep inter-modal learning and adaptive feature selection. The approach reports significant gains over both unimodal and existing multimodal methods across the three GO branches (BPO, MFO, CCO), and ablation and feature analyses support the contributions of BInM, DSM, and pre-training. The model's ability to extract rich multimodal representations and dynamically tailor feature combinations suggests practical value for scalable, accurate protein function annotation, particularly when the data are noisy and heterogeneous.

Abstract

Multimodal protein features play a crucial role in protein function prediction. However, these features span a wide range of information, from structural data and sequence features to protein attributes and interaction networks, which makes their complex interconnections difficult to decipher. In this work, we propose DSRPGO, a multimodal protein function prediction method built on dynamic selection and reconstructive pre-training. To capture complex protein information, we introduce reconstructive pre-training to mine fine-grained information at low semantic levels. We further propose the Bidirectional Interaction Module (BInM) to facilitate interactive learning among multimodal features. To address the difficulty of hierarchical multi-label classification in this task, we design a Dynamic Selection Module (DSM) that selects the feature representation most conducive to the current protein function prediction. On human datasets, DSRPGO improves significantly on BPO, MFO, and CCO and outperforms other benchmark models.
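To make the idea of dynamic selection concrete, the following is a minimal sketch of one common way to choose among candidate feature representations: score each branch, normalise the scores with a softmax, and form a weighted combination. This is an illustrative assumption, not the DSM described in the paper; the function names (`softmax`, `dynamic_select`), the linear gate, and the toy feature vectors are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_select(branch_features, gate_weights):
    """Hypothetical soft selection over branch representations.

    Each branch is scored by a dot product with a shared gate vector
    (an assumed, simplistic gate), the scores are turned into selection
    weights, and the branches are fused by a weighted sum.
    """
    scores = [
        sum(w * x for w, x in zip(gate_weights, feats))
        for feats in branch_features
    ]
    weights = softmax(scores)
    dim = len(branch_features[0])
    fused = [
        sum(weights[b] * branch_features[b][i] for b in range(len(branch_features)))
        for i in range(dim)
    ]
    return weights, fused


# Toy inputs: two branch representations of one protein (made-up values).
branch_a = [0.2, 0.9, -0.1]
branch_b = [0.7, 0.1, 0.4]
gate = [1.0, -0.5, 0.5]  # assumed gate parameters, normally learned

weights, fused = dynamic_select([branch_a, branch_b], gate)
```

With these toy values the gate scores the second branch higher, so it receives the larger selection weight. A trained model would learn the gate parameters jointly with the classifier rather than fixing them by hand.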
