Enhanced Protein Intrinsic Disorder Prediction Through Dual-View Multiscale Features and Multi-objective Evolutionary Algorithm

Shaokuan Wang; Pengshan Cui; Yining Qian; An-Yang Lu; Xianpeng Wang

TL;DR

D2MOE combines the feature extraction capabilities of deep learning with the global search advantages of evolutionary algorithms, enabling efficient feature integration without manual design, and providing a robust computational tool for protein disorder prediction.

Abstract

Intrinsically disordered regions of proteins play a crucial role in cell signaling and drug discovery. However, their high structural flexibility makes accurate residue-level prediction challenging. Existing methods often rely on single-view representations or rigid, manually designed fusion strategies, which fail to balance the interplay between local amino acid preferences and long-range sequence patterns. To address these limitations, we propose D2MOE, a two-stage method that combines Dual-View Multiscale Features with a Multi-objective Evolutionary Algorithm (MOEA). First, a dual-view multiscale feature extraction method integrates an evolutionary view with a deep semantic view and employs multiscale extractors to capture structural information across diverse receptive fields. Second, an MOEA searches for fusion architectures: by co-evolving discrete feature selection and continuous fusion weights, it explores cross-feature architectures that improve predictive accuracy while keeping the model compact. Experimental results on three benchmark datasets (TS115, CASP12, and CB513) demonstrate that D2MOE consistently outperforms state-of-the-art methods. D2MOE combines the feature extraction capabilities of deep learning with the global search advantages of evolutionary algorithms, enabling efficient feature integration without manual design and providing a robust computational tool for protein disorder prediction.
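To make the co-evolution of discrete feature selection and continuous fusion weights more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a chromosome made of a binary selection mask, a single operator index, and one real-valued weight per candidate feature, and it assumes element-wise definitions for the four operators named in the Figure 1 caption (Add, Max, Min, Mul). The names `random_individual`, `decode`, and `dominates` are hypothetical, and the random vectors stand in for real extractor outputs.

```python
import random

# Operator names come from the Figure 1 caption; these element-wise
# definitions are an assumption, not the paper's stated implementation.
FUSION_OPS = {
    "Add": lambda a, b: a + b,
    "Max": max,
    "Min": min,
    "Mul": lambda a, b: a * b,
}
OP_NAMES = list(FUSION_OPS)


def random_individual(n_features, rng):
    """Hypothetical mixed encoding: binary mask + operator id + real weights."""
    mask = [rng.randint(0, 1) for _ in range(n_features)]
    if not any(mask):  # keep at least one feature selected
        mask[rng.randrange(n_features)] = 1
    return {
        "mask": mask,
        "op": rng.randrange(len(OP_NAMES)),
        "weights": [rng.random() for _ in range(n_features)],
    }


def decode(individual, features):
    """Fuse the selected feature vectors element-wise.

    features: list of n_features vectors (lists of floats) of equal length.
    """
    op = FUSION_OPS[OP_NAMES[individual["op"]]]
    selected = [i for i, bit in enumerate(individual["mask"]) if bit]
    first = selected[0]
    fused = [individual["weights"][first] * x for x in features[first]]
    for i in selected[1:]:
        w = individual["weights"][i]
        fused = [op(f, w * x) for f, x in zip(fused, features[i])]
    return fused


def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives minimised)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


# Usage with placeholder data: 12 candidate features, each an 8-dim vector.
rng = random.Random(0)
features = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(12)]
individual = random_individual(12, rng)
fused = decode(individual, features)
n_selected = sum(individual["mask"])  # a possible compactness objective
```

In a multi-objective loop, each individual would be scored on at least two objectives, for example a validation error and the number of selected features, and `dominates` would be used to keep the non-dominated set. Reading "predictive accuracy while maintaining model compactness" as exactly these two objectives is an assumption of this sketch.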

Paper Structure (31 sections, 8 equations, 5 figures, 5 tables, 2 algorithms)

Introduction
Related Work
Dual-View Multiscale Features Strategy
Multi-objective Evolutionary Algorithm
Methodology
Dual-View Multiscale Features
Dual-View Representation
Semantic view from ProtT5 embeddings
Evolutionary view from HHblits-HMM profiles
Multiscale Feature Extraction
Multi-objective Evolutionary Algorithm
Encoding
Decoding
Workflow of the MOEA
Population initialization
...and 16 more sections

Figures (5)

Figure 1: Overview of D2MOE. (A) Dual-view multiscale feature extraction. Protein sequences are mapped into semantic (ProtT5) and evolutionary (HMM) views, which are then passed through multiscale CNN and RNN extractors to generate complementary feature representations. (B) Adaptive feature fusion, driven by (C) the MOEA. The 12 candidate features are encoded and evolved using a multi-objective algorithm to simultaneously optimize feature selection, fusion operators (Add, Max, Min, Mul), and fusion weights.
Figure 2: Performance comparison of D2MOE and seven representative predictors on three benchmark datasets. (a-c) Quantitative results on TS115, CASP12 and CB513, reporting MCC and F1 (bars) together with AUC and AUPR (lines). (d, e) ROC and PR curves on CASP12; AUC and AUPR are reported in the legends.
Figure 3: Validation of the dual-view design on TS115, CASP12 and CB513. D2MOE denotes the dual-view model that integrates ProtT5 embeddings with HHblits-HMM profiles, whereas T5 and HMM use the semantic view or the evolutionary view alone. Bars report MCC and F1 (left axis), and lines report AUC and AUPR (right axis).
Figure 4: Radar comparison of five fusion schemes (Add, Mul, Max, Min, and D2MOE) on (a) TS115, (b) CASP12, and (c) CB513 in terms of MCC, AUC, AUPR, and F1 (larger radius indicates better performance). D2MOE encloses all fixed operators on TS115 and CASP12, and remains dominant on CB513 except that Min is slightly higher on AUPR.
Figure 5: The best fused architecture using MOEA.
