HeMeNet: Heterogeneous Multichannel Equivariant Network for Protein Multitask Learning
Rong Han, Wenbing Huang, Lingxiao Luo, Xinyan Han, Jiaming Shen, Zhiqiang Zhang, Jun Zhou, Ting Chen
TL;DR
The paper tackles data sparsity in structure-based protein tasks by introducing Protein-MT, a six-task benchmark that combines ligand binding affinity (LBA), protein-protein affinity (PPA), Enzyme Commission (EC), and Gene Ontology (GO) data. It proposes HeMeNet, an $E(3)$-equivariant, heterogeneous multichannel GNN whose task-aware readout enables joint learning across diverse inputs and tasks. In both single-task and multi-task settings, HeMeNet achieves state-of-the-art results on most tasks, with substantial cross-task gains in affinity prediction, which indicates effective knowledge transfer between structure-based property prediction and binding affinity prediction. The work provides a generalist framework for multitask protein learning that could support more integrated structure- and function-aware drug discovery pipelines.
Abstract
Understanding and leveraging the 3D structures of proteins is central to a variety of biological and drug discovery tasks. While deep learning has been applied successfully to structure-based protein function prediction, current methods usually train a separate model for each task. However, each task has only a small labeled dataset, so this single-task strategy limits the models' performance and generalization ability. Because some labeled 3D protein datasets are biologically related, combining multi-source datasets for larger-scale multi-task learning is one way to overcome this problem. In this paper, we propose a neural network model that addresses multiple tasks jointly, taking 3D protein structures as input. In particular, we first construct a standard structure-based multi-task benchmark called Protein-MT, consisting of 6 biologically relevant tasks, covering both affinity prediction and property prediction, integrated from 4 public datasets. Then, we develop a novel graph neural network for multi-task learning, dubbed Heterogeneous Multichannel Equivariant Network (HeMeNet), which is $E(3)$-equivariant and able to capture heterogeneous relationships between different atoms. In addition, HeMeNet achieves task-specific learning via a task-aware readout mechanism. Extensive evaluations on our benchmark verify the effectiveness of multi-task learning, and our model generally surpasses state-of-the-art models.
