Graph-level Protein Representation Learning by Structure Knowledge Refinement

Ge Wang; Zelin Zang; Jiangbin Zheng; Jun Xia; Stan Z. Li

TL;DR

This paper proposes Structure Knowledge Refinement (SKR), a framework that uses the structure of the data to estimate the probability that a pair is positive or negative. It also proposes an augmentation strategy that naturally preserves the semantic meaning of the original data and is compatible with the SKR framework.

Abstract

This paper focuses on learning representations at the whole-graph level in an unsupervised manner. Graph-level representation learning plays an important role in a variety of real-world problems such as molecular property prediction, protein structure feature extraction, and social network analysis. The mainstream approach uses contrastive learning to facilitate graph feature extraction, known as Graph Contrastive Learning (GCL). Although effective, GCL inherits some complications of contrastive learning, such as the effect of false negative pairs. Moreover, augmentation strategies in GCL adapt poorly to diverse graph datasets. Motivated by these problems, we propose a novel framework called Structure Knowledge Refinement (SKR), which uses data structure to determine the probability that a pair is positive or negative. We also propose an augmentation strategy that naturally preserves the semantic meaning of the original data and is compatible with our SKR framework. Furthermore, we illustrate the effectiveness of our SKR framework through intuition and experiments. Experimental results on graph-level classification tasks demonstrate that our SKR framework is superior to most state-of-the-art baselines.

Paper Structure (12 sections, 6 equations, 5 figures, 3 tables, 1 algorithm)

Introduction
Related Work
Framework
Architecture of SKR
Augmentation Strategy of SKR
Objective Function of SKR
Comparison of SKR and GCL
Experiments
Datasets and Settings
Results and Observations
Sensitivity Analysis and Ablation Study
Conclusion and Future Work

Figures (5)

Figure 1: The framework of Structure Knowledge Refinement (SKR). Graph-level representations in semantic space are derived from graph data in the original space by a Graph Isomorphism Network (GIN), and augmented graph-level representations are generated by our semantic-preserving augmentation strategy. Semantic-space structure knowledge is then obtained by the structure knowledge extractor, and fuzzy cross-entropy is used to refine the data structure in embedding space, yielding better representations by passing semantic-space structure knowledge into embedding space.
Figure 2: Comparison of SKR and GCL.
Figure 3: Sensitivity analysis of the hyperparameter $\alpha$ of the Dirichlet distribution on different datasets.
Figure 4: Ablation study of Dirichlet pooling on different datasets (with vs. without Dirichlet pooling).
Figure 5: Ablation study of fuzzy cross-entropy on the REDDIT-B dataset (fuzzy cross-entropy vs. normal cross-entropy).
