Hierarchical Molecular Representation Learning via Fragment-Based Self-Supervised Embedding Prediction

Jiele Wu, Haozhe Ma, Zhihan Guo, Thanh Vinh Vo, Tze Yun Leong

TL;DR

This work proposes Graph Semantic Predictive Network (GraSPNet), a hierarchical self-supervised framework that explicitly models both atomic-level and fragment-level semantics, allowing it to learn multi-resolution structural information that is both expressive and transferable.

Abstract

Graph self-supervised learning (GSSL) has demonstrated strong potential for generating expressive graph embeddings without the need for human annotations, making it particularly valuable in domains with high labeling costs such as molecular graph analysis. However, existing GSSL methods mostly focus on node- or edge-level information, often ignoring chemically relevant substructures which strongly influence molecular properties. In this work, we propose Graph Semantic Predictive Network (GraSPNet), a hierarchical self-supervised framework that explicitly models both atomic-level and fragment-level semantics. GraSPNet decomposes molecular graphs into chemically meaningful fragments without predefined vocabularies and learns node- and fragment-level representations through multi-level message passing with masked semantic prediction at both levels. This hierarchical semantic supervision enables GraSPNet to learn multi-resolution structural information that is both expressive and transferable. Extensive experiments on multiple molecular property prediction benchmarks demonstrate that GraSPNet learns chemically meaningful representations and consistently outperforms state-of-the-art GSSL methods in transfer learning settings.

Paper Structure (29 sections, 9 equations, 7 figures, 9 tables)

This paper contains 29 sections, 9 equations, 7 figures, 9 tables.

Figures (7)

  • Figure 1: An example of hierarchical representation learning on a molecular graph. The molecule is represented in a string-based notation (SMILES) and encoded at three semantic levels: node (atoms), fragment (e.g., functional groups), and graph, to support various downstream tasks.
  • Figure 2: Illustration of our graph fragmentation process. Left: graphs $G_1$ and $G_2$ cannot be distinguished by the WL test, whereas their higher-level fragment graphs $F_1$ and $F_2$ have different connectivity and can be distinguished by it. Right: an example of our graph fragmentation. The ring is selected first, followed by the extraction of multiple paths. Articulation points are designated as unique fragment types to prevent cycles in the fragment graph.
  • Figure 3: Overview of the Graph Semantic Predictive Network (GraSPNet) framework. The original molecule is fragmented to form a higher-level fragment graph. Masked node and fragment graphs are input into the context encoder, while the target encoder processes the original unmasked graphs. The predictor uses context representations to predict node and fragment embeddings, and the loss minimizes the distance between the prediction and the target encoder’s representations.
  • Figure 4: Performance of incorporating fragment information after different GINE layers in the context encoder.
  • Figure 5: Distribution of structural components in molecular graphs. Each subplot shows the distribution of (a) fragment sizes, (b) number of rings, (c) number of paths, and (d) number of articulation points per graph. The x-axis represents the count of each structure, and the y-axis shows the number of graphs with that count.
  • ...and 2 more figures
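
The fragmentation idea in the Figure 2 caption (rings first, then paths, with articulation points marked separately) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a molecule given as a plain edge list, treats every edge that lies on a cycle as a ring edge (so fused rings would merge into one fragment), groups the remaining bridge edges into acyclic chain fragments, and uses the standard graph-theoretic definition of an articulation point. The function names (`components`, `is_bridge`, `articulation_points`, `fragment`) and the example molecule are hypothetical.

```python
from collections import defaultdict


def components(nodes, edges):
    """Connected components over `nodes` using only `edges`, as sorted lists."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, comps = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        stack, comp = [start], []
        seen.add(start)
        while stack:
            u = stack.pop()
            comp.append(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        comps.append(sorted(comp))
    return comps


def is_bridge(edge, nodes, edges):
    """An edge is a bridge if deleting it increases the component count."""
    rest = [e for e in edges if e != edge]
    return len(components(nodes, rest)) > len(components(nodes, edges))


def articulation_points(nodes, edges):
    """Nodes whose removal disconnects the graph (brute force, small graphs only)."""
    base = len(components(nodes, edges))
    points = []
    for n in sorted(nodes):
        rest_nodes = [m for m in nodes if m != n]
        rest_edges = [e for e in edges if n not in e]
        if len(components(rest_nodes, rest_edges)) > base:
            points.append(n)
    return points


def fragment(edges):
    """Split a graph into ring fragments, chain fragments, and articulation points."""
    nodes = sorted({n for e in edges for n in e})
    bridges = [e for e in edges if is_bridge(e, nodes, edges)]
    ring_edges = [e for e in edges if e not in bridges]
    # Assumption: every non-bridge edge lies on a cycle, so it belongs to a ring.
    rings = components({n for e in ring_edges for n in e}, ring_edges)
    # Assumption: the remaining bridge edges form the acyclic (path-like) fragments.
    chains = components({n for e in bridges for n in e}, bridges)
    return rings, chains, articulation_points(nodes, edges)


# Hypothetical example: a six-membered ring (atoms 0-5) with a two-atom chain (6, 7).
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (0, 6), (6, 7)]
rings, chains, points = fragment(edges)
print("rings:", rings)
print("chains:", chains)
print("articulation points:", points)
```

In this sketch atom 0 appears in both the ring fragment and the chain fragment, which is the kind of shared junction the caption refers to. Atom 6 is also reported because the graph-theoretic definition includes chain-interior atoms; whether the paper treats such atoms the same way is not stated in the caption.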