Bridging the Gap Between Molecule and Textual Descriptions via Substructure-aware Alignment

Hyuntae Park, Yeachan Kim, SangKeun Lee

TL;DR

MolBridge introduces substructure-aware alignments to bridge molecules and text, addressing the sparse and coarse-grained nature of prior molecule–text representations. It augments data with explicit substructure–text and phrase–molecule links and trains with substructure-aware contrastive learning plus a self-refinement loop to filter noisy signals. A generative extension, MolBridge-Gen, leverages substructure–phrase relations for captioning and generation, enhancing both discriminative and generative molecular tasks. Across zero-shot retrieval, property prediction, and generation benchmarks, MolBridge consistently outperforms baselines, demonstrating that fine-grained, fragment-level supervision improves chemical understanding and practical downstream performance.

Abstract

Molecule and text representation learning has gained increasing interest due to its potential for enhancing the understanding of chemical information. However, existing models often struggle to capture subtle differences between molecules and their descriptions, as they lack the ability to learn fine-grained alignments between molecular substructures and chemical phrases. To address this limitation, we introduce MolBridge, a novel molecule-text learning framework based on substructure-aware alignments. Specifically, we augment the original molecule-description pairs with additional alignment signals derived from molecular substructures and chemical phrases. To effectively learn from these enriched alignments, MolBridge employs substructure-aware contrastive learning, coupled with a self-refinement mechanism that filters out noisy alignment signals. Experimental results show that MolBridge effectively captures fine-grained correspondences and outperforms state-of-the-art baselines on a wide range of molecular benchmarks, highlighting the significance of substructure-aware alignment in molecule-text learning.
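The abstract names two ingredients, contrastive learning over aligned pairs and a filter for noisy alignment signals, without giving their exact form here. The following is a minimal sketch of the general idea only, not the authors' objective: it assumes a standard symmetric InfoNCE loss over cosine similarities and a simple similarity threshold as the noise filter. The function names, the `temperature` value, and the `threshold` value are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def log_softmax(row):
    """Numerically stable log-softmax of a list of scores."""
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]


def symmetric_info_nce(mol_embs, txt_embs, temperature=0.1):
    """Assumed loss: pair i in mol_embs matches pair i in txt_embs.

    Averages the molecule-to-text and text-to-molecule cross-entropy terms.
    """
    n = len(mol_embs)
    sims = [[cosine(m, t) / temperature for t in txt_embs] for m in mol_embs]
    mol_to_txt = -sum(log_softmax(sims[i])[i] for i in range(n)) / n
    cols = [[sims[i][j] for i in range(n)] for j in range(n)]
    txt_to_mol = -sum(log_softmax(cols[j])[j] for j in range(n)) / n
    return 0.5 * (mol_to_txt + txt_to_mol)


def filter_noisy_pairs(mol_embs, txt_embs, threshold=0.5):
    """Hypothetical refinement step: keep indices of pairs whose
    similarity under the current embeddings reaches the threshold."""
    return [
        i
        for i, (m, t) in enumerate(zip(mol_embs, txt_embs))
        if cosine(m, t) >= threshold
    ]
```

In this sketch, correctly aligned pairs yield a lower loss than mismatched ones, and pairs the current embeddings score as dissimilar are dropped before the next training round. How MolBridge actually scores and filters augmented substructure and phrase pairs is not specified in this summary.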

Paper Structure

This paper contains 42 sections, 5 equations, 8 figures, 23 tables.

Figures (8)

  • Figure 1: Illustration of our framework to learn fine-grained alignments.
  • Figure 2: Examples of chemical phrase retrieval. Identified substructures (dashed circles) and retrieved phrases from PubChem, MolBridge, and MolBridge w/o augmentation.
  • Figure 3: Analysis of the impact of Self-Refinement (SR). We evaluate both MolBridge and MolBridge trained without SR on the PCDes scaffold test set.
  • Figure 4: Examples of filtered augmented pairs.
  • Figure 5: Substructures retrieved by MolBridge for a chemical phrase, along with the phrase-relevant molecules found in PubChem.
  • ...and 3 more figures