Fused Gromov-Wasserstein Contrastive Learning for Effective Enzyme-Reaction Screening

Gengmo Zhou, Feng Yu, Wenda Wang, Zhifeng Gao, Guolin Ke, Zhewei Wei, Zhen Wang

TL;DR

FGW-CLIP introduces a fused Gromov-Wasserstein-based contrastive framework for enzyme screening that jointly aligns the reaction and enzyme spaces while preserving intra-domain structure. By coupling inter-domain matching with GW-based regularization and EC-level supervision, it achieves state-of-the-art performance on the EnzymeMap and ReactZyme benchmarks, including strong generalization to unseen enzymes and reactions. The method shows clear gains in early-enrichment metrics (BEDROC) and top-ranked hit retrieval, supported by ablations and visualizations. The approach offers a scalable, robust tool for enzyme discovery in complex biochemical settings, with potential implications for biocatalysis, drug discovery, and sustainable chemistry.

Abstract

Enzymes are crucial catalysts that enable a wide range of biochemical reactions. Efficiently identifying specific enzymes from vast protein libraries is essential for advancing biocatalysis. Traditional computational methods for enzyme screening and retrieval are time-consuming and resource-intensive. Recently, deep learning approaches have shown promise. However, these methods focus solely on the interaction between enzymes and reactions, overlooking the inherent hierarchical relationships within each domain. To address these limitations, we introduce FGW-CLIP, a novel contrastive learning framework based on optimizing the fused Gromov-Wasserstein distance. FGW-CLIP incorporates multiple alignments, including inter-domain alignment between reactions and enzymes and intra-domain alignment within enzymes and within reactions. By introducing a tailored regularization term, our method minimizes the Gromov-Wasserstein distance between enzyme and reaction spaces, which enhances information integration across these domains. Extensive evaluations demonstrate the superiority of FGW-CLIP in challenging enzyme-reaction tasks. On the widely used EnzymeMap benchmark, FGW-CLIP achieves state-of-the-art performance in enzyme virtual screening, as measured by BEDROC and EF metrics. Moreover, FGW-CLIP consistently outperforms prior methods across all three splits of ReactZyme, the largest enzyme-reaction benchmark, demonstrating robust generalization to novel enzymes and reactions. These results position FGW-CLIP as a promising framework for enzyme discovery in complex biochemical settings, with strong adaptability across diverse screening scenarios.
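
The EF (enrichment factor) metric named above is commonly defined as the hit rate among the top-ranked fraction of a screened library divided by the hit rate over the whole library. The following is a minimal sketch under that common definition; the function name, the `fraction` parameter, and the toy data in the usage note are illustrative assumptions and are not taken from the paper, whose exact evaluation protocol may differ.

```python
def enrichment_factor(scores, labels, fraction=0.01):
    """Enrichment factor at a given top fraction of a ranked library.

    scores: one score per candidate enzyme; higher means a better match.
    labels: 1 if the candidate is a known true enzyme for the query, else 0.
    fraction: share of the ranked library that is inspected (assumed name).
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    n_total = len(scores)
    n_actives = sum(labels)
    if n_total == 0 or n_actives == 0:
        raise ValueError("need at least one candidate and one active")
    # Number of candidates inspected; always look at at least one.
    n_top = max(1, int(round(fraction * n_total)))
    # Rank candidates by descending score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits_in_top = sum(label for _, label in ranked[:n_top])
    # Hit rate in the top slice relative to the hit rate in the full library.
    return (hits_in_top / n_top) / (n_actives / n_total)
```

For example, with ten candidates of which two are true enzymes, placing one of them at rank 1 and inspecting the top 10% yields a hit rate of 1.0 against a base rate of 0.2. A value of 1.0 corresponds to random ranking, so larger values indicate stronger early enrichment.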

Paper Structure

This paper contains 51 sections, 2 theorems, 29 equations, 2 figures, 7 tables.

Key Result

Proposition 1

Given encoder $f_{\psi_{1}}$ for data field $X_{1}$ and encoder $f_{\psi_{2}}$ for data field $X_{2}$, $x_{\psi_{1}}$ represents the $\ell_2$-normalized embeddings of $X_{1}$ from $f_{\psi_{1}}$, while $x_{\psi_{2}}$ represents the $\ell_2$-normalized embeddings of $X_{2}$ from $f_{\psi_{2}}$. $\Gamma^{f_{1}}$ where $\mathrm{KL}(X\|Y) = \sum_{ij}\left(x_{ij}\log\frac{x_{ij}}{y_{ij}} - x_{ij} + y_{ij}\right)$ represents the

Figures (2)

  • Figure 1: Overview of the FGW-CLIP framework. Reactants and products, along with their structures, are input into the Molecule Encoder, while enzyme sequences are input into the Enzyme Encoder. Representations of enzymes and reactions are aligned using $L_{\text{reaction-enzyme}}$. Internal alignments within reactions and within enzymes are achieved via $L_{\text{reaction}}$ and $L_{\text{enzyme}}$, respectively, using EC numbers. Motivated by the goal of minimizing the Gromov-Wasserstein (GW) distance, we introduce a regularization term $L_{\text{GW}}$. During training, the Enzyme Encoder is frozen, and a projection head is appended to its output. For clarity, only one classification head is shown in the figure.
  • Figure 2: t-SNE visualization of enzyme representations. Left: pretrained checkpoint. Right: FGW-CLIP. Colors represent top-level EC numbers (EC 1 to EC 6).
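
To make the roles of the inter-domain term and the GW-style regularizer in Figure 1 more concrete, here is a hedged NumPy sketch. It is not the paper's implementation: the exact forms of $L_{\text{reaction-enzyme}}$ and $L_{\text{GW}}$ are not given in this summary. The sketch assumes a standard CLIP-style symmetric InfoNCE loss for the inter-domain alignment and, as a stand-in for the regularizer, the squared-loss GW objective evaluated at a fixed diagonal coupling (row $i$ of each batch is a matched reaction-enzyme pair). The function names and the `temperature` value are illustrative assumptions.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit Euclidean norm."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)

def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def clip_loss(reaction_emb, enzyme_emb, temperature=0.07):
    """Symmetric InfoNCE loss; row i of each input is a matched pair (assumption)."""
    r = l2_normalize(reaction_emb)
    e = l2_normalize(enzyme_emb)
    logits = r @ e.T / temperature           # pairwise cosine similarities
    idx = np.arange(len(r))
    loss_r2e = -log_softmax(logits, axis=1)[idx, idx].mean()  # reaction -> enzyme
    loss_e2r = -log_softmax(logits, axis=0)[idx, idx].mean()  # enzyme -> reaction
    return 0.5 * (loss_r2e + loss_e2r)

def gw_identity_coupling(reaction_emb, enzyme_emb):
    """Squared-loss GW objective at the coupling T = diag(1/n).

    With T fixed to the diagonal, the GW sum over (i, j, k, l) collapses to the
    mean squared difference between the two intra-domain similarity matrices.
    This is a simplification, not the regularizer defined in the paper.
    """
    r = l2_normalize(reaction_emb)
    e = l2_normalize(enzyme_emb)
    c_r = r @ r.T                            # intra-reaction cosine similarities
    c_e = e @ e.T                            # intra-enzyme cosine similarities
    return np.mean((c_r - c_e) ** 2)
```

Because the second function compares only within-domain similarity structure, it is zero whenever one batch is a rotated or rescaled copy of the other, even if the two embedding spaces are not directly aligned. That is the sense in which a GW-type term complements the inter-domain contrastive loss.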

Theorems & Definitions (2)

  • Proposition 1
  • Lemma 2