Learning Molecular Chirality via Chiral Determinant Kernels

Runhan Shi, Zhicheng Zhang, Letian Chen, Gufeng Yu, Yang Yang

TL;DR

This work tackles the challenge of learning molecular chirality by introducing ChiDeK, a unified framework that explicitly encodes central and axial stereochemistry through chiral determinant kernels and a cross-attentive chiral transformer. The approach embeds the SE(3)-invariant chirality matrix via a differentiable kernel and propagates stereochemical signals to non-chiral atoms, enabling robust handling of diverse chiral forms. A new axial-chirality benchmark (ACMP) for ECD and OR prediction is introduced, and ChiDeK shows substantial improvements over state-of-the-art baselines, especially for axial chirality (average gains >7%). The paper also provides theoretical analysis of the chiral encoder's invariance properties and includes extensive ablations and robustness analyses, supporting the practical utility of explicit stereochemical encoding in molecular ML.

Abstract

Chirality is a fundamental molecular property that governs stereospecific behavior in chemistry and biology. Capturing chirality in machine learning models remains challenging due to the geometric complexity of stereochemical relationships and the limitations of traditional molecular representations that often lack explicit stereochemical encoding. Existing approaches to chiral molecular representation primarily focus on central chirality, relying on handcrafted stereochemical tags or limited 3D encodings, and thus fail to generalize to more complex forms such as axial chirality. In this work, we introduce ChiDeK (Chiral Determinant Kernels), a framework that systematically integrates stereogenic information into molecular representation learning. We propose the chiral determinant kernel to encode the SE(3)-invariant chirality matrix and employ cross-attention to integrate stereochemical information from local chiral centers into the global molecular representation. This design enables explicit modeling of chiral-related features within a unified architecture, capable of jointly encoding central and axial chirality. To support the evaluation of axial chirality, we construct a new benchmark for electronic circular dichroism (ECD) and optical rotation (OR) prediction. Across four tasks, including R/S configuration classification, enantiomer ranking, ECD spectrum prediction, and OR prediction, ChiDeK achieves substantial improvements over state-of-the-art baselines, most notably yielding over 7% higher accuracy on axially chiral tasks on average.
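
The cross-attention step described above can be illustrated with a minimal NumPy sketch. It assumes a single attention head in which every atom queries the embeddings of the chiral centers, followed by a residual update. The function name `cross_attention`, the feature sizes, and the random weights are hypothetical and are not taken from the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(atom_feats, chiral_feats, w_q, w_k, w_v):
    """Single-head cross-attention (illustrative assumption, not the paper's layer).

    Queries come from all atoms; keys and values come from chiral-center
    embeddings, so stereochemical signal can reach non-chiral atoms.
    """
    q = atom_feats @ w_q                       # (n_atoms, d)
    k = chiral_feats @ w_k                     # (n_chiral, d)
    v = chiral_feats @ w_v                     # (n_chiral, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])    # scaled dot-product scores
    weights = softmax(scores, axis=-1)         # each atom's weights over chiral centers
    return atom_feats + weights @ v, weights   # residual update of atom features

rng = np.random.default_rng(0)
d = 8                                          # hypothetical feature width
atom_feats = rng.normal(size=(6, d))           # 6 atoms in a toy molecule
chiral_feats = rng.normal(size=(2, d))         # 2 chiral-center embeddings
w_q, w_k, w_v = (rng.normal(size=(d, d)) for _ in range(3))

updated, weights = cross_attention(atom_feats, chiral_feats, w_q, w_k, w_v)
```

Each row of `weights` is a distribution over the chiral centers, so `updated` mixes every atom's features with a weighted combination of chiral embeddings.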

Paper Structure

This paper contains 38 sections, 3 theorems, 34 equations, 10 figures, and 10 tables.

Key Result

Proposition 3.1

Given a chiral atom $i$, the chirality product $P_\mathrm{C}(i)$ is invariant under rigid-body translation and rotation, and changes sign under reflection: applying $R_1$ to the atomic coordinates leaves $P_\mathrm{C}(i)$ unchanged, while applying $R_2$ maps it to $-P_\mathrm{C}(i)$. Here $R_1 \in \mathrm{SE}(3)$ denotes any rigid-body motion with rotation $r \in \mathrm{SO}(3)$ and translation $t \in \mathbb{R}^3$, and $R_2 \in O^{-}(3)$ denotes any reflection (orthogonal transformation with determinant $-1$).
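
The invariance pattern in this proposition can be checked numerically. The sketch below assumes a stand-in for the chirality product: the determinant (signed volume) of three neighbor position vectors taken relative to the chiral atom. The paper's exact definition of $P_\mathrm{C}(i)$ is not reproduced here, so `chirality_product` and the toy coordinates are hypothetical.

```python
import numpy as np

def chirality_product(center, neighbors):
    """Signed volume of three neighbor vectors around a center atom.

    Assumption: this determinant stands in for the paper's P_C(i);
    the exact definition is not given in this summary.
    """
    # Relative vectors remove any dependence on global translation.
    v = np.asarray(neighbors, dtype=float) - np.asarray(center, dtype=float)
    return float(np.linalg.det(v))

def random_rotation(rng):
    """Build a proper rotation (determinant +1) from a QR decomposition."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] *= -1.0  # flip one column to land in SO(3)
    return q

rng = np.random.default_rng(0)
center = np.array([0.1, -0.2, 0.3])             # toy chiral atom position
neighbors = np.array([[1.0, 0.0, 0.0],
                      [0.0, 1.2, 0.0],
                      [0.0, 0.0, 0.8]])         # three toy neighbor positions

rot = random_rotation(rng)                      # rotation part of R_1
shift = np.array([2.0, -1.0, 0.5])              # translation part of R_1
mirror = np.diag([1.0, 1.0, -1.0])              # R_2: reflection through the xy-plane

p_ref = chirality_product(center, neighbors)
# Coordinates are stored as rows, so x -> R x + t becomes X @ R.T + t.
p_rigid = chirality_product(center @ rot.T + shift, neighbors @ rot.T + shift)
p_mirror = chirality_product(center @ mirror.T, neighbors @ mirror.T)

print(np.isclose(p_rigid, p_ref), np.isclose(p_mirror, -p_ref))
```

The rigid motion multiplies the determinant by $\det(r) = +1$, and the reflection multiplies it by $-1$, which is the sign behavior the proposition states.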

Figures (10)

  • Figure 1: Examples of central chirality (a) and axial chirality (b). The configuration of the chiral atoms (in green) is determined by the spatial arrangement of chiral-related atoms (in orange).
  • Figure 2: Overview of the ChiDeK architecture (a). It consists of a chiral encoder (b), a chiral transformer incorporating $L$ cross-attention layers (c), and a predictor for chiral properties.
  • Figure 3: An axial ECD prediction example of a pair of enantiomers. Each row shows the predictions for one configuration across models, with the two rows corresponding to opposite configurations.
  • Figure 4: Visualization of how ChiDeK representations change under rotations along the chiral axis for two molecules. Each point in the trajectory plot, obtained by UMAP, represents the learned embedding at a given rotation angle, while each point in the polar plot denotes the cosine similarity between that embedding and the reference embedding at 0 degrees.
  • Figure 5: An example ECD curve for an axially chiral molecule.
  • ...and 5 more figures

Theorems & Definitions (5)

  • Proposition 3.1
  • Lemma 3.1
  • Lemma 4.1
  • Proof of Lemma \ref{lemma:enantiomers}
  • Proof of Lemma \ref{lemma:generalized_chirality}