Deconstructing equivariant representations in molecular systems

Kin Long Kelvin Lee, Mikhail Galkin, Santiago Miret

TL;DR

This work analyzes how equivariant representations in molecular graph models encode information for scalar property prediction on QM9. Using a simple GNN with spherical-harmonic embeddings up to order $L$ and PHATE-based latent-space analyses, the authors find that several irreducible representations (notably the vector $l=1$ and tensor $l=2$ orders) are often unused and can degrade performance when included. Pruning these orders (e.g., using $L=[0,3,4,5,6]$) improves performance and yields clearer latent structure, suggesting that $L$ should be treated as a tunable hyperparameter rather than a convergence requirement. The study proposes regularization, targeted pruning, and equivariant pretraining as practical directions for improving the efficiency and utilization of equivariant features in tensor-product-based models, and provides a methodological framework for diagnosing latent representations in such systems.

Abstract

Recent equivariant models have shown significant progress not just in chemical property prediction, but also as surrogates for dynamical simulations of molecules and materials. Many of the top-performing models in this category are built within the framework of tensor products, which preserves equivariance by restricting interactions and transformations to those allowed by symmetry selection rules. Despite being a core part of the modeling process, little attention has been paid to understanding what information persists in these equivariant representations, or to their general behavior outside of benchmark metrics. In this work, we report on a set of experiments using a simple equivariant graph convolution model on the QM9 dataset, focusing on correlating quantitative performance with the resulting molecular graph embeddings. Our key finding is that, for a scalar prediction task, many of the irreducible representations are simply ignored during training, specifically those pertaining to vector ($l=1$) and tensor ($l=2$) quantities, an issue that does not necessarily make itself evident in the test metric. We empirically show that removing some unused orders of spherical harmonics improves model performance, correlating with improved latent space structure. Based on these observations, we provide a number of recommendations for future experiments to improve the efficiency and utilization of equivariant features.

Paper Structure

This paper contains 16 sections, 1 equation, 7 figures, 2 tables.

Figures (7)

  • Figure 1: PHATE embedding projections for two configurations with the same hidden dimension ($h=16$), trained for the same number of epochs: the conventional configuration uses a contiguous basis, while the unconventional configuration skips $l=1,2$ in favor of adding $l=3,4,5,6$. The leftmost panel shows the PHATE projection when considering unified embeddings; from left to right, we decompose the unified embeddings into feature spaces that correspond to specific irreducible representations.
  • Figure 2: Tensor product paths for configurations of $L$ considered in this work. Input feature representations are shown in the top left of each diagram, with spherical harmonics on the right and outputs on the bottom. Here, the input features are assumed to be the output of the first interaction layer, i.e. we have already transformed the scalar atomic features.
  • Figure 3: Visualization of the reduction in the number of arithmetic operations owing to aggressive symbolic refactoring. Each scatter point represents the number of arithmetic operations to compute a particular spherical harmonic $Y_{lm}$ for projection $m$ and order $l$. Red points correspond to a naive recurrent computation (i.e. higher $l$ depends on prior terms of $l$); blue points correspond to expressions derived and implemented in this work.
  • Figure 4: PHATE embedding projections when considering the "canonical" equivariant set of spherical harmonics ($l=0,1,2$). With the exception of the first row, we include one additional set of higher-order spherical harmonics of even parity, increasing in orbital angular momentum from top to bottom.
  • Figure 5: PHATE projections for a contiguous set of $L=0,1,2,3,4$, with a hidden dimension of 16. This is directly comparable and consistent with the conventional configuration in Figure 1: $l=1,2,3$ do not contain structure in their embeddings, but $l=4$ appears to contain information.
  • ...and 2 more figures
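
The tensor product paths described in the Figure 2 caption follow from the angular momentum selection rule: an input of order $l_{in}$ coupled with a spherical harmonic of order $l_{sh}$ can only produce outputs with $|l_{in} - l_{sh}| \le l_{out} \le l_{in} + l_{sh}$. The sketch below enumerates the paths this rule permits for a given set of orders. It is an illustration only, not the authors' implementation: the function name `allowed_paths` is hypothetical, parity constraints are ignored, and it assumes (as in the caption) that inputs, spherical harmonics, and outputs all draw from the same set of orders.

```python
from itertools import product


def allowed_paths(orders):
    """Return (l_in, l_sh, l_out) triples permitted by the triangle rule.

    Assumption: inputs, spherical harmonics, and outputs share one set of
    orders, and parity selection rules are not applied.
    """
    orders = sorted(set(orders))
    return [
        (l_in, l_sh, l_out)
        for l_in, l_sh, l_out in product(orders, repeat=3)
        if abs(l_in - l_sh) <= l_out <= l_in + l_sh
    ]


# Contiguous basis versus the pruned basis that skips l=1,2.
contiguous = allowed_paths([0, 1, 2])
pruned = allowed_paths([0, 3, 4, 5, 6])

# Paths that feed the scalar (l_out = 0) channel used for scalar prediction.
scalar_paths = [p for p in pruned if p[2] == 0]
print(len(contiguous), len(pruned), scalar_paths)
```

Under this rule, a scalar output can only arise from coupling two features of equal order, which gives one way to reason about which orders can contribute to a scalar prediction at all.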