Beyond Atoms: Evaluating Electron Density Representation for 3D Molecular Learning

Patricia Suriana, Joshua A. Rackers, Ewa M. Nowara, Pedro O. Pinheiro, John M. Nicoloudis, Vishnu Sresht

TL;DR

This study benchmarks voxel-based representations for 3D molecular learning, comparing atom-type encodings against raw electron density and its gradient magnitude, plus a Shape-Only baseline. Across two tasks, PDBbind binding affinity and QM9 quantum properties, density-based inputs improve data efficiency in low-data regimes for binding and give better accuracy at scale for quantum properties, though the results depend on the task and on data quality. The optimal representation is therefore task- and regime-dependent, with density-based inputs capturing electronic-structure information that atom-centric schemes may miss. The study also discusses practical considerations, such as experimental versus approximated densities and the computational cost of voxel representations, and points toward hybrid or adaptive approaches for broader applicability.

Abstract

Machine learning models for 3D molecular property prediction typically rely on atom-based representations, which may overlook subtle physical information. Electron density maps -- the direct output of X-ray crystallography and cryo-electron microscopy -- offer a continuous, physically grounded alternative. We compare three voxel-based input types for 3D convolutional neural networks (CNNs): atom types, raw electron density, and density gradient magnitude, across two molecular tasks -- protein-ligand binding affinity prediction (PDBbind) and quantum property prediction (QM9). We focus on voxel-based CNNs because electron density is inherently volumetric, and voxel grids provide the most natural representation for both experimental and computed densities. On PDBbind, all representations perform similarly with full data, but in low-data regimes, density-based inputs outperform atom types, while a shape-based baseline performs comparably -- suggesting that spatial occupancy dominates this task. On QM9, where labels are derived from Density Functional Theory (DFT) but input densities from a lower-level method (XTB), density-based inputs still outperform atom-based ones at scale, reflecting the rich structural and electronic information encoded in density. Overall, these results highlight the task- and regime-dependent strengths of density-derived inputs, improving data efficiency in affinity prediction and accuracy in quantum property modeling.
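As a rough illustration of how a gradient-magnitude channel can be derived from a density voxel grid, the sketch below applies central finite differences with NumPy. It is a minimal example under stated assumptions, not the authors' pipeline: the function names, the grid size, the voxel spacing, and the toy Gaussian "density" are all hypothetical stand-ins for a real computed or experimental density.

```python
import numpy as np


def toy_gaussian_density(centers, grid_size=16, spacing=0.5, sigma=1.0):
    """Build a toy density grid as a sum of isotropic Gaussians.

    Assumption: this is only a placeholder for a real electron density;
    it is not how the paper computes its inputs.
    """
    # Voxel-center coordinates along one axis, centered on the origin.
    axis = (np.arange(grid_size) - (grid_size - 1) / 2.0) * spacing
    x, y, z = np.meshgrid(axis, axis, axis, indexing="ij")
    grid = np.zeros((grid_size, grid_size, grid_size))
    for cx, cy, cz in centers:
        r2 = (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2
        grid += np.exp(-r2 / (2.0 * sigma ** 2))
    return grid


def gradient_magnitude(density, spacing=1.0):
    """Return |grad(density)| per voxel using finite differences."""
    # np.gradient returns one array per axis; a scalar spacing applies to all axes.
    gx, gy, gz = np.gradient(density, spacing)
    return np.sqrt(gx ** 2 + gy ** 2 + gz ** 2)


if __name__ == "__main__":
    density = toy_gaussian_density([(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)])
    grad_mag = gradient_magnitude(density, spacing=0.5)
    print(density.shape, grad_mag.shape)
```

Both arrays have the same shape, so either one can be fed to a 3D CNN as a single-channel volume, in contrast to an atom-type encoding that would typically use one channel per element.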

Paper Structure

This paper contains 31 sections, 1 equation, 4 figures.

Figures (4)

  • Figure 1: Test Spearman correlation for binding affinity prediction on the PDBbind test set across training set sizes (1%–100%) and model capacities ($\sim$0.4M, $\sim$4M, $\sim$58M parameters). We compare four voxel-based input representations: atom types (Atom-Type), atoms mapped to carbon (Shape-Only), electron density values (Density), and electron density gradient magnitude (GradMag). Performance generally improves with training data size but plateaus beyond 10%. With the full training set, all representations perform similarly across model sizes. In the low-data regime (1%), density-based inputs outperform Atom-Type, and the Shape-Only baseline—despite discarding chemical identity—performs comparably to density-based inputs. This counterintuitive result suggests that simple spatial occupancy alone may be highly predictive in this dataset, potentially due to biases in the benchmark or the use of static, bound structures. Insets show performance at 100% training data, with stars marking best-performing models within standard deviation.
  • Figure 2: Test MAE across training set sizes for QM9 target properties. Each row corresponds to a different model size (Tiny $\sim$4M, Small $\sim$15M, Default $\sim$58M), and each column to a regression target (Dipole Moment, Polarizability, HOMO Energy, LUMO Energy). Lower MAE indicates better performance. Across all settings, increasing training data consistently reduces error. Density-based inputs (Density, GradMag) outperform atom-based ones at full data scale, while the poor performance of Shape-Only (orange) highlights the value of chemically informative features. Red dashed lines mark reported SchNet results (Schütt et al., 2017), shown only as a sanity check to confirm that our 3D CNN benchmarks yield reasonable error ranges. Our models were not tuned for state-of-the-art performance; the goal is to compare voxel-based representations under consistent conditions. Because SchNet is a graph neural network operating on atom-level graphs, it cannot directly represent volumetric density data without substantial reformulation, and its results are therefore not directly comparable to ours.
  • Figure 3: Test MAE at 100% training set size for all input types, targets, and model sizes. Each row shows results for a different model size, and each column corresponds to one of the QM9 target properties. Bars show mean test MAE with standard deviation across three seeds. Density-based inputs (Density, GradMag) consistently yield the lowest errors across all targets and model sizes. Shape-Only inputs perform worst overall, highlighting the value of chemically meaningful information in voxel inputs.
  • Figure 4: Distribution of Dipole Moment Errors: XTB vs DFT. Histogram of dipole moment errors (XTB $-$ DFT) across the QM9 dataset. The red solid line shows a Gaussian fit to the full error distribution ($\mu = 1.63$, $\sigma = 3.60$ Debye), while the dashed red line indicates zero error. The inset shows the distribution with outliers removed (5th–95th percentile), highlighting the skew and overprediction tendency of XTB. XTB overpredicts dipole moments in 88.8% of cases.
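The summary statistics quoted for Figure 4 (a Gaussian fit, a 5th–95th percentile trim, and an overprediction rate) can be computed along the following lines. This is a hedged sketch rather than the authors' analysis code: the function name and the input arrays are hypothetical, and it assumes the Gaussian fit is the maximum-likelihood one, for which $\mu$ is the sample mean and $\sigma$ the population standard deviation of the signed errors.

```python
import numpy as np


def signed_error_summary(predicted, reference, lo=5.0, hi=95.0):
    """Summarize signed errors (predicted - reference).

    Assumption: 'predicted' would hold lower-level (e.g. XTB) values and
    'reference' the DFT labels; both are hypothetical inputs here.
    """
    err = np.asarray(predicted, dtype=float) - np.asarray(reference, dtype=float)
    # Maximum-likelihood Gaussian fit: mean and population std (ddof=0).
    mu = err.mean()
    sigma = err.std()
    # Fraction of cases where the prediction exceeds the reference.
    frac_over = float(np.mean(err > 0))
    # Keep only errors between the chosen percentiles (outlier trim).
    p_lo, p_hi = np.percentile(err, [lo, hi])
    trimmed = err[(err >= p_lo) & (err <= p_hi)]
    return {"mu": float(mu), "sigma": float(sigma),
            "frac_over": frac_over, "trimmed": trimmed}


if __name__ == "__main__":
    # Tiny made-up numbers, purely to show the call; not data from the paper.
    summary = signed_error_summary([1.0, 2.0, 3.0, 4.0], [0.0, 3.0, 1.0, 4.0])
    print(summary["mu"], summary["sigma"], summary["frac_over"])
```

A positive `mu` together with a `frac_over` well above 0.5 would indicate the kind of systematic overprediction the caption describes.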