Improving Molecular Modeling with Geometric GNNs: an Empirical Study
Ali Ramlaoui, Théo Saulus, Basile Terver, Victor Schmidt, David Rolnick, Fragkiskos D. Malliaros, Alexandre Duval
TL;DR
The paper conducts an empirical study of Geometric GNNs for 3D atomic systems, probing how canonicalization, graph creation, and auxiliary tasks affect predictive performance and symmetry handling on OC20 and QM9 tasks. It finds that approximate canonicalization methods like SFA can rival or exceed exact $E(3)$-equivariant approaches in practice, and that performance is robust to the choice of graph construction cutoff across a range of values, with graph rewiring offering substantial scalability gains. Noisy Nodes emerges as a powerful auxiliary task that enables much deeper networks to outperform shallower baselines. Pre-training on larger, related tasks shows transferable benefits, although final convergence closely tracks direct training. Overall, the study provides practical guidance for selecting modeling components and highlights promising directions in transfer learning and multi-task architectures for molecular property prediction.
Abstract
Rapid advancements in machine learning (ML) are transforming materials science by significantly speeding up material property calculations. However, the proliferation of ML approaches has made it challenging for scientists to keep up with the most promising techniques. This paper presents an empirical study on Geometric Graph Neural Networks for 3D atomic systems, focusing on the impact of different (1) canonicalization methods, (2) graph creation strategies, and (3) auxiliary tasks, on performance, scalability and symmetry enforcement. Our findings and insights aim to guide researchers in selecting optimal modeling components for molecular modeling tasks.
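To make the "graph creation" component concrete, the following is a minimal sketch of cutoff-based graph construction, a common way to turn 3D atomic positions into a graph. It is an illustration under stated assumptions rather than the procedure used in the study: the function name `radius_graph`, the toy coordinates, and the cutoff values are hypothetical, and periodic boundary conditions (relevant for OC20-style systems) are ignored.

```python
import math

def radius_graph(positions, cutoff):
    """Return directed edges (i, j) for every pair of distinct atoms
    whose Euclidean distance is strictly below `cutoff`.

    Hypothetical helper for illustration; no periodic images are considered.
    """
    edges = []
    for i, p in enumerate(positions):
        for j, q in enumerate(positions):
            if i != j and math.dist(p, q) < cutoff:
                edges.append((i, j))
    return edges

# Toy system of three atoms (coordinates are made up, arbitrary units).
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 3.0)]

small = radius_graph(positions, cutoff=1.5)  # only the close pair is connected
large = radius_graph(positions, cutoff=3.5)  # every pair is connected
```

Because edges depend only on interatomic distances, the resulting connectivity is unchanged by rotations and translations of the input. The cutoff trades graph density, and therefore compute, against how much of each atom's neighborhood is visible to the model.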
