Convolutional Networks on Graphs for Learning Molecular Fingerprints
David Duvenaud, Dougal Maclaurin, Jorge Aguilera-Iparraguirre, Rafael Gómez-Bombarelli, Timothy Hirzel, Alán Aspuru-Guzik, Ryan P. Adams
TL;DR
The paper addresses learning properties of molecules when inputs are graphs of varying size and topology. It introduces neural graph fingerprints, a differentiable generalization of fixed circular fingerprints, enabling end-to-end gradient-based optimization over both local neighborhood updates and global pooling. The approach achieves competitive or superior predictive performance across solubility, drug efficacy, and photovoltaic efficiency tasks, while offering interpretable activations linked to chemical substructures. This work demonstrates a scalable pathway to data-driven molecular feature learning, with potential impact on QSAR, materials design, and virtual screening.
Abstract
We introduce a convolutional neural network that operates directly on graphs. These networks allow end-to-end learning of prediction pipelines whose inputs are graphs of arbitrary size and shape. The architecture we present generalizes standard molecular feature extraction methods based on circular fingerprints. We show that these data-driven features are more interpretable, and have better predictive performance on a variety of tasks.
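The abstract does not spell out the update rule, so the following is only a minimal sketch of how a differentiable circular fingerprint could be computed, not the authors' implementation. It assumes that each layer sums an atom's features with those of its neighbors, passes the result through a smooth nonlinearity (tanh here, as an assumption), and adds a softmax of the new atom state to a fixed-length fingerprint. The names `neural_fingerprint` and `random_matrix` are hypothetical, the weights are random rather than trained, and the sketch uses a single weight matrix per layer and ignores bond features.

```python
import math
import random


def matvec(v, W):
    """Multiply row vector v (length n) by matrix W (n x m)."""
    return [sum(v[i] * W[i][j] for i in range(len(v))) for j in range(len(W[0]))]


def softmax(x):
    """Numerically stable softmax over a list of floats."""
    m = max(x)
    exps = [math.exp(xi - m) for xi in x]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    """Hypothetical helper: untrained weights drawn uniformly from [-1, 1]."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def neural_fingerprint(atom_features, neighbors, hidden_weights, output_weights):
    """Sketch of a differentiable circular fingerprint.

    atom_features:  one feature vector per atom
    neighbors:      adjacency list, neighbors[a] = indices bonded to atom a
    hidden_weights: one matrix per layer (assumed shared across atoms)
    output_weights: one matrix per layer, mapping atom state to the fingerprint
    """
    fingerprint = [0.0] * len(output_weights[0][0])
    state = [list(f) for f in atom_features]
    for H, W in zip(hidden_weights, output_weights):
        new_state = []
        for atom, nbrs in enumerate(neighbors):
            # Pool the atom's own state with its neighbors' states by summation,
            # which keeps the result independent of neighbor ordering.
            pooled = list(state[atom])
            for n in nbrs:
                pooled = [p + s for p, s in zip(pooled, state[n])]
            # A smooth layer takes the place of the hash in a fixed fingerprint.
            hidden = [math.tanh(x) for x in matvec(pooled, H)]
            new_state.append(hidden)
            # A softmax takes the place of the hard index write.
            contribution = softmax(matvec(hidden, W))
            fingerprint = [f + c for f, c in zip(fingerprint, contribution)]
        state = new_state
    return fingerprint


# Toy input: the heavy atoms of ethanol as a chain C-C-O,
# with one-hot features [is_carbon, is_oxygen].
rng = random.Random(0)
features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
bonds = [[1], [0, 2], [1]]
H = [random_matrix(2, 4, rng), random_matrix(4, 4, rng)]  # radius 2
W = [random_matrix(4, 5, rng), random_matrix(4, 5, rng)]  # fingerprint length 5
fp = neural_fingerprint(features, bonds, H, W)
```

Because every step is a sum, a matrix product, or a smooth function, the fingerprint is differentiable with respect to `H` and `W`, which is what would allow the features to be trained end to end. Summation also makes the result independent of how the atoms are numbered, and since each softmax sums to one, the entries of `fp` in this sketch add up to the number of atoms times the number of layers.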
