Open-Source Molecular Processing Pipeline for Generating Molecules

V Shreyas, Jose Siguenza, Karan Bania, Bharath Ramsundar

TL;DR

This work lowers the barrier to adopting generative molecular models by delivering open-source, PyTorch-based implementations of MolGAN and Normalizing Flows integrated into the DeepChem library. It presents modular layers, a flexible Base Model, and turnkey MolGAN and Normalizing Flows pipelines that support both 1D (SMILES/SELFIES) and 2D graph representations, with losses and transformations tailored for graph generation and invertible density estimation. The authors report benchmarks competitive with prior MolGAN implementations on MoleculeNet datasets, describe their experimental setups in detail, and evaluate with several metrics: Validity, Uniqueness, Novelty, SAS (synthetic accessibility score), and Drug-likeness. The practical result is a more accessible, scalable, and extensible framework for researchers to build, compare, and extend generative molecular models, potentially accelerating discovery while acknowledging safety and scalability considerations.
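To make the first three metrics concrete, here is a minimal sketch of how Validity, Uniqueness, and Novelty are commonly computed over a batch of generated strings. It is not taken from the paper or from DeepChem: the function names and the `canonicalize` callback are hypothetical, and the convention used (uniqueness over valid samples, novelty over unique valid samples) is an assumption, since definitions vary between papers.

```python
from typing import Callable, Iterable, Optional


def generation_metrics(
    generated: Iterable[str],
    training: Iterable[str],
    canonicalize: Callable[[str], Optional[str]],
) -> dict:
    """Validity, uniqueness and novelty for a batch of generated strings.

    `canonicalize` (hypothetical callback) must return a canonical form
    for a valid molecule and None for an invalid one.
    """
    generated = list(generated)
    # Canonical forms of the valid generated molecules (duplicates kept).
    valid = [c for c in map(canonicalize, generated) if c is not None]
    unique = set(valid)
    train = {c for c in map(canonicalize, training) if c is not None}

    n_total, n_valid = len(generated), len(valid)
    return {
        # Fraction of all generated strings that parse as molecules.
        "validity": n_valid / n_total if n_total else 0.0,
        # Fraction of valid molecules that are distinct.
        "uniqueness": len(unique) / n_valid if n_valid else 0.0,
        # Fraction of distinct valid molecules absent from the training set.
        "novelty": len(unique - train) / len(unique) if unique else 0.0,
    }


def toy_canonicalize(s: str) -> Optional[str]:
    """Toy stand-in for a chemistry toolkit: a string counts as 'valid'
    if it is non-empty and its parentheses are balanced."""
    s = s.strip()
    depth = 0
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return None
    return s if s and depth == 0 else None


if __name__ == "__main__":
    scores = generation_metrics(
        generated=["CCO", "CCO", "C(C", "CCN"],
        training=["CCO"],
        canonicalize=toy_canonicalize,
    )
    print(scores)
```

In practice the toy validity check would be replaced by a real parser; for SMILES, one option is RDKit's `Chem.MolFromSmiles` (which returns `None` on a parse failure) followed by `Chem.MolToSmiles` to obtain a canonical string.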

Abstract

Generative models for molecules have shown considerable promise in computational chemistry, but remain difficult for non-experts to use. For this reason, we add open-source infrastructure for easily building generative molecular models to the widely used DeepChem [Ramsundar et al., 2019] library, with the aim of creating a robust and reusable molecular generation pipeline. In particular, we add high-quality PyTorch [Paszke et al., 2019] implementations of Molecular Generative Adversarial Networks (MolGAN) [Cao and Kipf, 2022] and Normalizing Flows [Papamakarios et al., 2021]. Our implementations show strong performance comparable with past work [Kuznetsov and Polykovskiy, 2021, Cao and Kipf, 2022].
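As background on the Normalizing Flows mentioned above: a flow maps a simple base variable z to data x through an invertible function, and the density of x follows from the change-of-variables rule, log p_X(x) = log p_Z(f^{-1}(x)) + log |det J_{f^{-1}}(x)|. The sketch below illustrates that rule with a one-dimensional affine map. It is only an illustration in plain Python; the class `AffineFlow1D` is hypothetical and is not the DeepChem implementation described in the paper.

```python
import math


def standard_normal_log_prob(z: float) -> float:
    """Log-density of the standard normal base distribution."""
    return -0.5 * (z * z + math.log(2.0 * math.pi))


class AffineFlow1D:
    """Invertible map x = scale * z + shift (hypothetical example class)."""

    def __init__(self, scale: float, shift: float) -> None:
        if scale == 0.0:
            raise ValueError("scale must be non-zero for the map to be invertible")
        self.scale = scale
        self.shift = shift

    def forward(self, z: float) -> float:
        """Generative direction: base sample z -> data point x."""
        return self.scale * z + self.shift

    def inverse(self, x: float) -> float:
        """Density-estimation direction: data point x -> base sample z."""
        return (x - self.shift) / self.scale

    def log_prob(self, x: float) -> float:
        """Change of variables: base log-density plus log |dz/dx|."""
        z = self.inverse(x)
        log_abs_det_inverse = -math.log(abs(self.scale))
        return standard_normal_log_prob(z) + log_abs_det_inverse


if __name__ == "__main__":
    flow = AffineFlow1D(scale=2.0, shift=1.0)
    print(flow.log_prob(3.0))
```

With a standard normal base, this flow represents a normal distribution with mean `shift` and standard deviation `|scale|`, so its `log_prob` can be checked against the closed-form normal log-density. Practical flows stack many such invertible layers with learned parameters.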

Paper Structure

This paper contains 28 sections, 7 equations, 4 figures, and 3 tables.

Figures (4)

  • Figure 1: Model Architecture of MolGAN
  • Figure 2: Model Architecture of Normalizing Flow models
  • Figure 3: Molecule generation pipeline for MolGAN
  • Figure 4: Molecule generation pipeline for Normalizing Flows