Graph Convolutional Policy Network for Goal-Directed Molecular Graph Generation

Jiaxuan You, Bowen Liu, Rex Ying, Vijay Pande, Jure Leskovec

TL;DR

We propose the Graph Convolutional Policy Network (GCPN), a graph convolutional network-based model for goal-directed molecular graph generation, trained with reinforcement learning. The agent acts in an environment that enforces domain-specific rules (valency, steric strain, and reactive groups) and is optimized via policy gradient for domain-specific rewards and an adversarial loss. GCPN reports a 61% improvement over state-of-the-art baselines on chemical property optimization while producing molecules that resemble known structures, and a 184% improvement on constrained property optimization. The approach combines a graph-based representation, policy-gradient optimization, and rule-based validity checks to generate drug-like, synthetically accessible molecules that remain chemically valid.

Abstract

Generating novel graph structures that optimize given objectives while obeying given underlying rules is fundamental for chemistry, biology, and social science research. This is especially important in molecular graph generation, whose goal is to discover novel molecules with desired properties, such as drug-likeness and synthetic accessibility, while obeying physical laws such as chemical valency. However, designing models that find molecules optimizing desired properties while incorporating highly complex and non-differentiable rules remains a challenging task. Here we propose the Graph Convolutional Policy Network (GCPN), a general graph convolutional network-based model for goal-directed graph generation through reinforcement learning. The model is trained to optimize domain-specific rewards and an adversarial loss through policy gradient, and acts in an environment that incorporates domain-specific rules. Experimental results show that GCPN achieves a 61% improvement on chemical property optimization over state-of-the-art baselines while resembling known molecules, and a 184% improvement on the constrained property optimization task.
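To make the idea of an environment that incorporates domain-specific rules more concrete, the following is a minimal sketch of a valency check applied to a bond-adding action. It is not the authors' implementation: the `MolGraph` class, the `step` function, the `MAX_VALENCE` table, and the penalty value are all hypothetical names and choices made for illustration, and it is assumed here that a rejected action leaves the graph unchanged.

```python
from dataclasses import dataclass, field

# Typical maximum valences for a few neutral atoms (illustrative subset).
MAX_VALENCE = {"C": 4, "N": 3, "O": 2, "F": 1}


@dataclass
class MolGraph:
    """Toy molecular graph: atom symbols plus bonds with integer orders."""
    atoms: list = field(default_factory=list)
    bonds: dict = field(default_factory=dict)  # (i, j) with i < j -> bond order

    def valence(self, i):
        # Sum of bond orders over all bonds touching atom i.
        return sum(order for (a, b), order in self.bonds.items() if i in (a, b))


def step(graph, i, j, order, invalid_penalty=-1.0):
    """Try to add a bond; reject it if either atom would exceed its valence.

    Returns (graph, reward, accepted). A rejected action leaves the graph
    unchanged and yields a negative reward (hypothetical value, an assumption).
    """
    key = (min(i, j), max(i, j))
    if i == j or key in graph.bonds:
        return graph, invalid_penalty, False
    for atom in (i, j):
        if graph.valence(atom) + order > MAX_VALENCE[graph.atoms[atom]]:
            return graph, invalid_penalty, False
    graph.bonds[key] = order
    return graph, 0.0, True


mol = MolGraph(atoms=["C", "O", "O"])
_, r1, ok1 = step(mol, 0, 1, 2)  # C=O, accepted
_, r2, ok2 = step(mol, 0, 2, 2)  # O=C=O (carbon dioxide), accepted
_, r3, ok3 = step(mol, 1, 2, 1)  # would give an oxygen valence 3, rejected
print(ok1, ok2, ok3)
```

Because the check lives in the environment rather than in the model, the rule never needs to be differentiable: the policy only observes which actions were accepted and what reward followed, which is the kind of signal a policy-gradient method can learn from.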

Paper Structure

This paper contains 1 section.

Table of Contents

  1. Appendix