Predicting Side Effect of Drug Molecules using Recurrent Neural Networks

Collin Beaudoin; Koustubh Phalak; Swaroop Ghosh

TL;DR

The study tackles the challenge of predicting drug side effects from molecular structure with a GRU-based recurrent neural network that reads SELFIES representations derived from SMILES strings and learns context across the resulting sequences. It reports a $98\%$–$99\%$ reduction in parameter count compared with large graph-based or language models while delivering near-state-of-the-art accuracy on MoleculeNet benchmarks (e.g., SIDER, BBBP, ClinTox). The authors report strong ROC-AUC performance with a lightweight model and provide detailed cross-dataset comparisons against GROVER, ChemRL-GEM, and Galactica, highlighting practical advantages in compute and data requirements. The work suggests that smaller, more accessible sequence models can broaden access to molecular property prediction, letting chemists screen candidate molecules rapidly before running experiments and potentially reducing drug development timelines and costs.
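To make the sequence input concrete, the sketch below shows one way a SELFIES string could be turned into integer indices for a recurrent network. It is a minimal illustration, not the authors' pipeline: the function names, the `[nop]` padding symbol, and the two example strings (ethanol and acetaldehyde written as SELFIES) are assumptions chosen for the example.

```python
import re

# Matches one bracketed SELFIES symbol such as "[C]" or "[=O]".
SYMBOL_PATTERN = re.compile(r"\[[^\[\]]+\]")


def split_symbols(selfies_string):
    """Split a SELFIES string into its bracketed symbols."""
    symbols = SYMBOL_PATTERN.findall(selfies_string)
    # Reject input that is not made up purely of bracketed symbols.
    if "".join(symbols) != selfies_string:
        raise ValueError(f"not a plain SELFIES string: {selfies_string!r}")
    return symbols


def build_vocabulary(selfies_strings, pad_symbol="[nop]"):
    """Map each distinct symbol to an integer; index 0 is kept for padding."""
    vocabulary = {pad_symbol: 0}
    for selfies_string in selfies_strings:
        for symbol in split_symbols(selfies_string):
            vocabulary.setdefault(symbol, len(vocabulary))
    return vocabulary


def encode(selfies_string, vocabulary, length, pad_symbol="[nop]"):
    """Convert a SELFIES string to a fixed-length list of symbol indices."""
    symbols = split_symbols(selfies_string)
    if len(symbols) > length:
        raise ValueError("sequence is longer than the requested length")
    indices = [vocabulary[symbol] for symbol in symbols]
    # Right-pad so every molecule yields a sequence of the same length.
    return indices + [vocabulary[pad_symbol]] * (length - len(indices))


# Illustrative inputs (assumed for this example, not taken from the paper).
molecules = ["[C][C][O]", "[C][C][=O]"]
vocabulary = build_vocabulary(molecules)
encoded = [encode(m, vocabulary, length=5) for m in molecules]
```

Each index sequence would then be fed to the network one symbol per time step, typically through an embedding or one-hot layer.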

Abstract

Identification and verification of molecular properties such as side effects is one of the most important and time-consuming steps in the process of molecule synthesis. For example, failure to identify side effects before submission to regulatory groups can cost companies millions of dollars and months of additional research. Failure to identify side effects during the regulatory review can also cost lives. The complexity and expense of this task have made it a candidate for a machine learning-based solution. Prior approaches rely on complex model designs and excessive parameter counts for side effect prediction. We believe that reliance on complex models only shifts the difficulty away from chemists rather than alleviating the issue. Implementing large models is also expensive without prior access to high-performance computers. We propose a heuristic approach that allows the use of simple neural networks, specifically the recurrent neural network, with a reduction of more than 98% in the number of required parameters compared to available large language models, while still obtaining results nearly identical to those of top-performing models.
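The parameter-reduction claim is simple arithmetic, which the following sketch illustrates. It counts the trainable parameters of a single-layer GRU and computes a percentage reduction against a larger model. The layer sizes and the 10-million-parameter comparison model are hypothetical placeholders, not figures from the paper, and the function names are introduced only for this example.

```python
def gru_parameter_count(input_size, hidden_size, biases_per_block=1):
    """Trainable parameters in one GRU layer.

    A GRU has three weight blocks (reset gate, update gate, candidate state).
    Each block holds an input matrix (hidden x input), a recurrent matrix
    (hidden x hidden) and one or more bias vectors of length hidden.
    Some frameworks keep two bias vectors per block; pass biases_per_block=2
    to count that variant.
    """
    per_block = (
        hidden_size * input_size
        + hidden_size * hidden_size
        + biases_per_block * hidden_size
    )
    return 3 * per_block


def percent_reduction(small_count, large_count):
    """Percentage by which small_count is smaller than large_count."""
    return 100.0 * (large_count - small_count) / large_count


# Hypothetical sizes, chosen only to show the calculation.
small_model = gru_parameter_count(input_size=64, hidden_size=128)
hypothetical_large_model = 10_000_000
reduction = percent_reduction(small_model, hypothetical_large_model)
```

A full model would also include embedding and output layers, so a real count is somewhat larger than the recurrent layer alone.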

Paper Structure (29 sections, 7 equations, 4 figures, 4 tables)

Introduction
Background & Related Works
MoleculeNet Benchmark
Side Effect Resource (SIDER)
BACE
Blood-brain barrier penetration (BBBP)
ClinTox
HIV
MUV
ROC-AUC
Simplified Molecular-Input Line Entry System (SMILES)
Self-Referencing Embedded Strings (SELFIES)
Recurrent Neural Networks (RNN)
Related Works
GROVER
...and 14 more sections

Figures (4)

Figure 1: Vanilla RNN architecture used for training; ($H_{t-1}, H_t$) represent the hidden state, ($O_{t}$) represents the output state, and ($X_t$) represents the input information. The $\sigma$ represents the activation function that operates on the combined input and hidden state.
Figure 2: GRU architecture used for training; ($h_{t-1}, h_t$) represent the hidden state, ($\tilde{h}_{t}$) represents the candidate hidden state, and ($r_t$) and ($Z_t$) represent the parameters that tune the importance of the previous hidden state versus the updated information. The $\sigma$ represents the activation function that operates on the combined input and hidden state.
Figure 3: Overview of the RNN process.
Figure 4: Results of three tasks: (a) loss curves, (b) training accuracy, and (c) testing accuracy for neoplasms benign, malignant and unspecified (incl. cysts and polyps) disorders; (d), (e), and (f) show the same for blood and lymphatic system disorders.
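The quantities named in the Figure 2 caption can be tied together with a short sketch of one GRU time step. This follows a common textbook formulation and is not necessarily the exact variant the authors trained: the weight names `W`, `U`, `b`, the `params` layout, and the choice of the convention $h_t = (1 - z_t) h_{t-1} + z_t \tilde{h}_t$ are assumptions (some references swap the roles of $z_t$ and $1 - z_t$).

```python
import math


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def gru_step(x_t, h_prev, params):
    """Compute h_t from input x_t and previous hidden state h_prev.

    params maps "r", "z" and "h" to a tuple (W, U, b): input weights,
    recurrent weights and bias for the reset gate, update gate and
    candidate state. This layout is an assumption made for illustration.
    """
    def block(name, hidden_input, activation):
        W, U, b = params[name]
        pre_activation = [
            wx + uh + bias
            for wx, uh, bias in zip(matvec(W, x_t), matvec(U, hidden_input), b)
        ]
        return [activation(p) for p in pre_activation]

    r_t = block("r", h_prev, sigmoid)  # reset gate
    z_t = block("z", h_prev, sigmoid)  # update gate
    # The reset gate scales how much of h_prev reaches the candidate state.
    reset_hidden = [r * h for r, h in zip(r_t, h_prev)]
    h_tilde = block("h", reset_hidden, math.tanh)  # candidate hidden state
    # The update gate blends the previous state with the candidate.
    return [
        (1.0 - z) * h_old + z * h_new
        for z, h_old, h_new in zip(z_t, h_prev, h_tilde)
    ]


def zero_params(input_size, hidden_size):
    """All-zero weights, handy for checking the step by hand."""
    def block():
        W = [[0.0] * input_size for _ in range(hidden_size)]
        U = [[0.0] * hidden_size for _ in range(hidden_size)]
        b = [0.0] * hidden_size
        return (W, U, b)
    return {"r": block(), "z": block(), "h": block()}


h_next = gru_step([0.3], [1.0, -2.0], zero_params(input_size=1, hidden_size=2))
```

With all-zero weights both gates evaluate to 0.5 and the candidate state is zero, so the step simply halves the previous hidden state, which gives an easy hand check of the blending rule.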
