GFlowNets for Learning Better Drug-Drug Interaction Representations
Azmine Toushik Wasi
TL;DR
Drug-drug interaction (DDI) prediction is hampered by severe class imbalance across interaction types, which biases models toward frequent interactions. The authors introduce a framework that combines Generative Flow Networks (GFlowNets) with a Variational Graph Autoencoder (VGAE) to generate balanced synthetic DDI samples, guided by a reward that favors rare interaction types and VGAE-estimated plausibility. The end-to-end pipeline pre-trains the VGAE, trains a GFlowNet with a Trajectory Balance loss, augments the data with synthetic samples, and re-trains the VGAE for final prediction. On DrugBank data, diversity and coverage of rare interaction types improve substantially while standard predictive metrics remain high, indicating more robust and clinically relevant DDI representations. The approach offers a scalable strategy for imbalanced biomedical graph problems and may generalize to other rare-event prediction tasks.
Abstract
Drug-drug interactions (DDIs) pose a significant challenge in clinical pharmacology, and severe class imbalance among interaction types limits the effectiveness of predictive models. Common interactions dominate datasets, while rare but critical interactions remain underrepresented, leading to poor model performance on infrequent cases. Existing methods often treat DDI prediction as a binary problem, ignoring class-specific nuances and exacerbating bias toward frequent interactions. To address this, we propose a framework combining Generative Flow Networks (GFlowNets) with Variational Graph Autoencoders (VGAEs) to generate synthetic samples for rare classes, improving model balance and producing effective, novel DDI pairs. Our approach enhances predictive performance across interaction types, supporting better clinical reliability.
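To make the two ingredients named in the TL;DR concrete, the following is a minimal sketch of a rarity-favoring reward and the Trajectory Balance objective for a single trajectory. It is illustrative only: the function names, the inverse-frequency form of the reward, the `alpha` exponent, and the toy counts are assumptions and are not taken from the paper, which only states that the reward favors rare types and VGAE plausibility. The Trajectory Balance residual follows the standard form, log Z plus the summed forward log-probabilities minus log R(x) minus the summed backward log-probabilities.

```python
import math
from collections import Counter


def rarity_reward(interaction_type, type_counts, plausibility, alpha=1.0, eps=1e-8):
    """Hypothetical reward: inverse class frequency (raised to alpha)
    multiplied by a plausibility score in (0, 1], e.g. a VGAE edge probability.
    The exact reward used by the authors is not specified here."""
    total = sum(type_counts.values())
    freq = type_counts[interaction_type] / total
    return (1.0 / freq) ** alpha * max(plausibility, eps)


def trajectory_balance_loss(log_z, log_pf_steps, log_pb_steps, reward):
    """Squared Trajectory Balance residual for one complete trajectory:
    (log Z + sum log P_F - log R(x) - sum log P_B)^2."""
    residual = log_z + sum(log_pf_steps) - math.log(reward) - sum(log_pb_steps)
    return residual ** 2


# Toy label distribution (made-up numbers): one frequent and one rare type.
type_counts = Counter({"common": 90, "rare": 10})

# With equal plausibility, the rare type receives the larger reward.
r_common = rarity_reward("common", type_counts, plausibility=0.8)
r_rare = rarity_reward("rare", type_counts, plausibility=0.8)

# A trajectory whose flow is perfectly balanced has (near) zero loss:
# Z * P_F(trajectory) = 8 * 0.25 = 2 = R(x) * P_B(trajectory).
balanced = trajectory_balance_loss(
    log_z=math.log(8.0),
    log_pf_steps=[math.log(0.5), math.log(0.5)],
    log_pb_steps=[0.0, 0.0],
    reward=2.0,
)

# The same trajectory with a mis-estimated log Z is penalized.
unbalanced = trajectory_balance_loss(
    log_z=0.0,
    log_pf_steps=[math.log(0.5), math.log(0.5)],
    log_pb_steps=[0.0, 0.0],
    reward=2.0,
)
```

In a real training loop, `log_z` and the per-step log-probabilities would be produced by learnable components and the loss minimized by gradient descent over sampled trajectories; the plain-Python version above only shows how the quantities fit together.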
