Toward Closed-loop Molecular Discovery via Language Model, Property Alignment and Strategic Search

Junkai Ji, Zhangfan Yang, Dong Xu, Ruibin Bai, Jianqiang Li, Tingjun Hou, Zexuan Zhu

TL;DR

By combining generalization, plausibility, and interpretability, Trio establishes a closed-loop generative paradigm that redefines how chemical space can be navigated, offering a transformative foundation for the next era of AI-driven drug discovery.

Abstract

Drug discovery is a time-consuming and expensive process, with traditional high-throughput and docking-based virtual screening hampered by low success rates and limited scalability. Recent advances in generative modeling, including autoregressive, diffusion, and flow-based approaches, have enabled de novo ligand design beyond the limits of enumerative screening. Yet these models often suffer from inadequate generalization, limited interpretability, and an overemphasis on binding affinity at the expense of key pharmacological properties, thereby restricting their translational utility. Here we present Trio, a molecular generation framework integrating fragment-based molecular language modeling, reinforcement learning, and Monte Carlo tree search, for effective and interpretable closed-loop targeted molecular design. Through these three key components, Trio enables context-aware fragment assembly, enforces physicochemical and synthetic feasibility, and guides a balanced search between the exploration of novel chemotypes and the exploitation of promising intermediates within protein binding pockets. Experimental results show that Trio reliably yields chemically valid and pharmacologically enhanced ligands, outperforming state-of-the-art approaches with improved binding affinity (+7.85%), drug-likeness (+11.10%) and synthetic accessibility (+12.05%), while expanding molecular diversity more than fourfold. By combining generalization, plausibility, and interpretability, Trio establishes a closed-loop generative paradigm that redefines how chemical space can be navigated, offering a transformative foundation for the next era of AI-driven drug discovery.

Paper Structure

This paper contains 18 sections, 2 equations, 5 figures, 1 table.

Figures (5)

  • Figure 1: Overview and motivation of the proposed Trio framework. a, Limits of prior paradigms. Sequence-based (SMILES) models miss 3D context and inter-fragment semantics; search-based GA/MCTS methods depend on fixed fragment libraries and hand-crafted linking rules, which makes searches complicated and slow; structure-based 2D/3D generators need scarce protein-ligand pairs and risk geometric distortion. b, Trio pipeline. Stage 1: Pre-training. FRAGPT, a fragment language model trained on FragSeqs, learns context-aware attachments to assemble valid molecules step-by-step. Stage 2: Preference alignment. DPO with QED/SA pairs biases the policy toward synthesizable, drug-like compounds. Stage 3: Pocket-conditioned planning. The DPO-aligned policy drives MCTS with UCB over Selection-Expansion-Simulation-Backpropagation, combining affinity rewards to rank routes.
  • Figure 2: FRAGPT for De Novo and Fragment-Constrained Molecular Generation: Representations, Models, Tasks, and Performance. a, Two fragment-based SMILES representations: SAFE and FragSeq, illustrating tokenization and ordering; b, Two language-model families for molecule generation: diffusion with random sampling and GPT with step-by-step masked prediction; c, Task taxonomy. Linker generation and scaffold morphing share the same conditional form but use different given fragments. Motif extension, scaffold decoration, and superstructure generation also share a common form, conditioned respectively on a motif, a scaffold, or a superstructure; d, De novo generation: four models compared on the core metrics; e, Task-wise performance of three models across LD (Linker Design), SM (Scaffold Morphing), ME (Motif Extension), SD (Scaffold Decoration), and SG (Superstructure Generation) on Validity, Uniqueness, Diversity, and Distance. Validity is the percentage of chemically valid molecules. Uniqueness is the proportion of unique molecules among the valid ones. Diversity measures internal structural dissimilarity within the generated set. Distance measures structural distance from a reference molecule; values approaching 1 indicate greater dissimilarity.
  • Figure 3: Comparative characterization of generated chemical spaces across baseline data and generative models. a, Two-dimensional t-SNE of MACCS fingerprints of 10,000 generated molecules per set, showing pairwise overlaps between DATASET, FRAGPT, SAFEGPT and FRAGPT-DPO; b, Box plots of drug-likeness (QED) and synthetic accessibility (SA) for the same sets; c, Hexbin density maps of the QED-SA landscape; d, Statistical analysis of generated molecular substructures. A comparison of atom, bond, and ring distributions between the reference dataset and molecules from three generative models. (Top) Relative frequency plots show the proportion of each substructure category within each data source. (Bottom) Normalized count plots compare the prevalence of each substructure across the different sources, with values for each category scaled by the maximum observed count.
  • Figure 4: Performance and Diversity Analysis on Five Therapeutic Targets. This figure evaluates the effectiveness and diversity of molecules generated by our proposed models, Trio* and Trio, against several baseline methods. a, Box plots comparing the distributions of Vina Docking Score (top), Quantitative Estimate of Drug-likeness (QED, middle), and Synthetic Accessibility (SA, bottom) for molecules generated by GEAM, Trio*, and Trio; b, Hyperparameter sensitivity analysis for Trio* and Trio. The plots show the average Vina Docking Score from 20 independent runs as a function of varying Search Steps (top) and Search Width (bottom); c, Molecular diversity analysis using the #Circles metric. Diversity is quantified by calculating the maximum number of molecules that can be selected from a generated set of 3,000, such that every pair of selected molecules exceeds a minimum distance threshold. A higher #Circles value signifies greater diversity and exploration of the chemical space.
  • Figure 5: Illustration of the Trio framework's stepwise generative mechanism and the intermolecular interactions between generated ligands and target protein binding pockets. a, Schematic illustration of the Monte Carlo Tree Search for target-based de novo generation. Starting from the [BOS] root token, molecules are constructed via iterative fragment addition (Layers 1–5) and prioritized by AutoDock Vina scores to identify the optimal candidate (crown icon); b, Predicted binding modes of generated leads against target proteins. Detailed views of the binding pockets for 5ht1b, braf, fa7, jak2, and parp1 highlight key non-covalent interactions. Contacts are color-coded: hydrophobic (warmpink dashed), hydrogen bonds (forestgreen solid), and $\pi$–$\pi$ stacking (teal dashed).
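
The Figure 1 caption says Stage 2 uses DPO with QED/SA pairs to bias the policy toward drug-like, synthesizable compounds. As a rough illustration of what such an objective looks like, here is a minimal sketch of the standard DPO loss for a single preference pair. The function name, the argument names, and the default `beta` are assumptions for illustration; the page does not say how Trio builds its pairs or sets its hyperparameters.

```python
import math


def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair.

    Inputs are sequence log-probabilities under the trainable policy and
    under a frozen reference model. `beta` is an assumed default, not a
    value taken from the paper.
    """
    # Implicit reward margin: how much more the policy favors the chosen
    # molecule over the rejected one, relative to the reference model.
    margin = beta * (
        (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    )
    # -log(sigmoid(margin)) == softplus(-margin), computed without overflow.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

In this sketch the "chosen" molecule would be the one with the better QED/SA profile in a pair. The loss equals log 2 when the policy matches the reference and falls as the policy shifts probability toward the chosen molecule.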
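
Figures 1 and 5 describe MCTS with UCB over Selection-Expansion-Simulation-Backpropagation, starting from a [BOS] root and adding one fragment per layer. The sketch below shows only the selection and backpropagation steps, using the textbook UCB1 rule. The `Node` class, the exploration constant `c`, and the reward scale are assumptions; the page does not state which UCB variant or reward normalization Trio uses.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    fragment: str                 # fragment token added at this node
    visits: int = 0
    total_reward: float = 0.0     # sum of rewards, e.g. a docking-based score
    children: list = field(default_factory=list)


def ucb1(parent_visits, child, c=1.4):
    """Textbook UCB1: mean reward plus an exploration bonus."""
    if child.visits == 0:
        return math.inf           # always try unvisited children first
    mean = child.total_reward / child.visits
    return mean + c * math.sqrt(math.log(parent_visits) / child.visits)


def select_child(node, c=1.4):
    """Selection step: pick the child with the highest UCB1 score."""
    return max(node.children, key=lambda ch: ucb1(node.visits, ch, c))


def backpropagate(path, reward):
    """Backpropagation step: update every node on the root-to-leaf path."""
    for node in path:
        node.visits += 1
        node.total_reward += reward
```

A larger `c` favors rarely visited fragments (exploration of new chemotypes), while `c = 0` reduces selection to picking the best mean reward so far (exploitation of promising intermediates).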
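
The Figure 4 caption defines #Circles as the maximum number of molecules that can be selected so that every selected pair exceeds a minimum distance threshold. Finding that maximum exactly is a hard combinatorial problem, so the sketch below uses a simple greedy pass that yields a lower bound. Representing fingerprints as sets of on-bit indices and using Tanimoto distance are assumptions for illustration; the page does not specify the fingerprint, distance, or threshold used.

```python
def tanimoto_distance(fp_a, fp_b):
    """1 - Tanimoto similarity for fingerprints given as sets of on-bits."""
    union = len(fp_a | fp_b)
    if union == 0:
        return 0.0                # two empty fingerprints count as identical
    return 1.0 - len(fp_a & fp_b) / union


def greedy_circles(fingerprints, threshold):
    """Greedy lower bound on #Circles.

    Keeps a molecule only if its distance to every molecule already kept
    exceeds `threshold`. The result depends on input order and may be
    smaller than the true maximum.
    """
    selected = []
    for fp in fingerprints:
        if all(tanimoto_distance(fp, kept) > threshold for kept in selected):
            selected.append(fp)
    return len(selected)
```

For example, with toy fingerprints `{1,2,3}`, `{1,2,4}`, `{7,8,9}`, and a second copy of `{7,8,9}`, a threshold of 0.75 keeps two molecules, since the first two are only 0.5 apart and the duplicate adds nothing.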