Can LLMs Generate Diverse Molecules? Towards Alignment with Structural Diversity
Hyosoon Jang, Yunhui Jang, Jaehyung Kim, Sungsoo Ahn
TL;DR
This work addresses the limited structural diversity of LLM-generated molecule candidates with a two-stage fine-tuning framework (Div-SFT and Div-SFT+RL) in which the LLM autoregressively generates a sequence of structurally diverse molecules, each conditioned on the molecules generated before it. The first stage uses supervised fine-tuning to teach the model to produce such a sequence; the second applies PPO-based reinforcement learning to optimize molecule-wise diversity, using a diversity reward together with a description-matching reward. Empirical results on description-guided tasks show that the approach outperforms decoding-based diversification and existing LLM baselines on diversity metrics such as NCircles while maintaining quality. The framework also generalizes to generalist LLMs and to unseen prompts, which suggests it may be broadly applicable to designing diverse candidates in drug discovery, subject to computational cost and the need for real-world synthesis validation.
Abstract
Recent advances in large language models (LLMs) have led to impressive performance in molecular generation, offering the potential to accelerate drug discovery. However, current LLMs overlook a critical requirement for drug discovery: proposing a diverse set of molecules. This diversity is essential for improving the chances of finding a viable drug, as it provides alternative molecules that may succeed where others fail in real-world validations. Nevertheless, LLMs often output structurally similar molecules. While decoding schemes such as diverse beam search may enhance textual diversity, textual diversity often does not translate into molecular structural diversity. In response, we propose a new method for fine-tuning molecular generative LLMs to autoregressively generate a set of structurally diverse molecules, where each molecule is generated by conditioning on the previously generated molecules. Our approach consists of two stages: (1) supervised fine-tuning to adapt LLMs to autoregressively generate molecules in a sequence and (2) reinforcement learning to maximize structural diversity within the generated molecules. Our experiments show that the proposed approach enables LLMs to generate more diverse molecules than existing approaches for diverse sequence generation.
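
To make the notion of structural diversity within a generated set concrete, the following is a minimal sketch and not the authors' implementation. It assumes each molecule is represented as a set of fingerprint bit indices and that similarity is measured by Tanimoto similarity; the representation, the threshold value, and all function names (`tanimoto`, `greedy_dissimilar_count`, `diversity_reward`) are hypothetical choices made for illustration only.

```python
from typing import FrozenSet, List, Sequence

# Assumption: a molecule is summarized as a set of "on" fingerprint bits.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bits."""
    if not a and not b:
        return 1.0  # convention: two empty fingerprints count as identical
    return len(a & b) / len(a | b)


def greedy_dissimilar_count(fps: Sequence[Fingerprint], threshold: float = 0.75) -> int:
    """Greedily count molecules that are mutually dissimilar.

    A molecule is kept only if its similarity to every molecule kept so far
    is below `threshold`. This is a greedy approximation of a set-level
    diversity count, not an exact maximum.
    """
    kept: List[Fingerprint] = []
    for fp in fps:
        if all(tanimoto(fp, k) < threshold for k in kept):
            kept.append(fp)
    return len(kept)


def diversity_reward(new_fp: Fingerprint, previous_fps: Sequence[Fingerprint]) -> float:
    """Hypothetical per-molecule reward: 1 minus the highest similarity
    to any previously generated molecule (1.0 if there are none)."""
    if not previous_fps:
        return 1.0
    return 1.0 - max(tanimoto(new_fp, p) for p in previous_fps)


if __name__ == "__main__":
    a = frozenset({1, 2, 3, 4})
    b = frozenset({1, 2, 3, 5})   # shares 3 of 5 distinct bits with a
    c = frozenset({10, 11})       # shares nothing with a or b
    print(greedy_dissimilar_count([a, b, c], threshold=0.5))
    print(diversity_reward(b, [a]))
```

In this sketch, a reward of the second kind depends on the molecules generated earlier in the sequence, which mirrors the idea of conditioning each new molecule on the previous ones. In practice the fingerprints would come from a cheminformatics toolkit rather than hand-written sets.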
