AI-guided Antibiotic Discovery Pipeline from Target Selection to Compound Identification

Maximilian G. Schuh; Joshua Hesse; Stephan A. Sieber

TL;DR

The study presents an end-to-end, AI-guided antibiotic discovery pipeline that runs from structure-based target identification to compound realization. It clusters the predicted proteomes of multiple pathogens by structure to select targets and benchmarks six 3D-structure-aware generative models, offering practical guidance for integrating deep learning tools into early-stage antibiotic design. DeepBlock and TamGen emerge as the top performers across usability and output quality, and post-generation filtering combined with commercial analogue searches reduces over 100 000 generated compounds to a focused, synthesizable set. The work provides a comparative benchmark and blueprint for deploying AI in early-stage antibiotic discovery.

Abstract

Antibiotic resistance presents a growing global health crisis, demanding new therapeutic strategies that target novel bacterial mechanisms. Recent advances in protein structure prediction and machine learning-driven molecule generation offer a promising opportunity to accelerate drug discovery. However, practical guidance on selecting and integrating these models into real-world pipelines remains limited. In this study, we develop an end-to-end, artificial intelligence-guided antibiotic discovery pipeline that spans target identification to compound realization. We leverage structure-based clustering across predicted proteomes of multiple pathogens to identify conserved, essential, and non-human-homologous targets. We then systematically evaluate six leading 3D-structure-aware generative models (spanning diffusion, autoregressive, graph neural network, and language model architectures) on their usability, chemical validity, and biological relevance. Rigorous post-processing filters and commercial analogue searches reduce over 100 000 generated compounds to a focused, synthesizable set. Our results highlight DeepBlock and TamGen as top performers across diverse criteria, while also revealing critical trade-offs between model complexity, usability, and output quality. This work provides a comparative benchmark and blueprint for deploying artificial intelligence in early-stage antibiotic development.
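The target-selection criteria named in the abstract (conserved, essential, non-human-homologous) can be illustrated with a minimal sketch. This is not the authors' implementation: the `TargetCluster` record, its fields, the `select_targets` function, and the `min_coverage` threshold are all hypothetical names chosen for illustration, and the example data is invented purely to show the filter logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TargetCluster:
    """Hypothetical record for one structure-based protein cluster."""
    cluster_id: str
    pathogens_covered: int   # pathogens with at least one member in the cluster
    essential: bool          # assumed essentiality annotation
    has_human_homolog: bool  # assumed result of a homology check against human


def select_targets(clusters, n_pathogens, min_coverage=1.0):
    """Keep clusters that are conserved, essential, and lack a human homolog.

    min_coverage is an assumed threshold: the fraction of pathogens that
    must be represented for a cluster to count as conserved.
    """
    selected = []
    for c in clusters:
        conserved = c.pathogens_covered / n_pathogens >= min_coverage
        if conserved and c.essential and not c.has_human_homolog:
            selected.append(c)
    return selected


# Invented toy data, only to exercise the three criteria.
clusters = [
    TargetCluster("A", 5, True, False),   # passes all three criteria
    TargetCluster("B", 5, True, True),    # rejected: human homolog
    TargetCluster("C", 3, True, False),   # rejected: not conserved
    TargetCluster("D", 5, False, False),  # rejected: not essential
]
shortlist = select_targets(clusters, n_pathogens=5)
print([c.cluster_id for c in shortlist])
```

Relaxing `min_coverage` would admit partially conserved clusters such as "C"; what counts as conserved in the actual pipeline is a design decision this sketch does not attempt to reproduce.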
