Accelerating Antibiotic Discovery with Large Language Models and Knowledge Graphs
Maxime Delmas, Magdalena Wysocka, Danilo Gusicuma, André Freitas
TL;DR
Antibiotic discovery to counter antimicrobial resistance is costly and slow, and the rediscovery of known compounds is a major risk. The authors present an LLM-based alarm system, augmented by a Knowledge Graph (KG), that systematically detects prior evidence of antibiotic activity across organism and chemical literature while handling taxonomic and synonym resolution. The pipeline is demonstrated on a private set of 73 organisms, of which 12 negative hits are disclosed for evaluation. It achieves broad coverage of OL- and CL-evidence and ranks alerts as Strong, Medium, or Weak to guide review; a public release of the associated KG and UI is planned. The work highlights gaps in public literature coverage, demonstrates scalable semi-automatic literature review, and offers a reusable framework for evidence-driven target prioritization in antibiotic discovery.
Abstract
The discovery of novel antibiotics is critical to addressing growing antimicrobial resistance (AMR). However, the pharmaceutical industry faces high costs (over $1 billion), long timelines, and a high failure rate, all worsened by the rediscovery of known compounds. We propose an LLM-based pipeline that acts as an alarm system, detecting prior evidence of antibiotic activity to prevent costly rediscoveries. The system integrates organism and chemical literature into a Knowledge Graph (KG), ensuring taxonomic resolution, synonym handling, and multi-level evidence classification. We tested the pipeline on a private list of 73 potential antibiotic-producing organisms and disclose 12 negative hits for evaluation. The results highlight the effectiveness of the pipeline for reviewing evidence, reducing false negatives, and accelerating decision-making. The KG for the negative hits and the user interface for interactive exploration will be made publicly available.
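To make the alarm-system idea concrete, the following is a minimal sketch of how synonym resolution and tiered alerts could fit together. It is not the authors' implementation. The synonym table, the `Evidence` fields, and the rules that map evidence to Strong, Medium, or Weak are all assumptions made for illustration, since the abstract does not specify the classification criteria.

```python
from dataclasses import dataclass

# Hypothetical synonym table (illustrative entries only): maps normalized
# names found in the literature to one canonical taxon name.
SYNONYMS = {
    "streptomyces griseus": "Streptomyces griseus",
    "actinomyces griseus": "Streptomyces griseus",
}


@dataclass(frozen=True)
class Evidence:
    organism: str   # organism name as written in the source document
    source: str     # assumed values: "organism" or "chemical" literature
    explicit: bool  # True if antibiotic activity is stated directly


def canonical(name: str) -> str:
    """Normalize case and whitespace, then resolve known synonyms."""
    key = " ".join(name.lower().split())
    return SYNONYMS.get(key, key)


def alert_level(organism: str, evidence: list[Evidence]) -> str:
    """Return an alert tier for one organism.

    The tier rules below are assumptions for this sketch, not the
    criteria used in the paper.
    """
    target = canonical(organism)
    hits = [e for e in evidence if canonical(e.organism) == target]
    if not hits:
        return "None"
    if any(e.explicit and e.source == "organism" for e in hits):
        return "Strong"
    if any(e.explicit for e in hits):
        return "Medium"
    return "Weak"
```

In this sketch, evidence recorded under a synonym still raises an alert for the canonical organism, which is the failure mode that synonym handling is meant to prevent: a missed prior report filed under an older name.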
