Optimizing Gene-Based Testing for Antibiotic Resistance Prediction
David Hagerman, Anna Johnning, Roman Naeem, Fredrik Kahl, Erik Kristiansson, Lennart Svensson
TL;DR
Antibiotic resistance (AR) calls for rapid, cost-effective diagnostics. The authors propose GenoARM, a framework that combines reinforcement learning with a transformer-based AR predictor to optimize the selection of a small subset of PCR gene tests, leveraging metadata to boost accuracy. Empirical results across five pathogens show that incorporating metadata substantially improves performance, with GenoARM often achieving the best predictions under a 5-gene test budget; RandEvolve remains a strong baseline. The work demonstrates that near-full-genome predictive power can be achieved with a small, carefully chosen gene panel, which has practical implications for clinical diagnostics and resource-constrained settings.
Abstract
Antibiotic Resistance (AR) is a critical global health challenge that necessitates the development of cost-effective, efficient, and accurate diagnostic tools. Given the genetic basis of AR, techniques such as Polymerase Chain Reaction (PCR) that target specific resistance genes offer a promising approach for predictive diagnostics using a limited set of key genes. This study introduces GenoARM, a novel framework that integrates reinforcement learning (RL) with transformer-based models to optimize the selection of PCR gene tests and to improve AR predictions by leveraging observed metadata. In our evaluation, we developed several high-performing baselines and compared them using publicly available datasets derived from real-world bacterial samples representing multiple clinically relevant pathogens. The results show that all evaluated methods achieve strong and reliable performance when metadata is not utilized. When metadata is introduced and the number of selected genes increases, GenoARM demonstrates superior performance due to its capacity to approximate rewards for unseen and sparse combinations. Overall, our framework advances the optimization of diagnostic tools for AR in clinical settings.
