Bayesian-Guided Generation of Synthetic Microbiomes with Minimized Pathogenicity
Nisha Pillai, Bindu Nanduri, Michael J Rothrock, Zhiqian Chen, Mahalingam Ramkumar
TL;DR
The paper tackles the challenge of mitigating multidrug resistance (MDR) by generating synthetic microbiomes with minimized pathogenic potential. It integrates an autoencoder-based latent-space representation with Bayesian optimization to efficiently search microbiome designs, using Gaussian process models and acquisition functions (EI, UCB, PI, TS) to guide decoding of latent samples toward low-MDR predictions. Diversity-driven sampling via a determinantal point process and oversampling with SMOTE improve learning under limited data, while the latent-space decoder enables rapid generation and screening of synthetic microbiomes. The approach demonstrates reduced search iterations and identifies microbial taxa associated with lowered MDR, offering a pathway to bespoke microbiome designs with potential applications in poultry health and food safety.
Abstract
Synthetic microbiomes offer new possibilities for modulating microbiota and for addressing barriers in multidrug resistance (MDR) research. We present a Bayesian optimization approach that enables efficient search over the space of synthetic microbiome variants to identify candidates predictive of reduced MDR. Microbiome datasets were encoded into a low-dimensional latent space using autoencoders, and sampling from this space allowed generation of synthetic microbiome signatures. Bayesian optimization was then used to select variants for biological screening, with the aim of maximizing the identification of designs with restricted MDR pathogens from a minimal number of samples. Four acquisition functions were evaluated: expected improvement, upper confidence bound, Thompson sampling, and probability of improvement. Under each strategy, synthetic samples were prioritized according to their MDR detection. Expected improvement, upper confidence bound, and probability of improvement consistently identified synthetic microbiome candidates in significantly fewer searches than Thompson sampling. By combining deep latent-space mapping with Bayesian learning for efficient guided screening, this study demonstrates the feasibility of creating bespoke synthetic microbiomes with customized MDR profiles.
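To make the role of the four acquisition functions concrete, the following is a minimal sketch, not the authors' implementation. It assumes that a surrogate model (for example a Gaussian process) has already produced a posterior mean and standard deviation of the predicted MDR score for each decoded latent sample, and that lower scores are better. The function names, the parameters `xi` and `kappa`, the candidate labels, and the numbers are all hypothetical. The upper confidence bound is written in its minimization form (a negated lower bound), and the Thompson draw samples each candidate independently, which is a simplification of sampling from the joint posterior.

```python
import math
import random


def _pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _cdf(z):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mu, sigma, best, xi=0.0):
    """EI for minimization: expected amount by which a candidate beats `best`."""
    gap = best - mu - xi
    if sigma <= 0.0:
        return max(gap, 0.0)
    z = gap / sigma
    return gap * _cdf(z) + sigma * _pdf(z)


def probability_of_improvement(mu, sigma, best, xi=0.0):
    """PI for minimization: probability that a candidate beats `best`."""
    if sigma <= 0.0:
        return 1.0 if mu < best - xi else 0.0
    return _cdf((best - mu - xi) / sigma)


def confidence_bound(mu, sigma, kappa=2.0):
    """UCB in minimization form: negated lower bound, so larger is better."""
    return -(mu - kappa * sigma)


def thompson_draw(mu, sigma, rng):
    """One posterior draw per candidate (independent-marginal simplification)."""
    return -rng.gauss(mu, sigma)


def pick_next(posterior, best, strategy="ei", rng=None):
    """Return the candidate label with the highest acquisition score.

    `posterior` maps a hypothetical candidate label to (mean, std) of its
    predicted MDR score; `best` is the lowest score observed so far.
    """
    rng = rng or random.Random(0)
    scorers = {
        "ei": lambda m, s: expected_improvement(m, s, best),
        "pi": lambda m, s: probability_of_improvement(m, s, best),
        "ucb": lambda m, s: confidence_bound(m, s),
        "ts": lambda m, s: thompson_draw(m, s, rng),
    }
    score = scorers[strategy]
    return max(posterior, key=lambda name: score(*posterior[name]))


# Illustrative posterior for three decoded latent samples (made-up numbers).
posterior = {"a": (0.40, 0.05), "b": (0.50, 0.30), "c": (0.45, 0.01)}
best_so_far = 0.42

for strategy in ("ei", "pi", "ucb", "ts"):
    print(strategy, pick_next(posterior, best_so_far, strategy))
```

In this toy posterior, probability of improvement favors the candidate whose mean is already slightly below the incumbent, whereas expected improvement and the confidence bound favor the more uncertain candidate. That difference in how uncertainty is weighed is what separates the strategies in a guided search.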
