OmegAMP: Targeted AMP Discovery through Biologically Informed Generation

Diogo Soares, Leon Hetzel, Paulina Szymczak, Marcelo Der Torossian Torres, Johanna Sommer, Cesar de la Fuente-Nunez, Fabian Theis, Stephan Günnemann, Ewa Szczurek

TL;DR

OmegAMP targets antimicrobial resistance with a diffusion-based AMP generator that combines a biologically informed embedding with a flexible conditioning scheme for controlling key physicochemical properties and species-specific activity. It pairs this generator with classifiers trained using synthetic negative examples, which drastically reduce false positives and narrow the gap between in silico design and wet-lab validation. The approach achieves state-of-the-art performance in generation and classification, and 24 of 25 tested peptides (96%) showed antimicrobial activity, including against multi-drug resistant (MDR) pathogens. This combination of targeted generation, strict filtering, and empirical validation underscores OmegAMP's potential to accelerate the discovery of therapeutically relevant AMPs.

Abstract

Deep learning-based antimicrobial peptide (AMP) discovery faces critical challenges such as limited controllability, lack of representations that efficiently model antimicrobial properties, and low experimental hit rates. To address these challenges, we introduce OmegAMP, a framework designed for reliable AMP generation with increased controllability. Its diffusion-based generative model leverages a novel conditioning mechanism to achieve fine-grained control over desired physicochemical properties and to direct generation towards specific activity profiles, including species-specific effectiveness. This is further enhanced by a biologically informed encoding space that significantly improves overall generative performance. Complementing these generative capabilities, OmegAMP leverages a novel synthetic data augmentation strategy to train classifiers for AMP filtering, drastically reducing false positive rates and thereby increasing the likelihood of experimental success. Our in silico experiments demonstrate that OmegAMP delivers state-of-the-art performance across key stages of the AMP discovery pipeline, enabling us to achieve an unprecedented success rate in wet lab experiments. Of the 25 candidate peptides we tested, 24 (96%) demonstrated antimicrobial activity, proving effective even against multi-drug resistant strains. Our findings underscore OmegAMP's potential to significantly advance computational frameworks in the fight against antimicrobial resistance.

Paper Structure

This paper contains 74 sections, 17 equations, 5 figures, 18 tables, 2 algorithms.

Figures (5)

  • Figure 1: OmegAMP provides practitioners the ability to generate AMPs conditioned on key physicochemical properties, like length, charge, and hydrophobicity. Our generative model enables more complex objective targeting via Property and Subset conditioning.
  • Figure 2: a) Subset conditioning shows that generating sequences based on those active against a specific species increases the likelihood of producing active sequences. b) Property conditioning reliably generates peptides with charge and hydrophobicity values that approximate the pre-specified target.
  • Figure 3: AMP success rate across various MIC thresholds for OmegAMP and baseline methods.
  • Figure 4: Empirical distributions of physicochemical (charge, hydrophobicity) and model-derived (fitness score, pseudo-perplexity) characteristics for natural EV AMPs, EV non-AMPs, and synthetic sequences. Natural AMPs display higher fitness scores and lower pseudo-perplexity than the synthetic sequences.
  • Figure 5: Amino acid frequency distribution comparison between OmegAMP-generated sequences and AMP training data. The close alignment shows that OmegAMP captures key AMP sequence features, ensuring biologically relevant generation.
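
The figure captions above refer to length, charge, hydrophobicity, and amino acid frequency as the quantities used for conditioning and for comparing generated sequences with training data. The following is a minimal sketch of how such descriptors could be computed for a peptide sequence. It is not the authors' implementation: the Kyte-Doolittle hydropathy scale, the simplified charge rule (+1 for K/R, -1 for D/E), the function names, and the example sequence are all assumptions chosen for illustration, since this section does not state which definitions the paper uses.

```python
from collections import Counter

# Kyte-Doolittle hydropathy scale.
# ASSUMPTION: the paper may use a different hydrophobicity scale.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def net_charge(seq: str) -> int:
    """Crude net charge: +1 per K/R, -1 per D/E.

    ASSUMPTION: ignores histidine, terminal groups, and pH dependence.
    """
    positive = sum(seq.count(aa) for aa in "KR")
    negative = sum(seq.count(aa) for aa in "DE")
    return positive - negative


def mean_hydrophobicity(seq: str) -> float:
    """Average Kyte-Doolittle hydropathy over all residues."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def describe(seq: str) -> dict:
    """Collect the three descriptors named in the Figure 1 caption."""
    return {
        "length": len(seq),
        "charge": net_charge(seq),
        "hydrophobicity": mean_hydrophobicity(seq),
    }


def aa_frequencies(seqs: list[str]) -> dict:
    """Relative amino acid frequencies pooled over a set of sequences."""
    counts = Counter("".join(seqs))
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}


if __name__ == "__main__":
    example = "KKLLKLLKKLLKLL"  # hypothetical sequence, for illustration only
    print(describe(example))
    print(aa_frequencies([example]))
```

Comparing the output of `aa_frequencies` for a generated set against the same statistic for a training set gives the kind of side-by-side view described for Figure 5, while `describe` yields values that could be checked against a pre-specified conditioning target as in Figure 2b.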

Theorems & Definitions (2)

  • Definition 1
  • Definition 2