SafeGenes: Evaluating the Adversarial Robustness of Genomic Foundation Models

Huixin Zhan, Clovis Barbour, Jason H. Moore

TL;DR

SafeGenes introduces a dual adversarial framework to evaluate the robustness of genomic foundation models in variant effect prediction, using FGSM perturbations and embedding-space soft prompts. The study demonstrates pervasive vulnerabilities across CM and ARM tasks, with targeted soft prompts causing the largest degradation and even high-capacity models remaining susceptible. Cross-model transferability of adversarial prompts and gene-level fragility (e.g., MYH7, MYBPC3) reveal latent weaknesses that are not captured by conventional robustness tests. The findings argue for integrating latent-space monitoring and adversarial stress-testing into the development and deployment of genomic AI systems to ensure clinically reliable interpretations.

Abstract

Genomic Foundation Models (GFMs), such as Evolutionary Scale Modeling (ESM), have demonstrated significant success in variant effect prediction. However, their adversarial robustness remains largely unexplored. To address this gap, we propose SafeGenes: a framework for Secure analysis of genomic foundation models, leveraging adversarial attacks to evaluate robustness against both engineered near-identical adversarial Genes and embedding-space manipulations. In this study, we assess the adversarial vulnerabilities of GFMs using two approaches: the Fast Gradient Sign Method (FGSM) and a soft prompt attack. FGSM introduces minimal perturbations to input sequences, while the soft prompt attack optimizes continuous embeddings to manipulate model predictions without modifying the input tokens. By combining these techniques, SafeGenes provides a comprehensive assessment of GFM susceptibility to adversarial manipulation. Targeted soft prompt attacks induced severe degradation in MLM-based shallow architectures such as ProteinBERT, while still producing substantial failure modes even in high-capacity foundation models such as ESM1b and ESM1v. These findings expose critical vulnerabilities in current foundation models, opening new research directions toward improving their security and robustness in high-stakes genomic applications such as variant effect prediction.
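The FGSM step described in the abstract can be illustrated with a minimal sketch. It is not the SafeGenes implementation: it assumes a toy logistic classifier over a fixed-length embedding vector as a stand-in for a GFM, and the names `fgsm`, `bce_loss`, `w`, `b`, and `epsilon` are hypothetical. The only part taken from the standard method is the update rule, which moves the input by `epsilon` times the sign of the loss gradient.

```python
import math
import random


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def logit(x, w, b):
    # Linear score of the toy classifier (assumed stand-in for a GFM head).
    return sum(xi * wi for xi, wi in zip(x, w)) + b


def bce_loss(x, y, w, b):
    # Binary cross-entropy of the toy classifier for label y in {0, 1}.
    p = sigmoid(logit(x, w, b))
    p = min(max(p, 1e-12), 1.0 - 1e-12)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))


def sign(v: float) -> int:
    return (v > 0) - (v < 0)


def fgsm(x, y, w, b, epsilon):
    # For logistic regression, d(loss)/dx = (p - y) * w in closed form.
    p = sigmoid(logit(x, w, b))
    grad = [(p - y) * wi for wi in w]
    # FGSM: step of size epsilon along the sign of the gradient.
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]


random.seed(0)
dim = 16                                       # hypothetical embedding size
w = [random.gauss(0.0, 1.0) for _ in range(dim)]
b = 0.0
x = [random.gauss(0.0, 1.0) for _ in range(dim)]
y = 1.0                                        # hypothetical "pathogenic" label

x_adv = fgsm(x, y, w, b, epsilon=0.05)
clean_loss = bce_loss(x, y, w, b)
adv_loss = bce_loss(x_adv, y, w, b)
max_shift = max(abs(a - c) for a, c in zip(x_adv, x))
```

In this toy setting the perturbation is bounded by `epsilon` in every coordinate, yet the loss on the true label rises. That is the property FGSM exploits, whatever the underlying model.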

Paper Structure

This paper contains 23 sections, 3 equations, 32 figures, 9 tables.

Figures (32)

  • Figure 1: Illustration of adversarial sensitivity in GFMs for variant effect prediction. (a) Conceptual schematic showing wild-type and variant embeddings in GFM representation space. Pathogenic variants are expected to lie far from the wild-type, while benign variants are expected to lie close to it. (b) Overview of pseudo-log-likelihood ratio (PLLR) computation using a language model. The pseudo-log-likelihoods (PLLs) of the wild-type and mutant sequences are compared to infer the variant label. (c) FGSM attack: perturbations to model embeddings falsify pathogenicity predictions, shifting benign variants to appear pathogenic and vice versa. (d) Soft prompt attack: structured prompts induce the model to shift decision boundaries, leading to adversarial misclassification even when the original prediction is correct.
  • Figure 5: Performance under different adversarial attack methods across CM and ARM datasets. (a) AUC on CM. (b) AUPR on CM. (c) AUC on ARM. (d) AUPR on ARM. FGSM, SPA_Confidence Hijack, and SPA_Targeted Attack all degrade model performance, with the targeted attack showing the strongest effect.
  • Figure (unnumbered): (a) AUC vs. training sample size.
  • Figure (unnumbered): (a) ROC curves under clean vs. FGSM conditions.
  • Figure (unnumbered): (a)
  • ...and 27 more figures