Hammering the Diagnosis: Rowhammer-Induced Stealthy Trojan Attacks on ViT-Based Medical Imaging

Banafsheh Saber Latibari, Najmeh Nazari, Hossein Sayadi, Houman Homayoun, Abhijit Mahalanobis

TL;DR

This work studies a hardware-level attack on Vision Transformer-based medical imaging by introducing Med-Hammer, a threat model that uses Rowhammer-induced bit flips to implant neural Trojans in deployed ViT models. It demonstrates that malicious weight perturbations cause targeted misdiagnosis only when an adversary-chosen trigger is present, so diagnostic performance on benign inputs is preserved and the manipulation stays stealthy across the ISIC, Brain Tumor, and MedMNIST datasets. Experiments report high attack success rates (about $82.51\%$ for MobileViT and $92.56\%$ for Swin Transformer) and show how architectural choices, such as which attention or classifier layers are targeted, influence vulnerability. The work also discusses defenses such as bit-flip-aware training, quantization, and neural architecture search (NAS) for resiliency. The findings highlight a critical hardware-software security gap in healthcare AI and call for cross-layer defenses spanning both neural architectures and the underlying hardware to ensure reliable, safe clinical deployment.

Abstract

Vision Transformers (ViTs) have emerged as powerful architectures in medical image analysis, excelling in tasks such as disease detection, segmentation, and classification. However, their reliance on large, attention-driven models makes them vulnerable to hardware-level attacks. In this paper, we propose a novel threat model, referred to as Med-Hammer, that combines Rowhammer hardware fault injection with neural Trojan attacks to compromise the integrity of ViT-based medical imaging systems. Specifically, we demonstrate how malicious bit flips induced via Rowhammer can trigger implanted neural Trojans, leading to targeted misclassification or suppression of critical diagnoses (e.g., tumors or lesions) in medical scans. Through extensive experiments on benchmark medical imaging datasets such as ISIC, Brain Tumor, and MedMNIST, we show that such attacks can remain stealthy while achieving high attack success rates of about 82.51% and 92.56% on MobileViT and Swin Transformer, respectively. We further investigate how architectural properties, such as model sparsity, attention weight distribution, and the number of features per layer, impact attack effectiveness. Our findings highlight a critical and underexplored intersection between hardware-level faults and deep learning security in healthcare applications, underscoring the urgent need for robust defenses spanning both model architectures and underlying hardware platforms.
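
To make the notion of a weight-level bit flip concrete, the following is a minimal sketch of how inverting one bit changes a stored parameter. It assumes weights are held as IEEE 754 float32 values, which the abstract does not state; the helper name `flip_bit_float32` and the example weight are hypothetical and not taken from the paper. It only simulates the arithmetic effect in software and does not perform Rowhammer.

```python
import struct


def flip_bit_float32(value: float, bit: int) -> float:
    """Return `value`, encoded as IEEE 754 float32, with one bit inverted.

    `bit` is counted from the least significant bit:
    0-22 are mantissa bits, 23-30 are exponent bits, 31 is the sign bit.
    """
    if not 0 <= bit <= 31:
        raise ValueError("bit must be in the range 0..31")
    # Reinterpret the float32 encoding as an unsigned 32-bit integer.
    (raw,) = struct.unpack("<I", struct.pack("<f", value))
    raw ^= 1 << bit  # invert the selected bit
    # Reinterpret the modified bit pattern as a float32 again.
    (flipped,) = struct.unpack("<f", struct.pack("<I", raw))
    return flipped


# Hypothetical weight value, chosen because it is exactly representable.
weight = 0.5

mantissa_flip = flip_bit_float32(weight, 0)   # tiny change in the last place
exponent_flip = flip_bit_float32(weight, 30)  # top exponent bit: huge change
sign_flip = flip_bit_float32(weight, 31)      # sign bit: value is negated

print(f"original      : {weight!r}")
print(f"mantissa bit 0: {mantissa_flip!r}")
print(f"exponent bit30: {exponent_flip!r}")
print(f"sign bit 31   : {sign_flip!r}")
```

Under this assumption, the position of the flipped bit matters far more than the number of flips: a low mantissa bit barely moves the weight, while the top exponent bit changes it by many orders of magnitude. Quantized integer weights would behave differently and are not covered by this sketch.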

Paper Structure

This paper contains 20 sections, 1 equation, 6 figures, 4 tables.

Figures (6)

  • Figure 1: Comparison of attack surfaces in ViT-based medical imaging. Previous works primarily target the input space through adversarial perturbations, while our work introduces a novel hardware-level threat by exploiting Rowhammer-induced bit flips in memory, enabling stealthy Trojan implantation without altering the input scan.
  • Figure 2: Illustration of the Rowhammer effect in DRAM. Repeated activation of aggressor rows induces electrical interference in adjacent victim rows, leading to deterministic bit flips.
  • Figure 3: Overview of Med-Hammer: Workflow of Trojan implantation via Rowhammer-induced bit flips. A clean ViT is first stored in memory with intact weights. The adversary exploits the Rowhammer effect to flip selected bits in DRAM, altering parameters in attention or embedding layers. The resulting Trojaned ViT preserves normal diagnostic accuracy for benign inputs but produces targeted misdiagnoses when presented with adversary-chosen triggers.
  • Figure 4: A clean ViT model can extract the attentive regions.
  • Figure 5: Layer-wise accuracy degradation (%) in MobileViT after injecting 20 targeted bit-flips.
  • ...and 1 more figure