Hammering the Diagnosis: Rowhammer-Induced Stealthy Trojan Attacks on ViT-Based Medical Imaging
Banafsheh Saber Latibari, Najmeh Nazari, Hossein Sayadi, Houman Homayoun, Abhijit Mahalanobis
TL;DR
This work studies a hardware-level attack on Vision Transformer-based medical imaging by introducing Med-Hammer, a threat model that uses Rowhammer-induced bit flips to implant neural Trojans in deployed ViT models. The approach demonstrates that malicious weight perturbations can cause targeted misdiagnosis only when a trigger is present, preserving diagnostic performance on benign inputs and enabling stealthy manipulation across the ISIC, Brain Tumor, and MedMNIST datasets. Extensive experiments show high attack success rates (e.g., up to $82.51\%$ for MobileViT and $92.56\%$ for Swin) and show how architectural choices (such as attention and classifier layers) influence vulnerability. The work also presents defenses for resiliency, including bit-flip-aware training, quantization, and NAS. The findings highlight a critical hardware–software security gap in healthcare AI and call for cross-layer defenses spanning both neural architectures and underlying hardware to ensure reliable, safe clinical deployment.
Abstract
Vision Transformers (ViTs) have emerged as powerful architectures in medical image analysis, excelling in tasks such as disease detection, segmentation, and classification. However, their reliance on large, attention-driven models makes them vulnerable to hardware-level attacks. In this paper, we propose a novel threat model, referred to as Med-Hammer, that combines Rowhammer hardware fault injection with neural Trojan attacks to compromise the integrity of ViT-based medical imaging systems. Specifically, we demonstrate how malicious bit flips induced via Rowhammer can trigger implanted neural Trojans, leading to targeted misclassification or suppression of critical diagnoses (e.g., tumors or lesions) in medical scans. Through extensive experiments on benchmark medical imaging datasets such as ISIC, Brain Tumor, and MedMNIST, we show that such attacks can remain stealthy while achieving high attack success rates of about 82.51% and 92.56% on MobileViT and SwinTransformer, respectively. We further investigate how architectural properties, such as model sparsity, attention weight distribution, and the number of features in a layer, affect attack effectiveness. Our findings highlight a critical and underexplored intersection between hardware-level faults and deep learning security in healthcare applications, underscoring the urgent need for robust defenses spanning both model architectures and underlying hardware platforms.
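To make concrete why a single Rowhammer-induced bit flip in a stored weight matters, the following minimal sketch inverts one bit of an IEEE 754 float32 value. It is an illustration only, not the Med-Hammer procedure: the helper name `flip_bit_float32`, the example weight of 0.5, and the assumption that weights are stored as float32 are all hypothetical choices made for this example.

```python
import struct


def flip_bit_float32(value: float, bit: int) -> float:
    """Return `value`, encoded as IEEE 754 float32, with one bit inverted.

    Bit 0 is the least significant mantissa bit, bits 23-30 are the
    exponent, and bit 31 is the sign. (Hypothetical helper for illustration.)
    """
    if not 0 <= bit < 32:
        raise ValueError("bit must be in [0, 31]")
    # Reinterpret the float32 encoding as an unsigned 32-bit integer.
    (raw,) = struct.unpack("<I", struct.pack("<f", value))
    # XOR with a one-hot mask inverts exactly one bit.
    (flipped,) = struct.unpack("<f", struct.pack("<I", raw ^ (1 << bit)))
    return flipped


weight = 0.5  # assumed example weight, exactly representable in float32

mantissa_flip = flip_bit_float32(weight, 0)   # tiny change: 0.5 + 2**-24
exponent_flip = flip_bit_float32(weight, 30)  # huge change: 2**127
sign_flip = flip_bit_float32(weight, 31)      # sign change: -0.5

print(mantissa_flip, exponent_flip, sign_flip)
```

The effect depends heavily on which bit is hit: a flip in the low mantissa bits barely moves the weight, while a flip in the most significant exponent bit changes its magnitude by many orders. This is one reason a small number of well-placed flips can alter model behavior.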
