The Double-Edged Sword of Data-Driven Super-Resolution: Adversarial Super-Resolution Models
Haley Duba-Sullivan, Steven R. Young, Emma J. Reid
TL;DR
AdvSR exposes a model-level vulnerability in data-driven super-resolution (SR) pipelines: a targeted adversarial objective is embedded directly into the SR weights, so no test-time input manipulation is needed. The method optimizes a combined objective $\mathcal{L}_\phi = \mathcal{L}_{AdvCE} + \lambda \mathcal{L}_{SR}$, where the adversarial term $\mathcal{L}_{AdvCE}$ uses modified labels to make the downstream classifier misclassify a source class $s$ as a target class $t$ while preserving predictions for the other classes, and the weight $\lambda$ is set by $\lambda = r \cdot \frac{\mathcal{L}_{AdvCE}^{(0)}}{\mathcal{L}_{SR}^{(0)}}$. Experiments on SRCNN, EDSR, and SwinIR with a YOLOv11 downstream classifier show that AdvSR can reach a targeted attack success rate (Targeted-ASR) of up to ~82% with minimal degradation in PSNR/SSIM and high non-source accuracy, especially for high-capacity SR models such as SwinIR. This work highlights a new supply-chain and model-robustness threat in safety-critical imaging pipelines and motivates defenses and broader evaluations across architectures and data distributions.
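To make the combined objective concrete, the following is a minimal sketch of the label modification and loss weighting described above, not the authors' implementation. It assumes that the superscript $(0)$ denotes loss values measured once at the start of training and that $r$ is a user-chosen ratio; the function names `modify_labels`, `balance_lambda`, and `combined_loss` are hypothetical.

```python
def modify_labels(labels, source, target):
    """Relabel every source-class sample as the target class.

    All other labels are left unchanged, so the adversarial
    cross-entropy still rewards correct predictions for them.
    """
    return [target if y == source else y for y in labels]


def balance_lambda(adv_ce_0, sr_0, r):
    """Compute lambda = r * L_AdvCE^(0) / L_SR^(0).

    Assumption: adv_ce_0 and sr_0 are the two loss values
    measured at the start of training.
    """
    if sr_0 <= 0:
        raise ValueError("initial SR loss must be positive")
    return r * adv_ce_0 / sr_0


def combined_loss(adv_ce, sr, lam):
    """Combined objective L_phi = L_AdvCE + lambda * L_SR."""
    return adv_ce + lam * sr


# Hypothetical numbers, for illustration only.
labels = modify_labels([0, 1, 2, 1], source=1, target=3)
lam = balance_lambda(adv_ce_0=2.0, sr_0=0.5, r=0.1)
loss = combined_loss(adv_ce=1.5, sr=0.25, lam=lam)
```

With this weighting, the two terms start at a fixed ratio: $\lambda \mathcal{L}_{SR}^{(0)} = r \cdot \mathcal{L}_{AdvCE}^{(0)}$, so $r$ controls how strongly reconstruction quality is weighted relative to the adversarial term regardless of the raw scale of either loss.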
Abstract
Data-driven super-resolution (SR) methods are often integrated into imaging pipelines as preprocessing steps to improve downstream tasks such as classification and detection. However, these SR models introduce a previously unexplored attack surface into imaging pipelines. In this paper, we present AdvSR, a framework demonstrating that adversarial behavior can be embedded directly into SR model weights during training, requiring no access to inputs at inference time. Unlike prior attacks that perturb inputs or rely on backdoor triggers, AdvSR operates entirely at the model level. By jointly optimizing for reconstruction quality and targeted adversarial outcomes, AdvSR produces models that appear benign under standard image quality metrics while inducing downstream misclassification. We evaluate AdvSR on three SR architectures (SRCNN, EDSR, SwinIR) paired with a YOLOv11 classifier and demonstrate that AdvSR models can achieve high attack success rates with minimal quality degradation. These findings highlight a new model-level threat for imaging pipelines, with implications for how practitioners source and validate models in safety-critical applications.
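As an illustration of how attack success could be measured from downstream classifier outputs, the sketch below computes a targeted attack success rate and a non-source accuracy. The exact metric definitions are assumptions made for this example (fraction of source-class samples predicted as the target class, and accuracy over all samples outside the source class), and the function names are hypothetical.

```python
def targeted_asr(true_labels, predictions, source, target):
    """Fraction of source-class samples predicted as the target class."""
    source_preds = [p for y, p in zip(true_labels, predictions) if y == source]
    if not source_preds:
        raise ValueError("no source-class samples to evaluate")
    return sum(p == target for p in source_preds) / len(source_preds)


def non_source_accuracy(true_labels, predictions, source):
    """Accuracy over samples whose true class is not the source class."""
    pairs = [(y, p) for y, p in zip(true_labels, predictions) if y != source]
    if not pairs:
        raise ValueError("no non-source samples to evaluate")
    return sum(y == p for y, p in pairs) / len(pairs)


# Hypothetical predictions on super-resolved images, for illustration only.
true_labels = [0, 0, 0, 0, 1, 2, 2]
predictions = [3, 3, 3, 0, 1, 2, 1]
asr = targeted_asr(true_labels, predictions, source=0, target=3)
acc = non_source_accuracy(true_labels, predictions, source=0)
```

A model that appears benign would combine a high value of the first metric with a non-source accuracy and image quality scores (such as PSNR/SSIM) close to those of an unmodified SR model.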
