Enhancing Adversarial Robustness of Vision-Language Models through Low-Rank Adaptation

Yuheng Ji, Yue Liu, Zhicheng Zhang, Zhao Zhang, Yuting Zhao, Xiaoshuai Hao, Gang Zhou, Xingwei Zhang, Xiaolong Zheng

TL;DR

Vision-Language Models face security risks when adapting to new tasks, especially under adversarial inputs. The authors propose AdvLoRA, a parameter-efficient adversarial adaptation method built on Low-Rank Adaptation with clustering-based reparameterization and adaptive update, to improve robustness with far fewer tunable parameters. Across image-text and video-text retrieval benchmarks, AdvLoRA outperforms full fine-tuning and other PEFT baselines under PGD-type attacks, while revealing a low-rank structure in adversarial adaptation. This work offers a scalable defense strategy for VLMs relevant to AGI security and suggests directions for memory-efficient adversarial training.

Abstract

Vision-Language Models (VLMs) play a crucial role in the advancement of Artificial General Intelligence (AGI). As AGI rapidly evolves, addressing security concerns has emerged as one of the most significant challenges for VLMs. In this paper, we present extensive experiments that expose the vulnerabilities of conventional adaptation methods for VLMs, highlighting significant security risks. Moreover, as VLMs grow in size, the application of traditional adversarial adaptation techniques incurs substantial computational costs. To address these issues, we propose a parameter-efficient adversarial adaptation method called AdvLoRA, based on Low-Rank Adaptation. We investigate and reveal the inherent low-rank properties involved in adversarial adaptation for VLMs. Unlike LoRA, we enhance the efficiency and robustness of adversarial adaptation by introducing a novel reparameterization method that leverages parameter clustering and alignment. Additionally, we propose an adaptive parameter update strategy to further bolster robustness. These innovations enable AdvLoRA to mitigate issues related to model security and resource wastage. Extensive experiments confirm the effectiveness and efficiency of AdvLoRA.
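
For readers unfamiliar with the Low-Rank Adaptation that AdvLoRA builds on, the following is a minimal sketch of the standard LoRA idea only: a frozen weight W is augmented with a trainable product B·A of rank r. It does not implement AdvLoRA's clustering-based reparameterization or its adaptive update, which the abstract does not specify in detail. The function names, the 768×768 layer size, and rank 8 are illustrative assumptions, not values taken from the paper.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_init(d_out, d_in, rank, seed=0):
    """Standard LoRA initialisation (an assumption here, not AdvLoRA's scheme).

    B (d_out x rank) starts at zero and A (rank x d_in) is small Gaussian noise,
    so the adapted layer initially behaves exactly like the frozen one.
    """
    rng = random.Random(seed)
    B = [[0.0] * rank for _ in range(d_out)]
    A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
    return B, A


def effective_weight(W, B, A, scale=1.0):
    """Return W + scale * (B @ A); only B and A would be trained."""
    delta = matmul(B, A)
    return [[w + scale * d for w, d in zip(rw, rd)] for rw, rd in zip(W, delta)]


def count_params(d_out, d_in, rank):
    """Compare trainable parameters: full fine-tuning vs. a rank-r adapter."""
    return {"full": d_out * d_in, "lora": rank * (d_out + d_in)}


# Hypothetical layer size and rank, chosen only for illustration.
print(count_params(768, 768, 8))  # {'full': 589824, 'lora': 12288}

# Tiny worked example: at initialisation the adapter leaves W unchanged.
W = [[1.0] * 6 for _ in range(4)]
B, A = lora_init(d_out=4, d_in=6, rank=2)
print(effective_weight(W, B, A) == W)  # True
```

Under these assumed sizes the adapter trains about 2% of the layer's parameters, which is the kind of saving that makes adversarial training of large VLMs cheaper.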

Paper Structure (11 sections, 13 equations, 5 figures, 10 tables, 1 algorithm)

This paper contains 11 sections, 13 equations, 5 figures, 10 tables, 1 algorithm.

Figures (5)

  • Figure 1: Vulnerability of vision-language model adaptation methods to natural and adversarial data on the MSCOCO (image-text data; Lin et al., 2014) and MSR-VTT (video-text data; Xu et al., 2016) datasets.
  • Figure 2: Adversarial robustness and number of tunable parameters of adversarial adaptation methods on two datasets.
  • Figure 3: Sensitivity Analysis.
  • Figure 4: Loss Analysis.
  • Figure 5: Case study on MSR-VTT. We sample and visualize eight frames from the videos. Frames marked with a devil icon are under adversarial attack.