Who Can See Through You? Adversarial Shielding Against VLM-Based Attribute Inference Attacks

Yucheng Fan, Jiawei Chen, Yu Tian, Zhaoxia Yin

TL;DR

The paper tackles the privacy risks of VLM-based attribute inference from user-shared images by proposing an input-level joint protection framework that balances privacy suppression with utility preservation under a visual-consistency constraint. It introduces VPI-COCO, a public benchmark with hierarchical privacy questions and non-privacy counterparts, which allows protection methods to be evaluated on privacy and utility together. Empirical results across multiple vision-language models show that the proposed method achieves strong privacy protection (low PAR) while maintaining high utility (high NPAR) and visual fidelity (PSNR/SSIM), with analyses of transferability and attribute-level behavior. The work provides practical protection for real-world VLM deployments and supplies reproducible evaluation resources, though cross-model generalization remains an open area for future work.
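
For readers unfamiliar with the visual-fidelity metric named above, PSNR compares an original image with its protected version through the mean squared error. The following is a minimal sketch of the standard PSNR definition only, assuming images are given as flat lists of pixel intensities with a peak value of 255. The function name `psnr` and its parameters are illustrative and are not taken from the paper.

```python
import math


def psnr(original, protected, max_val=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists.

    Assumption: pixels are flat sequences of numbers in [0, max_val].
    """
    if len(original) != len(protected) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the two images.
    mse = sum((a - b) ** 2 for a, b in zip(original, protected)) / len(original)
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(max_val ** 2 / mse)
```

Higher values mean the protected image is closer to the original; identical images give an unbounded score, which the sketch reports as `math.inf`.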

Abstract

As vision-language models (VLMs) become widely adopted, VLM-based attribute inference attacks have emerged as a serious privacy concern, enabling adversaries to infer private attributes from images shared on social media. This escalating threat calls for dedicated protection methods to safeguard user privacy. However, existing methods often degrade the visual quality of images or interfere with vision-based functions on social media, thereby failing to achieve a desirable balance between privacy protection and user experience. To address this challenge, we propose a novel protection method that jointly optimizes privacy suppression and utility preservation under a visual consistency constraint. While our method is conceptually effective, fair comparisons between methods remain challenging due to the lack of publicly available evaluation datasets. To fill this gap, we introduce VPI-COCO, a publicly available benchmark comprising 522 images with hierarchically structured privacy questions and corresponding non-private counterparts, enabling fine-grained and joint evaluation of protection methods in terms of privacy preservation and user experience. Building upon this benchmark, experiments on multiple VLMs demonstrate that our method effectively reduces PAR below 25%, keeps NPAR above 88%, maintains high visual consistency, and generalizes well to unseen and paraphrased privacy questions, demonstrating its strong practical applicability for real-world VLM deployments.
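
The idea of jointly optimizing privacy suppression and utility preservation under a visual consistency constraint can be illustrated with a toy sketch. This is not the authors' implementation. The linear scorers `w_priv` and `w_util`, the penalty weight `lam`, the step size, and the L-infinity budget `eps` are all hypothetical stand-ins for a VLM's response to privacy questions, its response to non-privacy questions, and the bound on visible change. The sketch uses projected sign-gradient descent, a common choice for bounded image perturbations, which may differ from the optimizer used in the paper.

```python
def dot(w, x):
    """Inner product of two equally sized sequences."""
    return sum(wi * xi for wi, xi in zip(w, x))


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def protect(x, w_priv, w_util, eps=0.1, step=0.01, iters=50, lam=1.0):
    """Toy joint optimization (hypothetical, for illustration only).

    Minimizes  privacy(x') + lam * (utility(x') - utility(x))**2
    subject to |x' - x|_inf <= eps and x' in [0, 1],
    where privacy and utility are linear scores standing in for a VLM.
    """
    target_util = dot(w_util, x)
    x_adv = list(x)
    for _ in range(iters):
        util_gap = dot(w_util, x_adv) - target_util
        # Gradient of the joint loss with respect to each pixel.
        grad = [wp + 2.0 * lam * util_gap * wu
                for wp, wu in zip(w_priv, w_util)]
        # Signed descent step.
        x_adv = [xa - step * sign(g) for xa, g in zip(x_adv, grad)]
        # Project back into the eps-ball around x and the valid pixel range.
        x_adv = [min(max(xa, xi - eps, 0.0), xi + eps, 1.0)
                 for xa, xi in zip(x_adv, x)]
    return x_adv


# Usage with made-up numbers: four "pixels" and arbitrary scorer weights.
x = [0.5, 0.5, 0.5, 0.5]
w_priv = [1.0, -1.0, 0.5, 0.2]
w_util = [0.3, 0.3, -0.2, 0.1]
x_protected = protect(x, w_priv, w_util)
```

In this toy setting the projection step plays the role of the visual-consistency constraint, while the squared utility gap keeps the stand-in utility score close to its value on the original input.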

Paper Structure

This paper contains 34 sections, 6 equations, 8 figures, 7 tables, 1 algorithm.

Figures (8)

  • Figure 1: Illustration of VLM-based attribute inference attack. Users often share daily-life photos on social media. Attackers can exploit VLMs to infer personal privacy attributes from visual cues, even when such attributes are never explicitly disclosed.
  • Figure 2: Overview of the VPI-COCO dataset construction pipeline. Step 1 selects candidate images from COCO that exhibit social-media characteristics and privacy-inference potential. Step 2 generates two types of questions: non-privacy questions are extracted from the VQA dataset, while privacy questions are generated through a structured pipeline that infers privacy semantic tuples from each image and formulates them into basic and scene-level questions guided by predefined templates.
  • Figure 3: Cross-question transfer results on privacy questions. Our method maintains low PAR across selected, unselected, and paraphrased question sets, demonstrating strong generalization and consistent privacy protection against unseen or rephrased questions.
  • Figure 4: Cross-question transfer on non-privacy questions. Our method keeps NPAR high and stable under all question sets, confirming robust utility preservation even for unseen or paraphrased questions.
  • Figure 5: Cross-model PAR on selected privacy questions.
  • ...and 3 more figures