Private Attribute Inference from Images with Vision-Language Models

Batuhan Tömekçe, Mark Vero, Robin Staab, Martin Vechev

TL;DR

We compile an image dataset with human-annotated labels of the image owner's personal attributes and evaluate VLMs on it. Inference accuracy scales with the general capabilities of the models, implying that future models can be misused as stronger inferential adversaries and establishing an imperative for the development of adequate defenses.

Abstract

As large language models (LLMs) become ubiquitous in our daily tasks and digital interactions, associated privacy risks are increasingly in focus. While LLM privacy research has primarily focused on the leakage of model training data, it has recently been shown that LLMs can make accurate privacy-infringing inferences from previously unseen texts. With the rise of vision-language models (VLMs), capable of understanding both images and text, a key question is whether this concern transfers to the previously unexplored domain of benign images posted online. To answer this question, we compile an image dataset with human-annotated labels of the image owner's personal attributes. In order to understand the privacy risks posed by VLMs beyond traditional human attribute recognition, our dataset consists of images where the inferable private attributes do not stem from direct depictions of humans. On this dataset, we evaluate 7 state-of-the-art VLMs, finding that they can infer various personal attributes at up to 77.6% accuracy. Concerningly, we observe that accuracy scales with the general capabilities of the models, implying that future models can be misused as stronger inferential adversaries, establishing an imperative for the development of adequate defenses.

Paper Structure (65 sections, 5 figures, 11 tables)


Figures (5)

  • Figure 1: Shortened example inference over an image using GPT4-V. The model recognizes the logo of the football team hanging on the wall and infers that the inhabitant of this dorm room is likely from Wisconsin, while also providing adequate reasoning. The person in the picture is occluded.
  • Figure 2: Illustrative example of GPT4-V recognizing that an item that is too small in the current resolution could provide it with more information about the inference task. The model is capable of returning a bounding box that can be used to crop the image before returning it for repeated processing.
  • Figure 3: Our data collection and labeling pipeline. In step 1, we collect images from a carefully selected set of subreddits that may contain images suitable for our task. Then, in step 2, we label the images manually while allowing the labeler to access online search for assistance. Finally, in step 3, we extract the comments of the profile that posted the image and keep only the obtained image labels that are not contradicted by the information contained in the comments. Note that we hide the true information on the tag and report an alternative location in the example.
  • Figure 4: Comparison of the private attribute inference capabilities of all examined models on our collected Vision Inference-Privacy (VIP) dataset. GPT4-V is clearly the strongest model, with an accuracy of 77.6%, while the best open-source model, CogAgent-VQA, achieves 66.4% accuracy.
  • Figure 5: Impact of different prompting strategies on the inference accuracy across attributes for GPT4-V.
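
Step 3 of the labeling pipeline in Figure 3 keeps only the image labels that are not contradicted by information found in the poster's comments. The following is a minimal sketch of that filtering idea, not the authors' implementation. The function name `filter_labels`, the attribute names, the example values, and the case-insensitive exact-match test for a contradiction are all assumptions made for illustration; the captions do not specify how contradictions are detected.

```python
from typing import Dict, Optional


def filter_labels(
    image_labels: Dict[str, str],
    comment_info: Dict[str, Optional[str]],
) -> Dict[str, str]:
    """Keep image labels that comment-derived information does not contradict.

    Hypothetical helper: a label counts as contradicted only when the comments
    provide a value for the same attribute and that value differs
    (compared case-insensitively, which is an assumption of this sketch).
    """
    kept: Dict[str, str] = {}
    for attribute, label in image_labels.items():
        evidence = comment_info.get(attribute)
        # No evidence from the comments means no contradiction, so keep the label.
        if evidence is None or evidence.strip().lower() == label.strip().lower():
            kept[attribute] = label
    return kept


# Hypothetical labels for one image and hypothetical comment-derived values.
labels = {"location": "Wisconsin", "occupation": "student", "sex": "male"}
comments = {"location": "wisconsin", "occupation": "engineer"}

kept = filter_labels(labels, comments)
print(kept)
```

In this sketch the `occupation` label is dropped because the comments disagree with it, while `sex` is kept because the comments say nothing about it. A real pipeline would likely need a softer notion of agreement than string equality.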