Auditing M-LLMs for Privacy Risks: A Synthetic Benchmark and Evaluation Framework

Junhao Li, Jiahao Chen, Zhou Feng, Chunyi Zhou

TL;DR

The paper tackles privacy risks posed by cross-modal inference in multi-modal LLMs (M-LLMs) on social media. It introduces PRISM, a large-scale synthetic, multi-modal benchmark built with a prior-driven profile generator and labeled with 12 private attributes, and evaluates models on it through a three-node Multi-Agent Inference Architecture. Six leading M-LLMs are benchmarked against human performance: the models show strong inference capabilities, visual data dramatically boosts their accuracy, and they often surpass humans in both accuracy and efficiency. The authors discuss defense strategies, arguing for layered defenses that combine user-facing warnings with advanced unlearning methods, and release PRISM as a public resource to spur research on robust defenses against cross-modal privacy leakage.

Abstract

Recent advances in multi-modal Large Language Models (M-LLMs) have demonstrated a powerful ability to synthesize implicit information from disparate sources, including images and text. The abundance of such data on social media also introduces a significant and underexplored privacy risk: the inference of sensitive personal attributes from seemingly ordinary media content. However, the lack of benchmarks and comprehensive evaluations of state-of-the-art M-LLM capabilities hinders research on private attribute profiling on social media. Accordingly, we propose (1) PRISM, the first multi-modal, multi-dimensional and fine-grained synthesized dataset incorporating a comprehensive privacy landscape and dynamic user history; and (2) an efficient evaluation framework that measures the cross-modal privacy inference capabilities of advanced M-LLMs. Specifically, PRISM is a large-scale synthetic benchmark designed to evaluate cross-modal privacy risks. Its key feature is a set of 12 sensitive attribute labels across a diverse set of multi-modal profiles, which enables targeted privacy analysis. These profiles are generated via a sophisticated LLM agentic workflow, governed by a prior distribution to ensure they realistically mimic social media users. Additionally, we propose a Multi-Agent Inference Framework that leverages a pipeline of specialized LLMs to enhance evaluation capabilities. We evaluate the inference capabilities of six leading M-LLMs (Qwen, Gemini, GPT-4o, GLM, Doubao, and Grok) on PRISM. The comparison with human performance reveals that these M-LLMs significantly outperform humans in both accuracy and efficiency, highlighting the threat of potential privacy risks and the urgent need for robust defenses. Dataset available at https://huggingface.co/datasets/xaddh/multimodal-privacy

Paper Structure

This paper contains 18 sections, 3 figures, 2 tables.

Figures (3)

  • Figure 1: Scenario Overview. A photograph of an overseas vacation not only directly exposes the user’s appearance and location, but may also allow personal attributes such as hobbies and income to be inferred.
  • Figure 2: Results from a user study with 569 participants, highlighting public perception of multi-modal privacy risks. Left: User concern levels regarding the leakage of their personal information. Right: User perception of the adequacy of current platform privacy protections.
  • Figure 3: The data generation workflow of PRISM begins by creating a realistic user Profile from a controllable Prior Distribution. Specialized generators then synthesize corresponding Events, Topics, Posts, and a Caption. Finally, an Image Generator renders a high-fidelity image from the caption, creating a complete and contextually consistent multi-modal post.
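
The staged workflow described in the Figure 3 caption can be illustrated with a minimal Python sketch. This is not the authors' implementation: the attribute names, prior probabilities, and template-based generators below are hypothetical stand-ins (the summary does not list the 12 attributes or their distributions, and the paper uses LLM agents and an image model where this sketch uses string templates). It only shows the order of the stages: sample a profile from a prior, then derive an event, topic, post text, caption, and image from it.

```python
import random
from dataclasses import dataclass

# Hypothetical prior: attribute -> {value: probability}.
# The real PRISM attributes and distributions are not given in this summary.
PRIOR = {
    "age_group": {"18-29": 0.4, "30-49": 0.4, "50+": 0.2},
    "income": {"low": 0.3, "medium": 0.5, "high": 0.2},
    "hobby": {"travel": 0.5, "cooking": 0.3, "gaming": 0.2},
}


@dataclass
class Post:
    profile: dict   # ground-truth private attributes (the labels)
    event: str
    topic: str
    text: str
    caption: str
    image: str      # placeholder string standing in for a rendered image


def sample_profile(prior, rng):
    """Draw one value per attribute according to the prior distribution."""
    profile = {}
    for attr, dist in prior.items():
        values = list(dist)
        weights = [dist[v] for v in values]
        profile[attr] = rng.choices(values, weights=weights, k=1)[0]
    return profile


def generate_post(prior, rng):
    """Run the stages in order; each stage is a stub for an LLM-based generator."""
    profile = sample_profile(prior, rng)
    event = f"a {profile['hobby']} outing"            # Event generator (stub)
    topic = f"{profile['hobby']} highlights"          # Topic generator (stub)
    text = f"Sharing some {topic} from {event}."      # Post generator (stub)
    caption = f"Photo of {event}"                     # Caption generator (stub)
    image = f"<image rendered from: {caption}>"       # Image generator (stub)
    return Post(profile, event, topic, text, caption, image)


post = generate_post(PRIOR, random.Random(0))
print(post.profile)
print(post.text)
```

Because the profile is sampled first and every later stage is conditioned on it, each synthetic post carries its attribute labels by construction, which is what makes label-level privacy evaluation possible on such data.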