Explaining models relating objects and privacy

Alessio Xompero, Myriam Bontonou, Jean-Michel Arbona, Emmanouil Benetos, Andrea Cavallaro

TL;DR

This work examines why image privacy classifiers rely on object-level cues by applying post-hoc explainability (Integrated Gradients) to a two-stage privacy framework in which an object detector provides cardinality and confidence features for each concept. A key finding is that the presence and count of people predominantly drive privacy predictions, often at the expense of recognizing other privacy-relevant cues. Based on this, the authors propose two explainable, person-centric strategies that achieve performance comparable to more complex models on the PrivacyAlert dataset. The results underscore the potential of explainable-by-design approaches to guide the development of privacy warnings in image-sharing workflows, and they reveal biases that could be exploited or mitigated in real-world deployments. The work also sets up baselines and methodological directions for benchmarking explainability in privacy models across multiple datasets and explainability techniques.
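
To make the idea of a person-centric strategy concrete, the following is a minimal sketch of two rule-based classifiers driven only by the number of detected people. The summary does not spell out the exact rules, so the function names, the decision rules, and the `max_persons` threshold are all assumptions made for illustration, not the authors' definitions.

```python
def person_presence(num_persons: int) -> str:
    """Hypothetical rule: any detected person makes the image private."""
    return "private" if num_persons >= 1 else "public"


def person_cardinality(num_persons: int, max_persons: int = 2) -> str:
    """Hypothetical rule: a small group is private, a crowd or no person is public.

    max_persons is an illustrative threshold, not a value from the paper.
    """
    return "private" if 1 <= num_persons <= max_persons else "public"


# Person counts as they might come from an object detector (made-up values).
for count in (0, 1, 2, 15):
    print(count, person_presence(count), person_cardinality(count))
```

Rules of this kind are explainable by design: the decision can be read directly from the person count, which is what makes them usable as transparent baselines.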

Abstract

Accurately predicting whether an image is private before sharing it online is difficult due to the vast variety of content and the subjective nature of privacy itself. In this paper, we evaluate privacy models that use objects extracted from an image to determine why the image is predicted as private. To explain the decisions of these models, we use feature attribution to identify and quantify which objects (and which of their features) are more relevant to privacy classification with respect to a reference input (i.e., no objects localised in an image) predicted as public. We show that the presence of the person category and its cardinality are the main factors in the privacy decision. Therefore, these models mostly fail to identify private images depicting documents with sensitive data, vehicle ownership, and internet activity, or public images with people (e.g., an outdoor concert or people walking in a public space next to a famous landmark). As baselines for future benchmarks, we also devise two strategies that are based on person presence and cardinality and achieve classification performance comparable to that of the privacy models.
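
The attribution step described above can be sketched with Integrated Gradients on a toy model. This is an illustration under stated assumptions, not the paper's implementation: the classifier is a hypothetical logistic model, the weights and feature values are invented, and the feature layout (confidence and cardinality for two object categories) is only loosely modelled on the description. The all-zero baseline stands in for the reference input with no objects localised.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def model(x, w, b):
    """Toy privacy score in [0, 1]; higher means more private (assumed model)."""
    return sigmoid(x @ w + b)


def model_grad(x, w, b):
    """Analytic gradient of the logistic model with respect to the input."""
    p = model(x, w, b)
    return p * (1.0 - p) * w


def integrated_gradients(x, baseline, w, b, steps=200):
    """Midpoint Riemann approximation of IG along the straight path
    from the baseline to the input."""
    alphas = (np.arange(steps) + 0.5) / steps
    path = baseline + alphas[:, None] * (x - baseline)  # shape: (steps, d)
    grads = np.stack([model_grad(p, w, b) for p in path])
    return (x - baseline) * grads.mean(axis=0)


# Hypothetical features: [person_conf, person_card, car_conf, car_card]
w = np.array([1.5, 0.8, -0.3, -0.1])  # invented weights
b = -1.0                              # baseline input is scored below 0.5 (public)
x = np.array([0.9, 2.0, 0.7, 1.0])    # invented detector output
baseline = np.zeros_like(x)           # reference: no objects localised

attributions = integrated_gradients(x, baseline, w, b)
print(attributions)
# Completeness: attributions sum (approximately) to the change in model output.
print(attributions.sum(), model(x, w, b) - model(baseline, w, b))
```

In this toy setting, positive attributions push the score towards private and negative ones towards public, relative to the empty-image reference.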

Paper Structure (11 sections, 1 equation, 3 figures, 1 table)

Figures (3)

  • Figure 1: Two-stage privacy method: a pre-trained object detector identifies concepts (e.g., objects, scene type) within an image, and a privacy model is trained to classify the image as private or public, considering the cardinality and confidence level of the extracted objects (numbers below each object). The input image is from the PrivacyAlert dataset [Zhao2022ICWSM_PrivacyAlert], with obfuscation added on the face of the person.
  • Figure 2: Sample of training images from PrivacyAlert [Zhao2022ICWSM_PrivacyAlert] correctly predicted as private (first row) and incorrectly predicted as private (fourth row) by the graph-agnostic baseline [Dwivedi2023JMLR], with their extracted objects and features (blue bar plots) and the explanation scores (red bar plots) of Integrated Gradients (IG) [sundararajan2017axiomatic]. Darker colours (left bar) are associated with the confidence feature and lighter colours (right bar) with cardinality. Positive IG scores support the private decision, whereas negative IG scores support the public decision. Note the different maximum limit for the y-axis in the top-right blue bar plot (fifth column, second row).
  • Figure 3: Comparison of the explainability scores across images from the PrivacyAlert training set [Zhao2022ICWSM_PrivacyAlert] that are correctly classified as private by the graph-agnostic (GA-MLP) and MLP models. Only the top 5 objects, ranked by largest mean absolute explainability score, are shown. The colour of each data point represents the value of the object feature; note the different limits of the colour bars.