Explaining models relating objects and privacy
Alessio Xompero, Myriam Bontonou, Jean-Michel Arbona, Emmanouil Benetos, Andrea Cavallaro
TL;DR
This work examines why object-based image privacy classifiers predict an image as private by applying post-hoc explainability (Integrated Gradients) to a two-stage privacy framework in which an object detector provides cardinality and confidence features for each concept. A key finding is that the presence and count of people predominantly drive privacy predictions, often at the expense of other privacy-relevant cues; based on this, the authors propose two explainable, person-centric strategies that achieve performance comparable to that of more complex models on the PrivacyAlert dataset. The results underscore the potential of explainable-by-design approaches to guide the development of privacy warnings in image-sharing workflows and reveal biases that could be exploited or mitigated in real-world deployments. The work also sets up baselines and methodological directions for benchmarking explainability in privacy models across multiple datasets and explainability techniques.
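To make the attribution step concrete, the following is a minimal sketch of Integrated Gradients over per-concept cardinality and confidence features. It is not the authors' model: the logistic classifier, its weights, the three concepts, and the all-zero baseline standing in for "no objects detected" are assumptions chosen purely for illustration. Because the weights are hand-picked, the fact that the person features receive the largest attribution here is an artefact of the toy setup and not evidence for the reported finding.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict_private(x, w, b):
    """Toy stand-in classifier: probability that the image is private."""
    return sigmoid(x @ w + b)

def grad_predict(x, w, b):
    """Analytic gradient of the toy classifier with respect to its input."""
    p = predict_private(x, w, b)
    return p * (1.0 - p) * w

def integrated_gradients(x, baseline, w, b, steps=200):
    """Approximate Integrated Gradients with a midpoint Riemann sum.

    attribution_i = (x_i - baseline_i) * mean over alpha of
                    d f / d x_i evaluated at baseline + alpha * (x - baseline)
    """
    alphas = (np.arange(steps) + 0.5) / steps
    delta = x - baseline
    grads = np.stack([grad_predict(baseline + a * delta, w, b) for a in alphas])
    return delta * grads.mean(axis=0)

# Hypothetical feature layout: [cardinality, confidence] for each concept.
concepts = ["person", "car", "laptop"]
feature_names = [f"{c}_{kind}" for c in concepts for kind in ("cardinality", "confidence")]

# Hypothetical weights and bias; a real model would be trained on data.
w = np.array([1.5, 0.8, 0.2, 0.1, 0.3, 0.1])
b = -2.0

# Example image: two people (confidence 0.9), one car (confidence 0.7), no laptop.
x = np.array([2.0, 0.9, 1.0, 0.7, 0.0, 0.0])
# Reference input: no objects localised in the image.
baseline = np.zeros_like(x)

attr = integrated_gradients(x, baseline, w, b)
for name, value in zip(feature_names, attr):
    print(f"{name:20s} {value:+.4f}")

# Completeness check: attributions should sum to f(x) - f(baseline).
gap = attr.sum() - (predict_private(x, w, b) - predict_private(baseline, w, b))
print("completeness gap:", abs(gap))
```

The completeness check at the end is a useful sanity test for any Integrated Gradients implementation: if the attributions do not sum to the difference between the prediction for the input and for the reference, the number of integration steps is likely too small.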
Abstract
Accurately predicting whether an image is private before sharing it online is difficult due to the vast variety of content and the subjective nature of privacy itself. In this paper, we evaluate privacy models that use objects extracted from an image to determine why the image is predicted as private. To explain the decisions of these models, we use feature attribution to identify and quantify which objects (and which of their features) are more relevant to privacy classification with respect to a reference input (i.e., no objects localised in an image) predicted as public. We show that the presence of the person category and its cardinality are the main factors in the privacy decision. As a result, these models mostly fail to identify private images depicting documents with sensitive data, vehicle ownership, and internet activity, or public images with people (e.g., an outdoor concert or people walking in a public space next to a famous landmark). As baselines for future benchmarks, we also devise two strategies that are based on person presence and cardinality and achieve classification performance comparable to that of the privacy models.
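As an illustration of what person-centric baselines of this kind could look like, the sketch below implements two rules over the labels returned by an object detector. The abstract does not spell out the exact rules, so both functions, the `"person"` label string, and the `max_people` threshold are hypothetical choices and should not be read as the authors' definitions.

```python
from typing import Sequence

PERSON = "person"  # assumed detector label for the person category

def count_people(labels: Sequence[str]) -> int:
    """Number of detected objects labelled as a person."""
    return sum(1 for label in labels if label == PERSON)

def private_if_person_present(labels: Sequence[str]) -> bool:
    """Hypothetical rule 1: private if at least one person is detected."""
    return count_people(labels) >= 1

def private_if_few_people(labels: Sequence[str], max_people: int = 2) -> bool:
    """Hypothetical rule 2: private only if a small number of people is detected.

    The threshold max_people is an illustrative assumption.
    """
    return 1 <= count_people(labels) <= max_people

# Hypothetical detector outputs for three images.
portrait = ["person", "chair"]
concert = ["person"] * 12 + ["guitar"]
scanned_document = ["book", "laptop"]

for name, labels in [("portrait", portrait),
                     ("concert", concert),
                     ("scanned_document", scanned_document)]:
    print(name, private_if_person_present(labels), private_if_few_people(labels))
```

Both rules label the scanned document as public because no person is detected, which mirrors the failure mode described above for private images without people. The cardinality rule treats the crowded concert as public while the presence rule does not, which shows how the two variants can disagree on public images with many people.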
