Everyone's Privacy Matters! An Analysis of Privacy Leakage from Real-World Facial Images on Twitter and Associated User Behaviors

Yuqi Niu; Weidong Qiu; Peng Tang; Lifan Wang; Shuo Chen; Shujun Li; Nadin Kokciyan; Ben Niu

TL;DR

This paper tackles face privacy leakage on online social networks (OSNs) by building a novel bystander-subject classifier trained on face-based features and by deploying a semi-automated framework for large-scale analysis. Using 27,800 Twitter images from 6,423 users, the authors benchmark the classifier against two prior state-of-the-art methods and show that it remains robust on both OSN and non-OSN images. The study reports eight key findings about uploader behaviors, anonymization practices, and potential leakage of sensitive social attributes, underscoring the need for privacy-aware tools on platforms and greater awareness among users. In practice, the work enables OSNs to deploy automated bystander detection and motivates both policy and tool development to protect non-consenting individuals in shared images.

Abstract

Online users often post facial images of themselves and other people on online social networks (OSNs) and other Web 2.0 platforms, which can lead to privacy leakage for the people whose faces appear in such images. There is limited research on understanding face privacy in social media while considering user behavior. It is crucial to consider the privacy of subjects and bystanders separately. This calls for the development of privacy-aware face detection classifiers that can distinguish between subjects and bystanders automatically. This paper introduces such a classifier trained on face-based features, which outperforms the two state-of-the-art methods by a significant margin (by 13.1% and 3.1% for OSN images, and by 17.9% and 5.9% for non-OSN images). We developed a semi-automated framework for conducting a large-scale analysis of the face privacy problem by using our novel bystander-subject classifier. We collected 27,800 images, each including at least one face, shared by 6,423 Twitter users. We then applied our framework to analyze this dataset thoroughly. Our analysis reveals eight key findings on different aspects of Twitter users' real-world behaviors regarding face privacy, and we provide quantitative and qualitative results to better explain these findings. We share the practical implications of our study to empower online platforms and users in addressing the face privacy problem efficiently.

Paper Structure (48 sections, 1 equation, 8 figures, 15 tables)

Introduction
Related Work
Automatic Bystander Detection in Images
Image Privacy Protection Solutions
Privacy-related Analysis on OSNs
Understanding Face Privacy in OSNs
Definitions: Subjects and Bystanders
Face Privacy Issues of Bystanders and Subjects
Uploaders' Behaviors about Face Privacy
Advancing the State-of-the-Art
Determining Subjects
Face Privacy Datasets
Subjects and Bystanders Classification
Methodology
Face Detection
...and 33 more sections

Figures (8)

Figure 1: Examples of user anonymized, partially anonymized, and fully anonymized faces.
Figure 2: An example image showing how faces are assigned to one of the 9 regions and how the number of faces in each region is calculated. The two subjects (highlighted in green boxes), who are clearly posing for the camera, are located in Region 2. The bystander (highlighted in a red box), who shows no indication of willingness to participate in the filming based on visual cues, is located in Region 3. Regions 1, 4, 5, 6, 7, 8, and 9 each have a face count of 0, while Region 2 has a face count of 2, and Region 3 has a face count of 1.
Figure 3: An overview of our proposed bystander-subject classifier.
Figure 4: Some example images in Dataset 1.
Figure 5: Examples of correct and incorrect classification results: red boxes -- correctly classified bystanders; green boxes -- correctly classified subjects; yellow boxes -- subjects that are misclassified as bystanders; and blue boxes -- bystanders that are misclassified as subjects.
...and 3 more figures
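
The per-region face count described in the Figure 2 caption can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: it assumes the 9 regions form a 3x3 grid numbered row by row from the top-left, that each face is assigned to the region containing the center of its bounding box, and that boxes are given as `(x0, y0, x1, y1)` pixel coordinates. The function names and the example coordinates are hypothetical.

```python
from collections import Counter

def region_of(box, img_w, img_h):
    """Return the 1-based region index of a face box.

    Assumption: a 3x3 grid numbered row-major (1..3 top row, 7..9 bottom row),
    with the face assigned by the center of its bounding box.
    """
    x0, y0, x1, y1 = box
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    # Clamp to 2 so a center lying exactly on the right/bottom edge stays in the grid.
    col = min(int(3 * cx / img_w), 2)
    row = min(int(3 * cy / img_h), 2)
    return row * 3 + col + 1

def faces_per_region(boxes, img_w, img_h):
    """Count faces in each of the 9 regions; empty regions get a count of 0."""
    counts = Counter(region_of(b, img_w, img_h) for b in boxes)
    return {region: counts.get(region, 0) for region in range(1, 10)}

# Hypothetical boxes mimicking the Figure 2 layout in a 900x600 image:
# two subjects near the top center and one bystander near the top right.
example_boxes = [(320, 50, 400, 150), (450, 60, 530, 160), (700, 80, 760, 150)]
example_counts = faces_per_region(example_boxes, 900, 600)
```

With these made-up coordinates, `example_counts` has 2 faces in Region 2, 1 face in Region 3, and 0 elsewhere, matching the counts stated in the caption. A different numbering scheme or assignment rule (for example, by box overlap rather than center) would change the result.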
