Setting the Course, but Forgetting to Steer: Analyzing Compliance with GDPR's Right of Access to Data by Instagram, TikTok, and YouTube

Sai Keerthana Karnam, Abhisek Dash, Antariksh Das, Sepehr Mousavi, Stefan Bechtold, Krishna P. Gummadi, Animesh Mukherjee, Ingmar Weber, Savvas Zannettou

TL;DR

This paper conducts a systematic audit of GDPR Article 15 implementations for the Right of Access across TikTok, Instagram, and YouTube, revealing significant inconsistencies in data shared, including missing purpose, retention, and third-party disclosure. It assesses reliability using sock-puppet and real-user data, finding TikTok most complete and consistent while Instagram and YouTube show notable gaps. It also evaluates comprehensibility via a 400-person EU survey, uncovering tensions between conciseness and transparency and identifying Instagram as relatively more comprehensible. To address these issues, the authors propose a two-layer approach: AI-assisted re-representation of DDP content and a user-centric browser dashboard (Know Your Data) to customize transparency levels. The work highlights regulatory gaps and practical pathways for improving GDPR compliance and user understanding, with broader implications for standardization and enforcement.

Abstract

The GDPR's Right of Access aims to empower users with control over their personal data via Data Download Packages (DDPs). However, their effectiveness is often compromised by inconsistent platform implementations, questionable data reliability, and poor user comprehensibility. This paper conducts a comprehensive audit of DDPs from three social media platforms (TikTok, Instagram, and YouTube) to systematically assess these critical drawbacks. Despite offering similar services, we find that these platforms demonstrate significant inconsistencies in implementing the Right of Access, evident in varying levels of shared data. Critically, the failure to disclose processing purposes, retention periods, and other third-party data recipients serves as a further indicator of non-compliance. Our reliability evaluations, using bots and user-donated data, reveal that while TikTok's DDPs offer more consistent and complete data, others exhibit notable shortcomings. Similarly, our assessment of comprehensibility, based on surveys with 400 participants, indicates that current DDPs substantially fall short of GDPR's standards. To improve the comprehensibility, we propose and demonstrate a two-layered approach by: (1) enhancing the data representation itself using stakeholder interpretations; and (2) incorporating a user-friendly extension (Know Your Data) for intuitive data visualization where users can control the level of transparency they prefer. Our findings underscore the need for clearer and non-conflicting regulatory guidance, stricter enforcement, and platform commitment to realize the goal of GDPR's Right of Access.

Paper Structure

This paper contains 28 sections, 12 figures, 8 tables.

Figures (12)

  • Figure 1: Pipeline to evaluate comprehensibility and reliability of the implementation of Article 15(3) of the GDPR.
  • Figure 2: Ratio of entries retained across multiple snapshots for the same account: Instagram (like history), TikTok (video browsing history and like activities), and YouTube (video browsing history).
  • Figure 3: Plots illustrating the CDF of data duration provided to users for two activities -- browse and search history. On Instagram, clusters in browsing history are observed at 6 and 13 days, while on TikTok, clusters appear at 180 and 455 days for both search and browse history.
  • Figure 4: Plots showing the percentage of users with different browsing history durations. For Instagram, users within the same country or region were either given one week or two weeks of data. TikTok, by contrast, exhibits greater variability in the EU whereas in the USA, users were provided with the same duration of data.
  • Figure 5: How the watch history from the three platforms was displayed to users for comparison and for evaluating the four properties: conciseness, clear and plain language, intelligibility, and transparency.
  • ...and 7 more figures
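
The retention ratio in the Figure 2 caption can be read as the share of entries from an earlier snapshot that still appear in a later snapshot of the same account. The sketch below illustrates that reading only. It assumes each entry can be reduced to a hashable key such as a (timestamp, item id) pair; the function name and the sample data are hypothetical and are not taken from the paper.

```python
from typing import Hashable, Iterable


def retention_ratio(earlier: Iterable[Hashable], later: Iterable[Hashable]) -> float:
    """Fraction of distinct entries in `earlier` that also occur in `later`.

    Assumption: an entry is identified by a hashable key, for example a
    (timestamp, item_id) tuple taken from a DDP activity log.
    """
    earlier_keys = set(earlier)
    if not earlier_keys:
        raise ValueError("earlier snapshot has no entries to compare")
    later_keys = set(later)
    # Entries present in both snapshots count as retained.
    return len(earlier_keys & later_keys) / len(earlier_keys)


# Hypothetical like-history entries from two snapshots of one account.
snapshot_1 = [("2024-01-01", "v1"), ("2024-01-02", "v2"),
              ("2024-01-03", "v3"), ("2024-01-04", "v4")]
snapshot_2 = [("2024-01-01", "v1"), ("2024-01-02", "v2"),
              ("2024-01-09", "v5")]

ratio = retention_ratio(snapshot_1, snapshot_2)
print(ratio)
```

A ratio below 1 under this reading means that some entries reported in the earlier package are absent from the later one. A fuller comparison would probably restrict both snapshots to the time window they share, which this sketch does not do.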