I Know What You Bought Last Summer: Investigating User Data Leakage in E-Commerce Platforms

Ioannis Vlachogiannakis, Emmanouil Papadogiannakis, Panagiotis Papadopoulos, Nicolas Kourtellis, Evangelos Markatos

TL;DR

This paper addresses privacy risks in e-commerce by mapping how user data leaks to third parties and can be aggregated across multiple platforms. It uses a two-phase approach with a semi-automated Playwright-based crawler to collect network traffic and cookies from 200 e-shops, followed by leakage analysis that traces data flows to third parties and identifiers, including hashed forms. Key findings show that roughly 29–30% of sites leak at least one piece of sensitive information to external entities, with Meta/Facebook receiving substantial data and the potential to build comprehensive user profiles after engagement with as few as five shops. The work highlights the scale and mechanism of cross-site tracking, arguing for greater transparency and privacy protections, and provides public tools and datasets to advance further research in e-commerce privacy.
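The leakage-analysis phase described above, which looks for personal data in outgoing traffic even when it is hashed, could be approximated along the following lines. This is a minimal sketch and not the authors' implementation: the set of encodings (plain text, MD5, SHA-1, SHA-256, Base64), the normalization step, and the names `candidate_tokens` and `find_leaks` are all assumptions made for illustration.

```python
import base64
import hashlib
from urllib.parse import unquote


def candidate_tokens(value: str) -> dict[str, str]:
    """Return plain and encoded variants of one personal-data value.

    Assumption: trackers normalize (trim + lowercase) before hashing.
    The list of encodings is illustrative, not taken from the paper.
    """
    normalized = value.strip().lower()
    raw = normalized.encode("utf-8")
    return {
        "plain": normalized,
        "md5": hashlib.md5(raw).hexdigest(),
        "sha1": hashlib.sha1(raw).hexdigest(),
        "sha256": hashlib.sha256(raw).hexdigest(),
        "base64": base64.b64encode(raw).decode("ascii"),
    }


def find_leaks(request_text: str, pii: dict[str, str]) -> list[tuple[str, str]]:
    """List (field, encoding) pairs whose token appears in a request.

    `request_text` is a captured URL or request body; `pii` maps a
    hypothetical field name (e.g. "email") to the value typed by the user.
    """
    decoded = unquote(request_text)   # undo percent-encoding
    lowered = decoded.lower()         # hex digests and plain text: case-insensitive
    hits = []
    for field, value in pii.items():
        for encoding, token in candidate_tokens(value).items():
            # Base64 is case-sensitive, so match it against the original text.
            haystack = decoded if encoding == "base64" else lowered
            if token in haystack:
                hits.append((field, encoding))
    return hits
```

Applied to every third-party request recorded by the crawler, a check like this flags which fields reach which external host. Substring matching is deliberately simple here; short values such as a first name would need stricter matching to avoid false positives.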

Abstract

In the digital age, e-commerce has transformed the way consumers shop, offering convenience and accessibility. Nevertheless, concerns about the privacy and security of personal information shared on these platforms have risen. In this work, we investigate user privacy violations, noting the risks of data leakage to third-party entities. Utilizing a semi-automated data collection approach, we examine a selection of popular online e-shops, revealing that nearly 30% of them violate user privacy by disclosing personal information to third parties. We unveil how minimal user interaction across multiple e-commerce websites can result in a comprehensive privacy breach. We observe significant data-sharing patterns with platforms like Facebook, which use personal information to build user profiles and link them to social media accounts.

Paper Structure

This paper contains 13 sections, 5 figures, 2 tables.

Figures (5)

  • Figure 1: Overview of methodology for detecting personal information leakage.
  • Figure 2: Number of e-shops leaking sensitive personal information and number of third parties collecting this information from different e-commerce platforms.
  • Figure 3: Distribution of monthly visits (both desktop and mobile) of e-commerce websites.
  • Figure 4: Complete exposure of a user's personal information when visiting as few as 5 e-shop platforms.
  • Figure 5: Information flow of sensitive personal information that e-commerce platforms distribute to third-party entities. A greater flow weight indicates that a third party receives information from multiple online stores.
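The cumulative exposure that Figure 4 refers to can be illustrated with a small set-union computation. The sketch below is an assumption-laden illustration rather than the paper's method: the function name `shops_until_full_exposure`, the field names, and the example visit data are all hypothetical.

```python
def shops_until_full_exposure(visits, all_fields, third_party):
    """Count how many visited shops it takes for one third party to
    have received every field in `all_fields`.

    `visits` is an ordered list of (shop, leaks) pairs, where `leaks`
    maps a third-party name to the set of fields that shop sent to it.
    Returns None if the third party never collects the full set.
    """
    seen = set()
    for count, (_shop, leaks) in enumerate(visits, start=1):
        seen |= leaks.get(third_party, set())  # accumulate across shops
        if all_fields <= seen:                 # every field now known
            return count
    return None


# Hypothetical data: each shop leaks only part of the profile.
visits = [
    ("shop-a", {"tracker-x": {"email"}}),
    ("shop-b", {"tracker-x": {"name"}, "tracker-y": {"email"}}),
    ("shop-c", {"tracker-x": {"phone", "address"}}),
]
fields = {"email", "name", "phone", "address"}
print(shops_until_full_exposure(visits, fields, "tracker-x"))
```

No single shop in this example discloses the whole profile, yet the same third party reconstructs it after three visits, which is the aggregation effect the figure describes.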