Why You've Got Mail: Evaluating Inbox Privacy Implications of Email Marketing Practices in Online Apps and Services

Scott Seidenberger, Oluwasijibomi Ajisegiri, Noah Pursell, Fazil Raja, Anindya Maiti

TL;DR

The paper investigates inbox privacy by auditing emails received over 361 days after signing up for the top 150 online services and apps. It introduces a scalable data-collection and classification framework that uses SPF/DKIM checks, ASN-based provenance, temporal and clustering analyses, and LLM-based content labeling to characterize email marketing practices. Key findings: no unknown third-party spam was detected; internal and authorized third-party emails were pervasive; and email volume follows a Pareto concentration, in which a small subset of providers accounts for most of the volume. Sectors also differ in how they balance promotional, CRM, and alert communications. The work offers actionable insights for policy and industry, and demonstrates a scalable approach for ongoing monitoring of inbox privacy and the data-sharing practices behind email marketing.
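
To make the Pareto concentration concrete, the following is a minimal sketch of how such a concentration could be measured from a list of sender domains, one entry per received email. It is not the authors' implementation: the function name `pareto_head`, the 80% threshold, and the sample domains are assumptions chosen for illustration only.

```python
from collections import Counter


def pareto_head(sender_per_email, share=0.8):
    """Return the smallest group of top senders whose emails cover at least
    `share` of the total volume, plus the fraction that group covers.

    `sender_per_email` is an iterable with one sender label per email
    (hypothetical input format; the paper does not specify one).
    """
    counts = Counter(sender_per_email)
    total = sum(counts.values())
    head, covered = [], 0
    # Walk senders from highest to lowest volume until the target is reached.
    for sender, n in counts.most_common():
        if covered >= share * total:
            break
        head.append(sender)
        covered += n
    fraction = covered / total if total else 0.0
    return head, fraction


# Illustrative data only: 10 emails from three made-up senders.
emails = ["a.example"] * 6 + ["b.example"] * 3 + ["c.example"] * 1
head, fraction = pareto_head(emails, share=0.8)
print(head, fraction)
```

In this toy input, two of the three senders are enough to pass the 80% mark, which is the kind of skew a Pareto diagram of email volume is meant to expose.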

Abstract

This study explores the widespread perception that personal data, such as email addresses, may be shared or sold without informed user consent, and investigates whether these concerns are reflected in the actual practices of popular online services and apps. Over the course of a year, we collected and analyzed the source, volume, frequency, and content of emails received by users after signing up for the 150 most popular online services and apps across various sectors. By examining patterns in email communications, we aim to identify consistent strategies used across industries, including potential signs of third-party data sharing. This analysis provides a critical evaluation of how email marketing tactics may intersect with data-sharing practices, with important implications for consumer privacy and regulatory oversight. Our findings, from a study conducted after CCPA and GDPR took effect, indicate that while no unknown third-party spam email was detected, internal and authorized third-party email marketing practices were pervasive, with companies frequently sending promotional and CRM emails despite opt-out preferences. The framework established in this work is designed to be scalable, allowing for continuous monitoring, and can be extended to a more diverse set of apps and services for broader analysis, ultimately contributing to transparency in email address privacy practices.

Paper Structure

This paper contains 16 sections, 7 figures, 1 table.

Figures (7)

  • Figure 1: Framework for email collection, processing, and classification to analyze inbox privacy across services.
  • Figure 2: Pareto diagram of total email volume.
  • Figure 3: Sankey diagram of the top 50 flows.
  • Figure 4: Hierarchical treemap of community-reported spam on sender IPs, organized by ASN and associated company domains. Lighter shades indicate higher counts of spam reports; darker shades indicate lower counts.
  • Figure 5: Additive time series decomposition of aggregate emails received.
  • ...and 2 more figures