ConsentDiff at Scale: Longitudinal Audits of Web Privacy Policy Changes and UI Frictions

Haoze Guo

TL;DR

ConsentDiff provides a longitudinal, web-scale framework to jointly audit privacy policy text and consent UI patterns, enabling clause-level churn analysis and a formal claim-UI alignment score $A$ that can be compared across regions and verticals. By taking monthly snapshots, semantically aligning policy clauses, and classifying UI patterns from DOM and screenshots, it links textual claims (e.g., opt-in, easy reject) to observable UI predicates and detects shifts over time. The study reports persistent policy churn, systematic banner changes that reduce friction, and higher alignment where rejecting is visibly supported and takes few steps, with evident regional and vertical differences. The work delivers a reproducible methodology and releases data and code to support regulatory benchmarking and CMP improvement, while outlining avenues for larger multimodal classifiers and causal analyses of enforcement effects.

Abstract

Web privacy is experienced via two public artifacts: what sites say in their policy texts, and the actions users are required to take in consent interfaces. Despite extensive cross-sectional audits, longitudinal evidence is lacking on how these artifacts change together and whether interfaces actually do what the policies promise. ConsentDiff provides that longitudinal view. We build a reproducible pipeline that snapshots sites every month, semantically aligns policy clauses to track clause-level churn, and classifies consent-UI patterns by combining DOM signals with cues from screenshots. We introduce a novel weighted claim-UI alignment score that connects common policy claims to observable predicates and enables comparisons over time, regions, and verticals. Our measurements suggest continued policy churn, systematic shifts away from higher-friction banner designs, and significantly higher alignment where rejecting is visible and low-friction.
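
To make the idea of a weighted claim-UI alignment score concrete, the following is a minimal sketch and not the paper's definition of $A$. It assumes the score is the weight-normalized fraction of policy claims whose associated UI predicate holds in a snapshot. The claim names, weights, predicates, and observation fields (`pre_consent_trackers`, `reject_visible`, `steps_to_reject`) are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Mapping, Optional

# One snapshot's observed consent-UI facts (field names are hypothetical).
UIObservation = Mapping[str, object]


@dataclass(frozen=True)
class ClaimCheck:
    """Pairs a policy claim with a weight and an observable UI predicate."""
    claim: str
    weight: float
    predicate: Callable[[UIObservation], bool]


def alignment_score(
    claims_present: Iterable[str],
    checks: Iterable[ClaimCheck],
    ui: UIObservation,
) -> Optional[float]:
    """Weighted share of the policy's claims that the UI satisfies.

    Assumption: a normalized weighted fraction in [0, 1]; returns None
    when the policy makes none of the checked claims.
    """
    present = set(claims_present)
    applicable = [c for c in checks if c.claim in present]
    total = sum(c.weight for c in applicable)
    if total == 0:
        return None
    satisfied = sum(c.weight for c in applicable if c.predicate(ui))
    return satisfied / total


# Hypothetical claim-to-predicate mapping with illustrative weights.
CHECKS = [
    ClaimCheck("opt_in", 2.0, lambda ui: ui["pre_consent_trackers"] == 0),
    ClaimCheck(
        "easy_reject",
        1.0,
        lambda ui: bool(ui["reject_visible"]) and ui["steps_to_reject"] <= 1,
    ),
]

# Example snapshot: no trackers before consent, but rejecting is buried.
snapshot = {"pre_consent_trackers": 0, "reject_visible": False, "steps_to_reject": 3}
score = alignment_score({"opt_in", "easy_reject"}, CHECKS, snapshot)
```

In this example the opt-in claim is satisfied and the easy-reject claim is not, so the score is the opt-in weight divided by the total weight. Computing the same quantity for each monthly snapshot would give a per-site series that can be compared over time, under the stated assumptions.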
