Why am I seeing this: Democratizing End User Auditing for Online Content Recommendations
Chaoran Chen, Leyang Li, Luke Cao, Yanfang Ye, Tianshi Li, Yaxing Yao, Toby Jia-jun Li
TL;DR
Personalized recommendations rely on private user data, yet users struggle to verify how their attributes influence the content they are shown. The authors introduce a Privacy Auditing Sandbox that uses LLM-generated personas and controlled attribute variation to test causal links between user characteristics and online content, demonstrated in a targeted-ad case study. Technical evaluations show strong persona quality, high ad-identification accuracy, and stable ad-rating scores, while the user study confirms usability and perceived empowerment in privacy auditing. The approach advances end-user agency and privacy literacy and offers a path toward end-user algorithm auditing in other privacy-sensitive domains.
Abstract
Personalized recommendation systems tailor content based on user attributes, which are either provided or inferred from private data. Research suggests that users often hypothesize about the reasons behind the content they encounter (e.g., "I see this jewelry ad because I am a woman"), but they lack the means to confirm these hypotheses because these systems are opaque. This hinders informed decision-making about privacy and system use and contributes to the lack of algorithmic accountability. To address these challenges, we introduce a new interactive sandbox approach. This approach creates sets of synthetic user personas and corresponding personal data that embody realistic variations in personal attributes, allowing users to test their hypotheses by observing how a website's algorithms respond to these personas. We tested the sandbox in the context of targeted advertising. Our user study demonstrates its usability, usefulness, and effectiveness in empowering end-user auditing in this setting.
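
The idea of controlled attribute variation can be illustrated with a minimal sketch. It is not the authors' implementation: the `Persona` fields, `vary_attribute`, and `ad_differences` are hypothetical names chosen for illustration, and the sketch assumes that the ad categories each persona sees have already been collected by some other means.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Persona:
    # Hypothetical attribute set; the real sandbox may track different fields.
    name: str
    gender: str
    age: int
    location: str


def vary_attribute(base, attribute, values):
    """Return copies of `base` that differ only in `attribute`."""
    return [replace(base, **{attribute: value}) for value in values]


def ad_differences(shown):
    """Given {attribute value: set of ad categories seen}, return for each
    value the categories that were not seen by every persona."""
    common = set.intersection(*shown.values())
    return {value: ads - common for value, ads in shown.items()}


# Hold every attribute fixed except the one under test.
base = Persona(name="persona-0", gender="woman", age=30, location="Chicago")
variants = vary_attribute(base, "gender", ["woman", "man"])
```

Because the variants are identical apart from one attribute, any categories that `ad_differences` reports for only one variant are candidates for being driven by that attribute, subject to repeated trials to rule out noise.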
