An Empathy-Based Sandbox Approach to Bridge the Privacy Gap among Attitudes, Goals, Knowledge, and Behaviors

Chaoran Chen, Weijun Li, Wenxin Song, Yanfang Ye, Yaxing Yao, Toby Jia-jun Li

TL;DR

This paper tackles the persistent privacy attitude-behavior gap by introducing an empathy-based sandbox that uses artificially generated personas with synthesized, longitudinal data. It combines top-down privacy literacy with experiential, bottom-up learning: users interact with online services as these personas and observe how privacy attributes influence system outcomes such as targeted ads. A novel generation pipeline augments LLM outputs via few-shot learning, contextualization, and chain-of-thought reasoning to create realistic personas and data, which then stand in for real user data in a controlled environment. In a 15-participant study with a proof-of-concept Privacy Sandbox, participants showed cognitive and emotional empathy toward the personas and observed links between persona privacy attributes and system outcomes; the authors also offer design implications for privacy education and literacy. The work highlights both the potential of experiential privacy learning and the need to address LLM biases, ethical considerations, and generalizability to broader downstream tasks.

Abstract

Managing privacy to reach privacy goals is challenging, as evidenced by the privacy attitude-behavior gap. Mitigating this discrepancy requires solutions that account for both system opaqueness and users' hesitation to test different privacy settings for fear of unintended data exposure. We introduce an empathy-based approach that allows users to experience how privacy attributes may alter system outcomes in a risk-free sandbox environment from the perspective of artificially generated personas. To generate realistic personas, we introduce a novel pipeline that augments the outputs of large language models (e.g., GPT-4) using few-shot learning, contextualization, and chain-of-thought reasoning. Our empirical studies demonstrated the adequate quality of the generated personas and highlighted the changes in privacy-related applications (e.g., online advertising) caused by different personas. Furthermore, users demonstrated cognitive and emotional empathy towards the personas when interacting with our sandbox. We offer design implications for downstream applications in improving user privacy literacy.

Paper Structure (57 sections, 1 equation, 11 figures, 4 tables)

Figures (11)

  • Figure 1: An empathy-based approach where users interact with online services with different personas in a risk-free sandbox without leaking their real personal data. Users can observe and experience the causal effect between their privacy configurations/behaviors and system outcomes, acquire privacy knowledge, and translate the knowledge into actual behavior. Note that this is a general framework. We leave the studies of changes in privacy behaviors as future work.
  • Figure 2: An empathy-based approach where users interact with online services by using the identity of different personas in a risk-free sandbox without leaking their real personal data. Users can cognitively and emotionally empathize with personas, observe and experience the causal effect between the privacy data and system outcomes (e.g., targeted ads), and acquire privacy knowledge.
  • Figure 3: The generation pipeline of profile portrait images.
  • Figure 4: Privacy Sandbox user journey. (a) Providing guidance for persona profile generation: the user's initial input acts as a seed for persona creation, exemplified by Bob's specific professional and personal interests. (b) Initial persona profile generation and customization: creation of a preliminary persona, "Carlos Rodriguez", which users can review and modify. (c) Generating additional privacy data aligned with the profile: extension of the persona's attributes, ensuring alignment with the initial profile.
  • Figure 6: Means and standard errors of each measure in three conditions: GPT-generated persona, our generated persona, and real persona. All items are measured by user ratings on a 5-point Likert scale.
  • ...and 6 more figures