Rescriber: Smaller-LLM-Powered User-Led Data Minimization for LLM-Based Chatbots
Jijie Zhou, Eryue Xu, Yaoyao Wu, Tianshi Li
TL;DR
Rescriber addresses privacy in LLM-based chatbots by enabling user-led data minimization through a browser extension. It supports two back-ends, a smaller on-device Llama3-8B model and a cloud-based GPT-4o, providing real-time PII detection, redaction, and abstraction with write-back to preserve utility. In a mixed-methods study with 12 participants, Rescriber reduced unnecessary disclosure and improved perceived privacy protection, with detection completeness and consistent sanitization identified as key trust factors. The work demonstrates the feasibility of smaller-LLM-powered, user-facing privacy controls as a practical, trust-enhancing approach to privacy in AI-assisted conversations.
Abstract
The proliferation of LLM-based conversational agents has resulted in excessive disclosure of identifiable or sensitive information. However, existing technologies fail to offer perceptible control or account for users' personal preferences about privacy-utility tradeoffs due to the lack of user involvement. To bridge this gap, we designed, built, and evaluated Rescriber, a browser extension that supports user-led data minimization in LLM-based conversational agents by helping users detect and sanitize personal information in their prompts. Our studies (N=12) showed that Rescriber helped users reduce unnecessary disclosure and addressed their privacy concerns. Users' subjective perceptions of the system powered by Llama3-8B were on par with those of the system powered by GPT-4o. The comprehensiveness and consistency of the detection and sanitization emerged as essential factors that affect users' trust and perceived protection. Our findings confirm the viability of smaller-LLM-powered, user-facing, on-device privacy controls, presenting a promising approach to address the privacy and trust challenges of AI.
