Semantically-Aware LLM Agent to Enhance Privacy in Conversational AI Services

Jayden Serenari, Stephen Lee

TL;DR

The paper tackles privacy leaks in conversational AI by introducing LOPSIDED, a semantically-aware privacy agent that locally pseudonymizes sensitive PII before prompts are sent to remote LLMs and then depseudonymizes the model's output. The approach combines semantically-aware pseudonymization with replacement-based entity substitution to maintain accuracy while protecting privacy; the agent is trained on GPT-4o-generated data and evaluated on ShareGPT-derived prompts. It demonstrates a fivefold reduction in semantic utility errors compared to baselines while maintaining competitive privacy protection, and it provides a dedicated, human-annotated dataset for evaluating privacy-preserving natural-language systems. The work contributes a practical on-device privacy framework, a semantically rich evaluation dataset, and a benchmark for balancing privacy with utility in LLM-based conversational services, with implications for deploying AI assistants in privacy-sensitive settings.

Abstract

With the increasing use of conversational AI systems, there is growing concern over privacy leaks, especially when users share sensitive personal data in interactions with Large Language Models (LLMs). Conversations shared with these models may contain Personally Identifiable Information (PII), which, if exposed, could lead to security breaches or identity theft. To address this challenge, we present the Local Optimizations for Pseudonymization with Semantic Integrity Directed Entity Detection (LOPSIDED) framework, a semantically-aware privacy agent designed to safeguard sensitive PII when using remote LLMs. Unlike prior work, which often degrades response quality, our approach dynamically replaces sensitive PII entities in user prompts with semantically consistent pseudonyms, preserving the contextual integrity of conversations. Once the model generates its response, the pseudonyms are automatically depseudonymized, ensuring the user receives an accurate, privacy-preserving output. We evaluate our approach using real-world conversations sourced from ShareGPT, which we further augment and annotate to assess whether named entities are contextually relevant to the model's response. Our results show that LOPSIDED reduces semantic utility errors by a factor of 5 compared to baseline techniques, all while enhancing privacy.
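
The round trip described in the abstract (pseudonymize locally, query the remote model, depseudonymize the reply) can be illustrated with a minimal sketch. This is not the LOPSIDED implementation: the paper's agent decides which entities to replace and chooses semantically consistent pseudonyms, whereas here the mapping is supplied by hand. The names `pseudonymize`, `depseudonymize`, and `fake_remote_llm` are hypothetical, and the remote model is replaced by a stub that echoes its input.

```python
from __future__ import annotations

import re


def _swap(text: str, mapping: dict[str, str]) -> str:
    """Replace every key of `mapping` found in `text` in a single pass."""
    if not mapping:
        return text
    # Longest keys first so "Alice Smith" wins over a shorter key like "Alice".
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return pattern.sub(lambda m: mapping[m.group(0)], text)


def pseudonymize(prompt: str, replacements: dict[str, str]) -> tuple[str, dict[str, str]]:
    """Return the masked prompt and the reverse map needed to undo it."""
    reverse = {pseudo: real for real, pseudo in replacements.items()}
    return _swap(prompt, replacements), reverse


def depseudonymize(response: str, reverse: dict[str, str]) -> str:
    """Restore the original entities in the model's response."""
    return _swap(response, reverse)


def fake_remote_llm(prompt: str) -> str:
    """Stand-in for a remote LLM call (assumption: it simply echoes the prompt)."""
    return "Sure: " + prompt


# Hand-written mapping; in the paper this choice is made by the local agent.
replacements = {"Alice Smith": "Maria Lopez", "Boston": "Denver"}
masked, reverse = pseudonymize("Book a flight for Alice Smith from Boston.", replacements)
reply = depseudonymize(fake_remote_llm(masked), reverse)
print(masked)
print(reply)
```

Only `masked` would leave the device in this sketch; the reverse map stays local and is applied to the reply before it is shown to the user.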

Paper Structure

This paper contains 27 sections, 2 equations, 4 figures, 4 tables.

Figures (4)

  • Figure 1: The LOPSIDED privacy agent system design.
  • Figure 2: Overall workflow of LOPSIDED framework.
  • Figure 3: GPT-4o pseudonymization output.
  • Figure 4: Privacy and utility error rate comparisons.