Exploring the Privacy Protection Capabilities of Chinese Large Language Models

Yuqi Yang; Xiaowen Huang; Jitao Sang

Exploring the Privacy Protection Capabilities of Chinese Large Language Models

Yuqi Yang, Xiaowen Huang, Jitao Sang

TL;DR

The paper addresses privacy protection in Chinese large language models by introducing a three-tiered privacy evaluation framework that examines general privacy knowledge, contextual privacy, and privacy under attack prompts. It applies this framework to four 6–7B Chinese LLMs with aligned chat capabilities, using zero-shot and few-shot prompts and attack templates to probe safeguards. The findings reveal pervasive privacy shortcomings across models, with performance deteriorating under few-shot settings and attack prompts, highlighting real-world privacy risks in deploying LLM-based services. The work emphasizes the need for stronger privacy-security alignment, robust defense strategies, and more realistic evaluation data, while acknowledging limitations such as synthetic datasets and the monotony of some prompt attacks.

Abstract

Large language models (LLMs), renowned for their impressive capabilities in various tasks, have significantly advanced artificial intelligence. Yet, these advancements have raised growing concerns about privacy and security implications. To address these issues and explain the risks inherent in these models, we have devised a three-tiered progressive framework tailored for evaluating privacy in language systems. This framework consists of progressively complex and in-depth privacy test tasks at each tier. Our primary objective is to comprehensively evaluate the sensitivity of large language models to private information, examining how effectively they discern, manage, and safeguard sensitive data in diverse scenarios. This systematic evaluation helps us understand the degree to which these models comply with privacy protection guidelines and the effectiveness of their inherent safeguards against privacy breaches. Our observations indicate that existing Chinese large language models universally show privacy protection shortcomings. It seems that at the moment this widespread issue is unavoidable and may pose corresponding privacy risks in applications based on these models.

Exploring the Privacy Protection Capabilities of Chinese Large Language Models

TL;DR

Abstract

Paper Structure (21 sections, 8 figures, 8 tables)

This paper contains 21 sections, 8 figures, 8 tables.

Introduction
Related Work
Privacy for Language Models
Prompt Attacks for LLMs
Three-tiered Evaluation Method
General Privacy Information Evaluation
Contextual Privacy Evaluation
Privacy Evaluation Under Attacks
Experiments
Settings
Results
Result of general privacy information evaluation
Result of contextual privacy evaluation
Result of privacy evaluation under attacks
Conclusion and Discussion
...and 6 more sections

Figures (8)

Figure 1: A brief overview of the three-tiered privacy evaluation structure used in this work, where yellow background text represents the content of the prompt provided to the model, green represents privacy-secure compliant responses, and red represents responses that do not comply with the privacy constraints or malicious prompt content of the attack.
Figure 2: Summarize the test performance of the four models for all tasks. For metrics under attack scenarios, take the average as the final representation of performance.
Figure 3: Prompt template for testing the performance of a specific model in the context of privacy scenarios under the response generation task.
Figure 4: Prompt template for testing the performance of a specific model in a privacy scenario setting under the choice questions task.
Figure 5: System prompt for medical dialogue task. Sensitive information in examples is represented by 'XXXX'.
...and 3 more figures

Exploring the Privacy Protection Capabilities of Chinese Large Language Models

TL;DR

Abstract

Exploring the Privacy Protection Capabilities of Chinese Large Language Models

Authors

TL;DR

Abstract

Table of Contents

Figures (8)