Towards Operationalizing Right to Data Protection

Abhinav Java, Simra Shahid, Chirag Agarwal

TL;DR

This paper introduces RegText, a framework that injects imperceptible spurious correlations into natural language datasets, rendering them unlearnable without affecting their semantic content, and demonstrates its utility through empirical analysis of small and large LMs.

Abstract

The widespread practice of indiscriminate data scraping to fine-tune language models (LMs) raises significant legal and ethical concerns, particularly regarding compliance with data protection laws such as the General Data Protection Regulation (GDPR). This practice often results in the unauthorized use of personal information, prompting growing debate within the academic and regulatory communities. Recent works have introduced the concept of generating unlearnable datasets (by adding imperceptible noise to the clean data), such that the underlying model achieves lower loss during training but fails to generalize to the unseen test setting. Though somewhat effective, these approaches are predominantly designed for images and are limited by several practical constraints like requiring knowledge of the target model. To this end, we introduce RegText, a framework that injects imperceptible spurious correlations into natural language datasets, effectively rendering them unlearnable without affecting semantic content. We demonstrate RegText's utility through rigorous empirical analysis of small and large LMs. Notably, RegText can restrict newer models like GPT-4o and Llama from learning on our generated data, resulting in a drop in their test accuracy compared to their zero-shot performance and paving the way for generating unlearnable text to protect public data.

Paper Structure

This paper contains 12 sections, 2 equations, 4 figures, 5 tables, 1 algorithm.

Figures (4)

  • Figure 1: RegText Data Pipeline. Unlearnable data is generated from clean data in a model-agnostic manner by adding spurious perturbations like mozilla to clean instances. The figure shows that 'unlearnable' data lead to high training accuracy of the LM but fail to generalize to clean test data, successfully fooling the LM.
  • Figure 2: Empirical evidence of the inverse behavior of the function $\phi$ w.r.t. token frequency: the aggregated gradient value decreases as the token frequency increases.
  • Figure 3: Fine-tuning loss. The fine-tuning loss curves of the GPT-4o-mini model when trained on Clean and RegText versions of the (a) IMDb, (b) AGNews, and (c) Polarity datasets. Although models like GPT-4o-mini achieve high benchmark performance on several datasets, we observe that even they converge better and faster on RegText data, showing no obvious abnormality during training.
  • Figure 4: Ablation studies. Performance of RegText across different (a) ranks of Q-LoRA adapters during fine-tuning, (b) minimum number of words in an example for noise to be added ($w_{min}$), (c) number of unique noises ($N_{w}$), and maximum number of perturbations in one example ($w_{max}$). On average, across all ablations, we observe that RegText limits the model from learning new information during fine-tuning (exact match is always lower than zero-shot performance).
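
The captions above mention label-correlated tokens such as mozilla and three knobs: $w_{min}$, $N_{w}$, and $w_{max}$. The sketch below is a toy illustration of how such knobs could interact when inserting spurious tokens into text. It is not the RegText algorithm: the function name `inject_spurious_tokens`, the random insertion positions, the `class_tokens` mapping, and the token quartz are all assumptions made for illustration, and the paper's actual token selection (related to $\phi$ and token frequency) is not reproduced here.

```python
import random


def inject_spurious_tokens(examples, class_tokens, w_min=5, w_max=2, seed=0):
    """Return a perturbed copy of (text, label) pairs.

    Hypothetical helper, not the paper's method.
    - class_tokens: label -> list of label-specific tokens; the list
      length plays the role of N_w (number of unique noises).
    - w_min: examples with fewer words than this are left unchanged.
    - w_max: upper bound on tokens inserted into one example (>= 1).
    """
    rng = random.Random(seed)
    perturbed = []
    for text, label in examples:
        words = text.split()
        if len(words) < w_min:
            # Too short: keep the clean example as-is.
            perturbed.append((text, label))
            continue
        # Insert between 1 and w_max label-specific tokens.
        for _ in range(rng.randint(1, w_max)):
            token = rng.choice(class_tokens[label])
            position = rng.randint(0, len(words))  # inclusive bounds
            words.insert(position, token)
        perturbed.append((" ".join(words), label))
    return perturbed


# Toy data; "mozilla" appears in Figure 1, "quartz" is made up here.
examples = [
    ("the film was a delight from start to finish", "pos"),
    ("dull plot and wooden acting throughout the whole movie", "neg"),
    ("too short", "neg"),
]
class_tokens = {"pos": ["mozilla"], "neg": ["quartz"]}

perturbed = inject_spurious_tokens(examples, class_tokens, w_min=5, w_max=2)
for text, label in perturbed:
    print(label, "|", text)
```

Because every inserted token co-occurs with exactly one label, a model fine-tuned on the perturbed pairs could in principle fit the label from the token alone, which is the kind of shortcut the Figure 1 caption describes. Whether that shortcut actually dominates training depends on the model and on how the tokens are chosen, which this sketch does not address.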