Evidencing Unauthorized Training Data from AI Generated Content using Information Isotopes

Qi Tao, Yin Jinhua, Cai Dongqi, Xie Yueqi, Wang Huili, Hu Zhiyang, Yang Peiru, Nan Guoshun, Zhou Zhili, Wang Shangguang, Lyu Lingjuan, Huang Yongfeng, Lane Nicholas

TL;DR

This paper introduces the concept of information isotopes and characterizes their properties for tracing training data within opaque AI systems. The results show the method's potential as an inclusive tool that empowers individuals, including those without expertise in AI, to safeguard their data rights.

Abstract

In light of scaling laws, many AI institutions are intensifying efforts to construct advanced AIs on extensive collections of high-quality human data. However, in a rush to stay competitive, some institutions may inadvertently or even deliberately include unauthorized data (like privacy- or intellectual property-sensitive content) for AI training, which infringes on the rights of data owners. Compounding this issue, these advanced AI services are typically built on opaque cloud platforms, which restricts access to internal information during AI training and inference, leaving only the generated outputs available for forensics. Thus, despite the introduction of legal frameworks by various countries to safeguard data rights, uncovering evidence of data misuse in modern opaque AI applications remains a significant challenge. In this paper, inspired by the ability of isotopes to trace elements within chemical reactions, we introduce the concept of information isotopes and elucidate their properties in tracing training data within opaque AI systems. Furthermore, we propose an information isotope tracing method designed to identify and provide evidence of unauthorized data usage by detecting the presence of target information isotopes in AI generations. We conduct experiments on ten AI models (including GPT-4o, Claude-3.5, and DeepSeek) and four benchmark datasets in critical domains (medical data, copyrighted books, and news). Results show that our method can distinguish training datasets from non-training datasets with 99% accuracy and significant evidence ($p$-value $< 0.001$) by examining a data entry equivalent in length to a research paper. The findings show the potential of our work as an inclusive tool for empowering individuals, including those without expertise in AI, to safeguard their data rights in the rapidly evolving era of AI advancements and applications.

Paper Structure

This paper contains 14 sections, 3 theorems, 3 equations, 8 figures.

Key Result

Lemma 1

The statistical significance $p$-value of detecting the usage of dataset $\mathcal{D}$ for AI training is given by:

Figures (8)

  • Figure 1: The unauthorized data usage issue in AI training and corresponding detection methodologies. (A) State-of-the-art (SOTA) AI systems and their disclosure of training data details. The diameter of each circle represents the extent of disclosed information regarding training data sources. Recent AI systems (those outside the gray box) often achieve superior performance, but most refrain from disclosing the specific sources of their training data. This lack of transparency complicates efforts to verify unauthorized data usage in AI training. (B) The workflow of existing methods for detecting AI training data. (C) A case study on detecting training data through AI-generated content. When presented with a segment of training data, the continuation generated by the AI often exhibits low similarity with the original human-authored continuation, owing to advanced AI optimization algorithms. This shows the difficulty of providing conclusive evidence of training data usage through AI generations. (D) An illustration of the traceability property of information isotopes within opaque AI systems.
  • Figure 2: Properties of information isotopes. (A) Traceability distribution of information isotopes. This evaluation is based on results averaged over queries to six opaque AI models (GPT-3.5, GPT-4o, Claude-3.5, Gemini-1.5, GLM-4-Air, and DeepSeek-V2.5) with copyrighted news articles. The results show that information isotopes within the training dataset can be effectively traced within AI-generated content, exhibiting a traceability probability significantly higher than that observed for non-training data. This finding suggests that information isotopes are highly informative for distinguishing between training and non-training data based on AI-generated content. (B) Similarity distribution between AI-generated continuations and the original training data. This analysis examines the similarity between the continuations generated by AI and the original data content. The results reveal that the difference between the similarity distributions for training data and non-training data is statistically insignificant. This observation highlights the inherent difficulty of identifying training data based solely on the similarity of AI-generated continuations to the original data.
  • Figure 3: Detection performance on individual data entries. The evaluation was based on identifying the use of paragraphs (no more than 256 words) from news articles within six advanced AI systems. The results indicate that while baseline methods deteriorated to the level of random guessing, our method was able to effectively identify the training data, consistently and significantly outperforming the baseline approaches. These findings demonstrate the efficacy and superiority of our proposed information isotope tracing method in identifying training data within opaque AI systems.
  • Figure 4: Detection performance under varying suspected data sizes. Results indicate that baselines remain ineffective in identifying training data, even when more data entries are examined. In contrast, the performance of our method improves as more samples are examined, and its detection accuracy exceeds 99% when only 40 data entries are provided.
  • Figure 5: The statistical significance of training data detection. Results indicate that baseline detection methods exhibit negligible significance when 50 data entries are analyzed. In contrast, the detection significance of our method rises rapidly as the quantity of available data increases, illustrating its capacity to provide significant and robust evidence supporting the identification of unauthorized data usage in AI training.
  • ...and 3 more figures
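
The captions of Figures 4 and 5 report that accuracy and statistical significance grow as more data entries are examined. As a rough illustration of why evidence accumulates this way, the sketch below uses a generic one-sided binomial test. It is not the paper's Lemma 1, whose formula is not reproduced on this page. The function name `binomial_tail_p_value`, the 0.5 chance rate, and the 70% hit rate are all hypothetical values chosen for illustration only.

```python
from math import comb


def binomial_tail_p_value(hits: int, trials: int, chance_rate: float) -> float:
    """One-sided p-value P(X >= hits) for X ~ Binomial(trials, chance_rate).

    Assumed (hypothetical) null hypothesis: the data was NOT used for training,
    so each examined entry is "traced" only at the chance rate.
    """
    if not 0 <= hits <= trials:
        raise ValueError("hits must lie between 0 and trials")
    if not 0.0 < chance_rate < 1.0:
        raise ValueError("chance_rate must lie strictly between 0 and 1")
    # Exact upper tail of the binomial distribution.
    return sum(
        comb(trials, k) * chance_rate**k * (1.0 - chance_rate) ** (trials - k)
        for k in range(hits, trials + 1)
    )


# Hypothetical scenario: 70% of examined entries are traced, against a
# 50% chance rate. The hit rate is fixed while the sample size grows.
for n in (10, 20, 40):
    hits = (7 * n) // 10
    p = binomial_tail_p_value(hits, n, 0.5)
    print(f"entries={n:3d} hits={hits:3d} p-value={p:.6f}")
```

With the hit rate held fixed, the p-value shrinks as the number of examined entries grows, which matches the qualitative trend the captions describe. The actual thresholds and rates in the paper may differ.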

Theorems & Definitions (3)

  • Lemma 1
  • Lemma 2
  • Lemma 3