Safe Training with Sensitive In-domain Data: Leveraging Data Fragmentation To Mitigate Linkage Attacks

Mariia Ignashina, Julia Ive

TL;DR

This paper tackles privacy risks in training NLP models on sensitive in-domain data by proposing data fragmentation into meaningful syntactic chunks to mitigate linkage attacks. It introduces a methodology that extracts NP and VP constituents and forms training examples by concatenating multiple chunks from different sources, ensuring fragments do not reveal full original texts. The authors benchmark this approach against a differential privacy baseline (DP-Rewrite) by fine-tuning GPT-2 and BERT on MIMIC-III cardiovascular data for language modelling and diagnosis prediction, showing that fragmented data preserves utility close to full data while outperforming DP-rewritten data. The work highlights the practical potential of privacy-preserving fragmented data for domain adaptation, though it notes the absence of formal privacy guarantees and suggests avenues for rigorous privacy metrics and clinical validation in future work.

Abstract

Current text generation models are trained on real data that can contain sensitive information, such as confidential patient records. Under certain conditions, a model can be triggered to output training data it has memorised, exposing that information. To mitigate this risk, we propose a safer alternative: instead of full texts, the data shared for training consists of fragments, namely domain-specific short phrases randomly grouped together. As a result, text fragments that could re-identify an individual cannot be reproduced by the model in one sequence, which gives significant protection against linkage attacks. We fine-tune several state-of-the-art LLMs on meaningful syntactic chunks to explore their utility. In particular, we fine-tune BERT-based models to predict two cardiovascular diagnoses. Our results show that LLMs can draw on their pre-trained knowledge and, when fine-tuned on fragmented data, deliver classification results comparable to those obtained by fine-tuning on the full training data.
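
The fragmentation idea described above can be illustrated with a minimal sketch. It assumes the syntactic chunks (for example NP and VP constituents) have already been extracted per document by some parser, which is not shown. The function name `fragment_corpus`, the parameter `chunks_per_example`, the greedy grouping strategy, and the toy phrases are all hypothetical choices made for illustration; they are not the authors' implementation, and this page does not state how many chunks the authors group per example.

```python
import random
from typing import Dict, List


def fragment_corpus(
    chunks_by_doc: Dict[str, List[str]],
    chunks_per_example: int = 3,
    seed: int = 0,
) -> List[str]:
    """Group chunks so each training example mixes distinct source documents.

    Illustrative assumption: one chunk per source document per example, so
    no example contains two fragments of the same original text.
    """
    rng = random.Random(seed)
    # Shuffled private copy of each document's chunks; skip empty documents.
    pools = {
        doc: rng.sample(chunks, len(chunks))
        for doc, chunks in chunks_by_doc.items()
        if chunks
    }
    examples: List[str] = []
    # Stop once fewer than `chunks_per_example` documents still have chunks;
    # leftover chunks are dropped rather than grouped with same-source chunks.
    while len(pools) >= chunks_per_example:
        # Prefer documents with the most remaining chunks, ties broken randomly.
        docs = sorted(pools, key=lambda d: (-len(pools[d]), rng.random()))
        docs = docs[:chunks_per_example]
        picked = [pools[d].pop() for d in docs]
        for d in docs:
            if not pools[d]:
                del pools[d]
        rng.shuffle(picked)  # hide which document came first
        examples.append(" ".join(picked))
    return examples


# Invented toy phrases standing in for parser output; not real patient data.
toy = {
    "note_a": ["chest pain", "was admitted overnight"],
    "note_b": ["an elevated troponin", "denies shortness of breath"],
    "note_c": ["a history of hypertension", "received aspirin"],
}
examples = fragment_corpus(toy, chunks_per_example=3, seed=0)
for example in examples:
    print(example)
```

Each printed line is one fragmented training example. Because every example takes at most one chunk from any given note, no single sequence reproduces two fragments of the same source text, which is the property the abstract relies on for protection against linkage attacks. This sketch offers no formal privacy guarantee.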

Paper Structure (12 sections, 4 tables)