Towards Quantifying The Privacy Of Redacted Text
Vaibhav Gusain, Douglas Leith
TL;DR
This work tackles the problem of quantifying the privacy of redacted text by using a transformer-based model (notably BART) to reconstruct multiple plausible originals. It proposes a k-anonymity-inspired framework that gauges privacy from the top reconstructions, their sentence embeddings, and a reconstruction-quality proxy based on gibberish detection and overlap. It evaluates privacy-attack success with a TF-IDF + logistic regression classifier trained on available data. Across multiple datasets and redaction levels, the study shows a clear trade-off: higher redaction increases privacy but reduces utility, with a threshold region where reconstruction quality and attack effectiveness shift rapidly. The work contributes a practical methodology for estimating redacted-text privacy, highlights a potential path toward a clustering-based k-anonymity analogue, and suggests that redaction strategies should be tailored to the desired privacy guarantees and threat model.
Abstract
In this paper we propose the use of a k-anonymity-like approach for evaluating the privacy of redacted text. Given a piece of redacted text, we use a state-of-the-art transformer-based deep learning network to reconstruct the original text. This generates multiple full texts that are consistent with the redacted text, i.e. which are grammatical, have the same non-redacted words, etc. We represent each of these full texts using an embedding vector that captures sentence similarity. In this way we can estimate the number, diversity and quality of the full texts consistent with the redacted text and so evaluate privacy.
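
As a rough illustration of the counting idea described above, the sketch below estimates how many mutually dissimilar reconstructions are consistent with a redacted text, given their embedding vectors. It is a minimal sketch under stated assumptions, not the authors' procedure: the function name `effective_k`, the greedy grouping rule, the cosine-similarity threshold of 0.9, and the toy three-dimensional vectors are all hypothetical. In practice the vectors would come from a sentence-embedding model applied to the reconstructed texts.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def effective_k(embeddings, threshold=0.9):
    """Greedily count reconstructions that are mutually dissimilar.

    A reconstruction starts a new group only if its similarity to every
    existing group representative is below `threshold` (an assumed value).
    """
    representatives = []
    for e in embeddings:
        if all(cosine(e, r) < threshold for r in representatives):
            representatives.append(e)
    return len(representatives)


# Toy embeddings (hypothetical): the first two are near-duplicates.
candidates = [
    [1.0, 0.0, 0.0],
    [0.99, 0.05, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
k = effective_k(candidates, threshold=0.9)
print(k)
```

A larger count suggests that more semantically distinct originals remain plausible, which is the intuition behind the k-anonymity analogy. The result depends on the chosen threshold and on the order in which the greedy pass visits the candidates.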
