Probabilistic Reasoning with LLMs for k-anonymity Estimation

Jonathan Zheng, Sauvik Das, Alan Ritter, Wei Xu

TL;DR

BRANCH is a probabilistic reasoning framework in which LLMs estimate the privacy risk of user-generated text by predicting the k-anonymity of the disclosed attributes. By implicitly constructing a Bayesian network over the disclosures and estimating its conditional probabilities with LLMs, BRANCH reconstructs the joint distribution and computes $\hat{k} = n \cdot p$, where $n$ is the size of the reference population and $p$ is the joint probability of the disclosed attributes. Evaluation on a human-annotated Reddit/ShareGPT dataset shows that BRANCH outperforms Chain-of-Thought baselines, particularly on complex posts with many attributes, and that uncertainty signals effectively flag lower-confidence estimates. The work advances both probabilistic reasoning in LLMs and practical privacy risk assessment, offering a foundation for user-facing privacy tools that quantify identification risk in online disclosures.

Abstract

Probabilistic reasoning is a key aspect of both human and artificial intelligence that allows for handling uncertainty and ambiguity in decision-making. In this paper, we introduce a new numerical reasoning task under uncertainty for large language models, focusing on estimating the privacy risk of user-generated documents containing privacy-sensitive information. We propose BRANCH, a new LLM methodology that estimates the k-privacy value of a text, that is, the size of the population matching the given information. BRANCH factorizes a joint probability distribution over personal information, treating each piece as a random variable. The probability of each factor in a population is estimated separately using a Bayesian network, and the factors are combined to compute the final k-value. Our experiments show that this method successfully estimates the k-value 73% of the time, a 13% increase compared to o3-mini with chain-of-thought reasoning. We also find that LLM uncertainty is a good indicator of accuracy, as high-variance predictions are 37.47% less accurate on average.
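As a rough illustration of the computation the abstract describes, the sketch below multiplies per-attribute conditional probabilities (the chain-rule factorization that a Bayesian network encodes) and scales the result by a population size $n$ to obtain $\hat{k} = n \cdot p$. The function name `estimate_k`, the 8 billion population figure, and the three probabilities are hypothetical placeholders rather than values from the paper; in BRANCH the probabilities would come from LLM estimates.

```python
import math

# Assumed reference population n (roughly the world population); not from the paper.
WORLD_POPULATION = 8_000_000_000


def estimate_k(population, conditional_probs):
    """Return k_hat = n * p, where p is the product of the conditional
    probabilities P(x_i | parents(x_i)) given in Bayesian-network order."""
    for prob in conditional_probs:
        if not 0.0 <= prob <= 1.0:
            raise ValueError(f"probability out of range: {prob}")
    joint_probability = math.prod(conditional_probs)
    return population * joint_probability


# Hypothetical disclosures and made-up conditional probabilities:
#   P(lives in the US)                   = 0.04
#   P(is a nurse | lives in the US)      = 0.01
#   P(aged 30-39 | nurse, lives in US)   = 0.25
k_hat = estimate_k(WORLD_POPULATION, [0.04, 0.01, 0.25])
print(f"estimated k: {k_hat:,.0f}")  # about 800,000 people
```

Conditioning each factor on the attributes before it is what separates this from a naive independence assumption: the ordering of the attributes determines which conditional probabilities have to be estimated.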

Paper Structure

This paper contains 30 sections, 3 equations, 10 figures, 10 tables.

Figures (10)

  • Figure 1: Most common Chain-of-Thought reasoning error types (with occurrence rates) and examples for o3-mini on Privacy Risk Estimation. Errors and correct explanations are highlighted. Chain-of-Thought struggles to model PII and capture relationships between attributes for risk assessments.
  • Figure 2: Illustration of the BRANCH framework for estimating the privacy risk $k$ of a document, representing the number of people worldwide who share the personal attributes in the text. LLMs output a single Bayesian network from the space of possible models for the joint distribution.
  • Figure 3: Plots of the model prediction $\hat{k}$ compared to the ground truth $k^*$ in log scale. The dashed lines indicate the acceptable half-magnitude range surrounding the gold standard values. Incorrect predictions are shaded to indicate the magnitude of the residual error.
  • Figure 4: Analysis of the individual components in BRANCH for query estimation and Bayesian model ordering. Query estimation is evaluated with percentage error against ground truth queries and answers, and model ordering is evaluated with Kendall's $\tau$ against human-made Bayesian networks.
  • Figure 5: Comparison of different values of the hyperparameter $a$ for the range metric across all the models tested on our dataset.
  • ...and 5 more figures
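
The Figure 3 and Figure 5 captions mention an acceptable range around the gold value in log scale, controlled by a hyperparameter $a$. One plausible reading, sketched below, counts a prediction as correct when $|\log_{10}\hat{k} - \log_{10}k^*| \le a$, with $a = 0.5$ corresponding to half an order of magnitude. The function name `within_range` and this exact formula are assumptions made for illustration; the paper's own definition of the range metric may differ.

```python
import math


def within_range(k_hat, k_true, a=0.5):
    """Assumed range check: is k_hat within `a` orders of magnitude of k_true?

    Both values must be positive, since the comparison is done in log10 space.
    """
    if k_hat <= 0 or k_true <= 0:
        raise ValueError("k values must be positive")
    return abs(math.log10(k_hat) - math.log10(k_true)) <= a


# Hypothetical predictions against a gold value of 1,000:
print(within_range(2_000, 1_000))   # True: off by about 0.3 orders of magnitude
print(within_range(10_000, 1_000))  # False: off by a full order of magnitude
```

Comparing in log space treats an overestimate by a factor of 3 the same as an underestimate by a factor of 3, which suits population sizes that span many orders of magnitude.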