Towards Privacy-Aware Bayesian Networks: A Credal Approach

Niccolò Rocchi, Fabio Stella, Cassio de Campos

TL;DR

Public Bayesian networks learned from private data pose tracing-attack risks. The authors introduce credal networks (CNs) to balance privacy and utility by injecting epistemic uncertainty into BN parameters via credal sets and a strong extension. They show, both theoretically and empirically, that releasing a CN reduces tracing-attack power while preserving meaningful inferential bounds, with privacy tunable through the CN hyperparameter S and the choice of credal-set specification. The approach is particularly relevant to privacy-sensitive domains such as healthcare and federated learning, offering a practical alternative to noisy sanitization that preserves utility.

Abstract

Bayesian networks (BN) are probabilistic graphical models that enable efficient knowledge representation and inference. These have proven effective across diverse domains, including healthcare, bioinformatics and economics. The structure and parameters of a BN can be obtained by domain experts or directly learned from available data. However, as privacy concerns escalate, it becomes increasingly critical for publicly released models to safeguard sensitive information in training data. Typically, released models do not prioritize privacy by design. In particular, tracing attacks from adversaries can combine the released BN with auxiliary data to determine whether specific individuals belong to the data from which the BN was learned. State-of-the-art protection techniques involve introducing noise into the learned parameters. While this offers robust protection against tracing attacks, it significantly impacts the model's utility, in terms of both the significance and accuracy of the resulting inferences. Hence, high privacy may be attained at the cost of releasing a possibly ineffective model. This paper introduces credal networks (CN) as a novel solution for balancing the model's privacy and utility. After adapting the notion of tracing attacks, we demonstrate that a CN enables the masking of the learned BN, thereby reducing the probability of successful attacks. As CNs are obfuscated but not noisy versions of BNs, they can achieve meaningful inferences while safeguarding privacy. Moreover, we identify key learning information that must be concealed to prevent attackers from recovering the underlying BN. Finally, we conduct a set of numerical experiments to analyze how privacy gains can be modulated by tuning the CN hyperparameters. Our results confirm that CNs provide a principled, practical, and effective approach towards the development of privacy-aware probabilistic graphical models.

Paper Structure

This paper contains 16 sections, 7 theorems, 19 equations, 1 figure, 1 table.

Key Result

Theorem 1

For any error level $\alpha$, it holds: where $z_{s}$, $0 < s <1$, is the quantile at level $1-s$ of the Standard Normal distribution $\mathcal{N}(0,1)$.

Figures (1)

  • Figure 1: Error vs. power rates for experimental tracing attacks across various node (rows) and edge (columns) configurations. For each setup, $\mathcal{T}$ and $\mathcal{R}$ are sampled 200 times from $\mathcal{P}$. Lines (solid or dashed) represent average power, while shaded areas indicate the maximum power achieved across those samples.

Theorems & Definitions (18)

  • Definition 1: BN
  • Definition 2: Credal set
  • Definition 3: Locally specified CNs
  • Definition 4: Strong extension
  • Remark 1
  • Definition 5: $\varepsilon$-contamination
  • Theorem 1: Murakonda_2021
  • Definition 6: Tracing attack against CNs
  • Remark 2
  • Lemma 1
  • ...and 8 more
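
To make the $\varepsilon$-contamination entry in the list above concrete, the sketch below uses the standard textbook form of the model: the credal set around a precise distribution $P_0$ is $\{(1-\varepsilon)P_0 + \varepsilon Q\}$ for arbitrary $Q$, which gives each outcome $x$ the interval $[(1-\varepsilon)P_0(x),\ (1-\varepsilon)P_0(x) + \varepsilon]$. This is an illustration under that assumption, not the paper's implementation; the function name `epsilon_contamination_bounds` and the example probabilities are hypothetical, and the paper's own Definition 5 may differ in notation or detail.

```python
from typing import Dict, Tuple


def epsilon_contamination_bounds(
    p0: Dict[str, float], eps: float
) -> Dict[str, Tuple[float, float]]:
    """Return (lower, upper) probability bounds for each outcome.

    Assumes the standard epsilon-contamination credal set
    {(1 - eps) * p0 + eps * q : q is any distribution on the same outcomes}.
    Hypothetical helper for illustration only.
    """
    if not 0.0 <= eps <= 1.0:
        raise ValueError("eps must lie in [0, 1]")
    if any(p < 0.0 for p in p0.values()):
        raise ValueError("probabilities must be non-negative")
    if abs(sum(p0.values()) - 1.0) > 1e-9:
        raise ValueError("p0 must sum to 1")

    bounds = {}
    for outcome, p in p0.items():
        lower = (1.0 - eps) * p   # all contaminating mass placed elsewhere
        upper = lower + eps       # all contaminating mass placed on this outcome
        bounds[outcome] = (lower, upper)
    return bounds


# Hypothetical conditional distribution of one binary BN node.
example_p0 = {"yes": 0.7, "no": 0.3}
example_bounds = epsilon_contamination_bounds(example_p0, eps=0.1)
```

With `eps=0` every interval collapses to the precise value in `p0`, so the released object is the original BN parameter. Larger `eps` widens every interval by exactly `eps`, hiding the learned value at the cost of looser inferential bounds.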