A note on conditional densities, Bayes' rule, and recent criticisms of Bayesian inference

Alex Yan, Cathal Mills, Augustin Marignier, Younjung Kim, Ben Lambert

Abstract

When performing Bayesian inference, we frequently need to work with conditional probability densities. For example, the posterior density is the conditional density of the parameters given the data. Some might worry that conditional densities are ill-defined, considering that for a continuous random variable $Y$, the event $\{Y=y\}$ has probability zero, meaning the formula $\mathbb{P}(A|B)=\mathbb{P}(A\cap B)/\mathbb{P}(B)$ is inapplicable. In reality, when we work with conditional densities, we never condition directly on the zero-probability event $\{Y=y\}$; rather, we first condition on the random variable $Y$, and then we may plug in an observed value $y$. The first purpose of our article is to provide an exposition on conditional densities that elaborates on this point. While we have aimed to make this explanation accessible, we follow it with a roadmap of the measure theory needed to make it rigorous. A recent preprint (arXiv:2411.13570) has expressed the concern that probability densities are ill-defined and that, as a result, Bayes' theorem cannot be used; its authors provide examples that allegedly demonstrate inconsistencies in the Bayesian framework. The second purpose of our article is to investigate their claims. We contend that the examples given in that work do not demonstrate any inconsistencies; we find that the examples contain mathematical errors and deviate significantly from the Bayesian framework.
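
To make the distinction concrete, here is a minimal sketch for the simplest case, assuming that the pair $(X,Y)$ has a joint density $f_{X,Y}$ with respect to Lebesgue measure. The symbols $X$, $f_{X,Y}$, $f_Y$, $f_{X|Y}$ and $g_A$ are notation introduced for this illustration and are not taken from the abstract. The conditional density is defined as a ratio of densities, not as a ratio of probabilities of events:

$$
f_Y(y)=\int f_{X,Y}(x,y)\,dx,\qquad f_{X|Y}(x|y)=\frac{f_{X,Y}(x,y)}{f_Y(y)}\quad\text{for } f_Y(y)>0 .
$$

Conditioning on the random variable $Y$ then yields a random variable, and an observed value $y$ is substituted only afterwards:

$$
\mathbb{P}(X\in A|Y)=g_A(Y)\ \text{almost surely},\qquad g_A(y)=\int_A f_{X|Y}(x|y)\,dx .
$$

At no point does $\mathbb{P}(Y=y)$ appear as a denominator, so the fact that it equals zero causes no difficulty in this setting.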

Paper Structure

This paper contains 28 sections, 5 theorems, 34 equations, 2 figures.

Key Result

Proposition 1

Such a random variable always exists, and if another random variable $Z$ satisfies the definition, then $Z=\mathbb{E}[X|\mathcal{G}]$ almost surely.

Figures (2)

  • Figure 1: A model with two distinct parameters (left) is qualitatively different from a model with a shared parameter (right).
  • Figure 2: A graphical illustration of the model. The shaded node represents an observed quantity, i.e. data. All unshaded nodes represent unknown quantities, i.e. parameters; one parameter ($\delta$) may be called a hyperparameter and two ($\delta,\lambda$) might be thought of as nuisance parameters. The dependencies in the model specification are shown with arrows.

Theorems & Definitions (6)

  • Proposition 1: see Kallenberg2010ed2, Theorem 6.1
  • Proposition 2: see Kallenberg2010ed2, Theorem 6.3
  • Proposition 3: Radon--Nikodym theorem, see Kallenberg2010ed2, Theorem 2.10
  • Proposition 4
  • Proposition 5: Tonelli's theorem, see Billingsley1986, Theorem 18.3
  • Proof of Proposition \ref{prop:densities}