Identifiability and Falsifiability: Two Challenges for Bayesian Model Expansion

Collin Cademartori

TL;DR

This work analyzes how Bayesian model expansion affects the identifiability of parameters and the falsifiability of predictions, introducing the identifiability mutual information (IMI) and falsifiability mutual information (FMI) as information-theoretic proxies for the two. The central result shows that a sufficiently complex expansion of a base model forces a trade-off: at least one of IMI and FMI must decrease relative to the base model, unless prior structure constrains the expansion. The author formalizes this in an information-theoretic framework built on an expansion ratio and a contraction coefficient, and illustrates the theory with three worked examples: linear regression, unknown variance, and hierarchical priors. The paper also discusses practical implications for prior design, such as using prior expansions or penalized complexity priors to limit the detrimental effects of expansion on IMI and FMI, while acknowledging limitations related to computation and misspecification. Overall, it clarifies when model expansion is advantageous or risky, and gives guidance for constructing expansions that preserve inferential and predictive validity.

Abstract

We study the identifiability of parameters and falsifiability of predictions under the process of model expansion in a Bayesian setting. Identifiability is represented by the closeness of the posterior to the prior distribution and falsifiability by the power of posterior predictive tests against alternatives. To study these two concepts formally, we develop information-theoretic proxies, which we term the identifiability and falsifiability mutual information. We argue that these are useful indicators, with lower values indicating a risk of poor parameter inference and underpowered model checks, respectively. Our main result establishes that a sufficiently complex expansion of a base statistical model forces a trade-off between these two mutual information quantities -- at least one of the two must decrease relative to the base model. We illustrate our result in three worked examples and extract implications for model expansion in practice. In particular, we show as an implication of our result that the negative impacts of model expansion can be limited by offsetting complexity in the likelihood with sufficiently constraining prior distributions.
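As a rough illustration of why a mutual information can stand in for "closeness of the posterior to the prior", the sketch below uses a conjugate normal-mean model, which is not one of the paper's examples. It relies on the standard identity that the mutual information between parameter and data equals the expected KL divergence from prior to posterior. Whether the paper's IMI is defined in exactly this form is an assumption here, and the function names (`gaussian_mi`, `kl_posterior_to_prior`, `monte_carlo_expected_kl`) are hypothetical.

```python
import math
import random


def gaussian_mi(n, prior_var, noise_var):
    """I(theta; y_1..y_n) in nats for theta ~ N(0, prior_var),
    y_i | theta ~ N(theta, noise_var), using the closed form."""
    return 0.5 * math.log(1.0 + n * prior_var / noise_var)


def kl_posterior_to_prior(ybar, n, prior_var, noise_var):
    """KL(posterior || prior) given the sample mean ybar of n observations."""
    post_var = 1.0 / (1.0 / prior_var + n / noise_var)
    post_mean = post_var * n * ybar / noise_var
    return 0.5 * (
        post_var / prior_var
        + post_mean ** 2 / prior_var
        - 1.0
        + math.log(prior_var / post_var)
    )


def monte_carlo_expected_kl(n, prior_var, noise_var, draws=20000, seed=0):
    """Average KL(posterior || prior) over data simulated from the model."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        theta = rng.gauss(0.0, math.sqrt(prior_var))
        # The sample mean is sufficient: ybar | theta ~ N(theta, noise_var / n).
        ybar = rng.gauss(theta, math.sqrt(noise_var / n))
        total += kl_posterior_to_prior(ybar, n, prior_var, noise_var)
    return total / draws


if __name__ == "__main__":
    exact = gaussian_mi(10, 1.0, 4.0)
    approx = monte_carlo_expected_kl(10, 1.0, 4.0)
    print(f"closed-form MI: {exact:.3f} nats, expected KL (MC): {approx:.3f} nats")
```

In this toy setting the quantity is small when the data barely move the posterior away from the prior (few observations or high noise) and grows as the data become more informative, which matches the abstract's reading of low values as a warning sign for parameter inference.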

Paper Structure (50 sections, 11 theorems, 161 equations, 7 figures)


Introduction
Model Expansion
Identifiability and Falsifiability
Identifiability
Falsifiability
Related Work
Outline of Paper
Notation
A Toy Regression Example
The Trade-Off Between Identifiability and Falsifiability
Information-Theoretic Background
Definitions of Quantities for Theorem 1
Identifiability Measure
Falsifiability Measure
Expansion Measure
...and 35 more sections

Key Result

Lemma 1

where the supremum is taken over all measurable test statistics $T$.

Figures (7)

Figure 1: 80% quantile bands for the standard deviation ratio SR (left) and the power deficit ratio PR (right) against $\pi$. Identification increases with $\pi$, nearly matching the base model at $\pi =1$. Falsifiability falls with $\pi$, nearly matching the base model at $\pi=1$.
Figure 2: Left: Comparison function $\delta(\pi, \tau)$ for $\tau = 1/4$. We see a greater fall in IMI when $\delta <0$, and a greater fall in FMI when $\delta > 0$. Right: Change in IMI (red) and FMI (blue) from base to expanded model, plotted against $\pi(x_{k+1}, 0.25 / 2)$ in the special case of orthonormal $\mathbb{X}_{\mathrm{base}}$. When $\tau = \frac{1}{4}$, $\pi(x_{k+1}, \tau / 2)$ is constrained to $\left[1/9, 1\right]$.
Figure 3: Change in IMI (red) and FMI (blue) from base to expanded model against $\mu_{\Lambda}$. IMI increases relative to the base model for $\mu_{\Lambda}$ smaller and decreases for $\mu_{\Lambda}$ larger, whereas FMI increases relative to base model for $\mu_{\Lambda}$ larger and decreases for $\mu_{\Lambda}$ smaller.
Figure 4: $R\left(f_{\mathrm{base}}, f\right)$ (red) and $\eta^*_{f}$ (blue) against $\mu_{\Lambda} = \mathbb{E} \mathrm{Var}([Y]_i\mid\Theta, \Lambda)$. For $\mu_{\Lambda}$ sufficiently small, $R\left(f_{\mathrm{base}}, f\right) > \eta_{f}$, and Theorem 1 implies a trade-off between the IMI and FMI. For larger $\mu_{\Lambda}$, $R\left(f_{\mathrm{base}}, f\right) < \eta_{f}$, and Theorem 1 does not apply.
Figure 5: Change in IMI (red) and FMI (blue) from base to expanded model against $\mu_{\Lambda}$. For $\mu_{\Lambda}$ large enough, both the IMI and FMI improve from base to expanded model.
...and 2 more figures

Theorems & Definitions (26)

Definition 1: Model Expansion
Definition 2: Identifiability Mutual Information
Definition 3: Posterior Sampling Divergence
Definition 4: Falsifiability Mutual Information
Lemma 1
proof
Definition 5: Expansion Ratio
Definition 6: Contraction Coefficient
Theorem 1: Identifiability - Falsifiability Trade-off
proof
...and 16 more
