Total robustness in Bayesian Nonlinear Regression

Mengqi Chen, Charita Dellaporta, Thomas B. Berrett, Theodoros Damoulas

Abstract

Modern regression analyses are often undermined by covariate measurement error, misspecification of the regression model, and misspecification of the measurement error distribution. We present, to the best of our knowledge, the first Bayesian nonparametric learning framework targeting total robustness to all three challenges in general nonlinear regression. Our framework places a joint Dirichlet process prior on the latent covariate--response distribution and updates it with posterior pseudo-samples of the latent covariates, so that inference is calibrated to the joint law. This yields estimators defined by minimizing the discrepancy between posterior realizations of the joint Dirichlet process and the model-implied joint distribution. We establish generalization bounds and provide a first proof of convergence and consistency of the resulting estimators under non-degenerate measurement error. A gradient-based implementation enables efficient computation; simulations and two real-data studies show improved stability to misspecification under increasing measurement error relative to recent Bayesian and frequentist alternatives.
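The estimator described in the abstract can be illustrated with a minimal sketch, assuming the discrepancy is a squared MMD with a Gaussian kernel and that a posterior realization of the Dirichlet process is approximated by Dirichlet weights on the observed atoms. The tanh model, the lengthscale, and the names `gaussian_kernel`, `weighted_mmd2`, and `simulate_model` are hypothetical choices made for illustration, not the authors' implementation.

```python
import numpy as np


def gaussian_kernel(a, b, lengthscale=1.0):
    """Gaussian kernel matrix between rows of a (n, d) and rows of b (m, d)."""
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * lengthscale**2))


def weighted_mmd2(atoms, weights, model_samples, lengthscale=1.0):
    """Biased (V-statistic) squared MMD between a weighted empirical measure
    on `atoms` and the uniform empirical measure on `model_samples`."""
    m = model_samples.shape[0]
    v = np.full(m, 1.0 / m)
    k_aa = gaussian_kernel(atoms, atoms, lengthscale)
    k_ab = gaussian_kernel(atoms, model_samples, lengthscale)
    k_bb = gaussian_kernel(model_samples, model_samples, lengthscale)
    return weights @ k_aa @ weights - 2.0 * weights @ k_ab @ v + v @ k_bb @ v


def simulate_model(theta, x, rng):
    """Hypothetical model-implied joint: y = tanh(theta * x) + small noise."""
    y = np.tanh(theta * x) + 0.1 * rng.normal(size=x.shape[0])
    return np.column_stack([x, y])


rng = np.random.default_rng(0)
n = 50

# Stand-ins for latent-covariate pseudo-samples and observed responses.
x_tilde = rng.normal(size=n)
y = np.tanh(x_tilde) + 0.1 * rng.normal(size=n)
atoms = np.column_stack([x_tilde, y])

# One approximate posterior realization: Dirichlet weights on the atoms.
weights = rng.dirichlet(np.ones(n))

# Evaluate the objective on a coarse grid of candidate parameters
# (a gradient-based optimizer would replace this grid in practice).
thetas = np.linspace(0.2, 2.0, 10)
losses = [weighted_mmd2(atoms, weights, simulate_model(t, x_tilde, rng)) for t in thetas]
theta_hat = thetas[int(np.argmin(losses))]
```

Repeating the weight draw and the minimization gives one `theta_hat` per realization; the collection of minimizers then serves as the nonparametric posterior sample in this simplified picture.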

Paper Structure

This paper contains 47 sections, 14 theorems, 293 equations, 8 figures, 8 tables, 1 algorithm.

Key Result

Theorem 1

Under Assumptions G1--G2:

Figures (8)

  • Figure 1: Graphical representation of the regression structure without ME and under the two canonical ME mechanisms considered in this paper. In classical ME, the observed covariate is a noisy measurement of the latent covariate; in Berkson ME, the latent covariate is a perturbation of the observed value.
  • Figure 2: Two DP constructions under Berkson ME. Dotted arrows denote prior-driven components; dashed arrows denote posterior-updated components. Panel (a) shows our framework, where the DP is built on the joint law and updated using pseudo-samples $\tilde{X}$ informed by $W$ and $Y$. Panel (b) shows the construction of dellaporta2023robust, where latent covariates are sampled from $W$ alone and then paired with $Y$, so the DP reference distribution breaks the covariate--response dependence.
  • Figure 3: Illustrative samples for the sigmoid model under ME and 10% Huber contamination.
  • Figure 4: RMSE comparison for the sigmoid model under misspecification. ME denotes $\sigma_N$. Blue: NPL--HMC; yellow: Robust--MEM (shortened as R--MEM); green: NLS; orange: HMC.
  • Figure 5: Estimated curves for the LIDAR data under a range of contamination ratios $r_Y$, with $K_{\text{bins}}=20$ in the Berkson construction. NLS (blue), SIMEX (green), NPL--HMC (red); 95% bands are shaded; the dashed line is the oracle fit based on latent $X$.
  • ...and 3 more figures

Theorems & Definitions (34)

  • Definition 1: MMD with characteristic kernel
  • Definition 2
  • Remark 1
  • Theorem 1: Generalization error bound
  • Remark 2
  • Remark 3
  • Theorem 2
  • Proposition 1: Classical ME: Marginal--$X$
  • Proposition 2: Berkson ME
  • Lemma 1
  • ...and 24 more