Representative, Informative, and De-Amplifying: Requirements for Robust Bayesian Active Learning under Model Misspecification

Roubing Tang, Sabina J. Sloman, Samuel Kaski

TL;DR

This work provides a mathematical analysis of generalization error that reveals key contributors to generalization error in the presence of model misspecification and develops a new acquisition function that mitigates the effects of model misspecification by including terms for representativeness, informativeness, and de-amplification (R-IDeA).

Abstract

In many settings in science and industry, such as drug discovery and clinical trials, a central challenge is designing experiments under time and budget constraints. Bayesian Optimal Experimental Design (BOED) is a paradigm to pick maximally informative designs that has been increasingly applied to such problems. During training, BOED selects inputs according to a pre-determined acquisition criterion to target informativeness. During testing, the model learned during training encounters a naturally occurring distribution of test samples. This leads to an instance of covariate shift, where the train and test samples are drawn from different distributions (the training samples are not representative of the test distribution). Prior work has shown that in the presence of model misspecification, covariate shift amplifies generalization error. Our first contribution is to provide a mathematical analysis of generalization error that reveals key contributors to generalization error in the presence of model misspecification. We show that generalization error under misspecification is the result of, in addition to covariate shift, a phenomenon we term error (de-)amplification which has not been identified or studied in prior work. We then develop a new acquisition function that mitigates the effects of model misspecification by including terms for representativeness, informativeness, and de-amplification (R-IDeA). Our experimental results demonstrate that the proposed method performs better than methods that target either only informativeness, representativeness, or both.
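The covariate shift described in the abstract can be quantified with the maximum mean discrepancy (MMD) between the selected training inputs and samples from the test distribution, which is the shift measure the figure captions report. The following is a minimal sketch, not the paper's implementation: the RBF kernel, the bandwidth, the biased (V-statistic) estimator, the synthetic Gaussian data, and all function names are illustrative assumptions.

```python
import numpy as np


def rbf_kernel(a, b, bandwidth=1.0):
    """RBF kernel matrix between rows of a (n, d) and b (m, d). Bandwidth is an assumed value."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def mmd_squared(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of squared MMD between samples x and y."""
    k_xx = rbf_kernel(x, x, bandwidth).mean()
    k_yy = rbf_kernel(y, y, bandwidth).mean()
    k_xy = rbf_kernel(x, y, bandwidth).mean()
    return k_xx + k_yy - 2.0 * k_xy


# Hypothetical data: test inputs from N(0, 1); one training set drawn from the
# same distribution, one concentrated elsewhere (as a purely
# informativeness-driven criterion might produce).
rng = np.random.default_rng(0)
test_inputs = rng.normal(0.0, 1.0, size=(500, 1))
representative_train = rng.normal(0.0, 1.0, size=(200, 1))
shifted_train = rng.normal(2.0, 0.5, size=(200, 1))

mmd_representative = mmd_squared(representative_train, test_inputs)
mmd_shifted = mmd_squared(shifted_train, test_inputs)
print(f"MMD^2 (representative designs): {mmd_representative:.4f}")
print(f"MMD^2 (shifted designs):        {mmd_shifted:.4f}")
```

Under these assumptions, the shifted design set yields the larger MMD value, matching the captions' reading that higher MMD indicates a greater degree of covariate shift.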

Paper Structure

This paper contains 67 sections, 4 theorems, 75 equations, 15 figures.

Key Result

Proposition 1

The generalization error $R_{\mathrm{test}}$ (Definition 2) can be decomposed into the following

Figures (15)

  • Figure 2: Comparison of different design strategies (Random, BAD, proposed R-I, proposed R-IDeA, and R-IDeA-oracle, which uses the true $\bar{f}$ instead of the proxy $g$) under misspecified models in polynomial regression. Left: Generalization error across methods. Right: MMD distance across methods; higher values indicate a greater degree of covariate shift.
  • Figure 3: Comparison of baseline methods (Random, BAD) and our proposed R-I and R-IDeA in the source localization experiments. Top: Generalization error across methods. Bottom: MMD distance across methods; higher values indicate a greater degree of covariate shift.
  • Figure 4: Comparison of baseline methods (Random, BAD) and our proposed R-I and R-IDeA in the pharmacokinetic model experiments. Top: Generalization error across methods. Bottom: MMD distance across methods; higher values indicate a greater degree of covariate shift.
  • Figure 5: Comparison of different design strategies (Random, BAD, proposed R-I, proposed R-IDeA, and R-IDeA-oracle) under well-specified models in polynomial regression. Left: Generalization error across methods. Right: MMD distance across methods; higher values indicate a greater degree of covariate shift.
  • Figure 6: Comparison of baseline methods (Random, BAD) and our proposed R-I with varying $\lambda$ in the polynomial regression experiments. Left: Generalization error across methods. Right: MMD distance across methods; higher values indicate a greater degree of covariate shift.
  • ...and 10 more figures

Theorems & Definitions (11)

  • Definition 1: Model misspecification
  • Definition 2: Generalization error ($R_{\mathrm{test}}$)
  • Proposition 1: Generalization Error Decomposition
  • Definition 3: The degree of covariate shift
  • Definition 4: The degree of misspecification
  • Theorem 1: Generalization Error Bound under Covariate Shift with Amplification
  • Remark 1: Connection to Proposition 1
  • Remark 2
  • Remark 3
  • Theorem 2: Approximate de-amplifying region
  • ...and 1 more