Optimal Model Selection for Conformalized Robust Optimization

Yajie Bao, Yang Hu, Haojie Ren, Peng Zhao, Changliang Zou

TL;DR

This work tackles model selection for Conformalized Robust Optimization (CRO) to improve downstream decision efficiency while preserving robustness. It introduces CROMS, a framework that integrates conformal prediction with empirical risk minimization to select the model minimizing the CRO decision loss, along with three variants, E-CROMS, F-CROMS, and J-CROMS, that trade off asymptotic optimality against finite-sample guarantees. It then extends CROMS to CROiMS for covariate-aware individualized decisions, backed by non-asymptotic and asymptotic guarantees. Empirical results on synthetic data and real medical datasets show substantial gains in decision efficiency and robustness over baseline approaches.

Abstract

In decision-making under uncertainty, Contextual Robust Optimization (CRO) provides reliability by minimizing the worst-case decision loss over a prediction set. While recent advances use conformal prediction to construct prediction sets for machine learning models, the downstream decisions critically depend on model selection. This paper introduces novel model selection frameworks for CRO that unify robustness control with decision risk minimization. We first propose Conformalized Robust Optimization with Model Selection (CROMS), a framework that selects the model to approximately minimize the averaged decision risk in CRO solutions. Given the target robustness level 1-α, we present a computationally efficient algorithm called E-CROMS, which achieves asymptotic robustness control and decision optimality. To correct the control bias in finite samples, we further develop two algorithms: F-CROMS, which ensures 1-α robustness but requires searching the label space; and J-CROMS, which offers lower computational cost while achieving 1-2α robustness. Furthermore, we extend the CROMS framework to the individualized setting, where model selection is performed by minimizing the conditional decision risk given the covariates of the test data. This framework advances conformal prediction methodology by enabling covariate-aware model selection. Numerical results demonstrate significant improvements in decision efficiency across diverse synthetic and real-world applications, outperforming baseline approaches.
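The pipeline described in the abstract (conformal prediction set, worst-case decision over that set, model chosen by averaged decision risk) can be illustrated with a minimal sketch. This is not the authors' E-CROMS algorithm. It assumes a one-dimensional label, interval prediction sets from absolute-residual scores, and a newsvendor-style loss. The loss, the candidate models, and all function names are hypothetical choices made for illustration only.

```python
import numpy as np

def conformal_quantile(scores, alpha):
    """Split-conformal quantile: the ceil((n+1)(1-alpha))-th smallest score."""
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        return np.inf  # too few calibration points for this level
    return np.sort(scores)[k - 1]

def newsvendor_loss(y, z, c_under=3.0, c_over=1.0):
    """Assumed decision loss: under-ordering costs more than over-ordering."""
    return c_under * np.maximum(y - z, 0.0) + c_over * np.maximum(z - y, 0.0)

def robust_decision(lo, hi, c_under=3.0, c_over=1.0):
    """Minimizer of the worst-case newsvendor loss over the interval [lo, hi].

    The worst case is attained at an endpoint, so the optimal z balances
    c_under * (hi - z) against c_over * (z - lo).
    """
    return (c_under * hi + c_over * lo) / (c_under + c_over)

def select_model(models, X_cal, y_cal, alpha):
    """Pick the candidate whose robust decisions have the smallest average loss."""
    best = None
    for name, predict in models.items():
        pred = predict(X_cal)
        q = conformal_quantile(np.abs(y_cal - pred), alpha)
        z = robust_decision(pred - q, pred + q)
        risk = float(np.mean(newsvendor_loss(y_cal, z)))
        if best is None or risk < best[1]:
            best = (name, risk, q)
    return best

# Synthetic calibration data and two hypothetical candidate models.
rng = np.random.default_rng(0)
X_cal = rng.uniform(-1, 1, size=500)
y_cal = 2.0 * X_cal + rng.normal(scale=0.3, size=500)
models = {
    "linear": lambda x: 2.0 * x,
    "constant": lambda x: np.zeros_like(x),
}
name, risk, q = select_model(models, X_cal, y_cal, alpha=0.1)
print(name, round(risk, 3), round(float(q), 3))
```

Because the same calibration data is used both to build the prediction sets and to choose the model, a selection step of this kind can bias the robustness level in finite samples, which is the issue the abstract says F-CROMS and J-CROMS are designed to correct.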

Paper Structure

This paper contains 89 sections, 36 theorems, 241 equations, 24 figures, 11 tables, and 5 algorithms.

Key Result

Theorem 2.1

Suppose the data $\{(X_i,Y_i)\}_{i=1}^{n+1}$ are i.i.d. Then E-CROMS satisfies ${\mathbb{P}}\{Y_{n+1} \in \widehat{{\mathcal{U}}}^{\mathrm{E}\text{-}\mathrm{CROMS}}(X_{n+1})\} \geq (1+n^{-1})(1-\alpha) - 2\mathfrak{R}_n({\mathcal{F}})$. Further, the decision of $\hat{z}^{\mathrm{E}\text{-}\mathrm{CROMS}}(X_
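To get a feel for the coverage inequality in Theorem 2.1, the lower bound $(1+n^{-1})(1-\alpha) - 2\mathfrak{R}_n({\mathcal{F}})$ can be evaluated numerically. The sketch below treats the complexity term $\mathfrak{R}_n({\mathcal{F}})$, which is not defined in this excerpt, as a given non-negative number. The value 0.01 is a hypothetical placeholder, not a figure from the paper.

```python
def ecroms_robustness_lower_bound(n, alpha, complexity):
    """Evaluate (1 + 1/n) * (1 - alpha) - 2 * complexity.

    `complexity` stands in for the term R_n(F) in Theorem 2.1; its value
    here is an assumption supplied by the caller.
    """
    return (1.0 + 1.0 / n) * (1.0 - alpha) - 2.0 * complexity

# Hypothetical inputs: n = 1000 calibration points, alpha = 0.1, R_n(F) = 0.01.
bound = ecroms_robustness_lower_bound(n=1000, alpha=0.1, complexity=0.01)
print(round(bound, 4))
```

With these inputs the guaranteed level falls slightly below the nominal 0.9, and the gap closes only as the complexity term shrinks, which matches the abstract's description of E-CROMS as achieving robustness control asymptotically.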

Figures (24)

  • Figure 1: Illustration of the grid-approximated F-CROMS with ${\mathcal{Y}} \subseteq {\mathbb{R}}^2$. The red dots in panel (a) are the grid points in $\widetilde{\mathcal{Y}}$, the purple points in panel (b) are the grid points in $\widetilde{{\mathcal{U}}}^{\mathrm{F}\text{-}\mathrm{CROMS}}(X_{n+1})$, and the green area is the output prediction set $\widehat{{\mathcal{U}}}^{\mathrm{GF}\text{-}\mathrm{CROMS}}(X_{n+1})$ in \ref{eq:GFCROMS_set}.
  • Figure 2: The evaluation metrics with confidence intervals under the classification task. The nominal level is $\alpha = 0.1$.
  • Figure 3: The average loss, worst-case conditional miscoverage, and worst-case conditional misrobustness when varying sample size $n$ in the classification task, where candidate models are trained on different covariates, $|\Lambda| = 3$ and $\alpha = 0.1$.
  • Figure 4: The group conditional losses when varying sample size $n$ in the classification task, where candidate models are trained on different covariates, $|\Lambda| = 3$ and $\alpha = 0.1$.
  • Figure 5: The average loss, marginal misrobustness, and worst-case conditional misrobustness on COVID-19 Radiography Database.
  • ...and 19 more figures

Theorems & Definitions (75)

  • Definition 1: Marginal robustness
  • Definition 2: Asymptotic optimality
  • Theorem 2.1
  • Theorem 2.2
  • Proposition 2.1
  • Remark 2.1
  • Theorem 3.1
  • Theorem 3.2
  • Theorem 3.3
  • Theorem 3.4
  • ...and 65 more