Reweighting Improves Conditional Risk Bounds

Yikai Zhang; Jiahe Lin; Fengpei Li; Songzhu Zheng; Anant Raj; Anderson Schneider; Yuriy Nevmyvaka

TL;DR

Under a general "balanceable" Bernstein condition, one can design a weighted ERM estimator that outperforms the standard ERM estimator in certain sub-regions; the improvement appears as a data-dependent constant term in the error bound.

Abstract

In this work, we study the weighted empirical risk minimization (weighted ERM) scheme, in which an additional data-dependent weight function is incorporated into the empirical risk being minimized. We show that under a general "balanceable" Bernstein condition, one can design a weighted ERM estimator that outperforms the standard ERM estimator in certain sub-regions, and that the improvement appears as a data-dependent constant term in the error bound. These sub-regions correspond to large-margin regions in classification settings and to low-variance regions in heteroscedastic regression settings. Our findings are supported by evidence from synthetic data experiments.
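As a rough illustration of the weighted ERM idea in the heteroscedastic regression setting, the sketch below fits a linear model by ordinary and by weighted least squares and reports the error of each fit on the low-variance region. Everything specific here is an assumption made for illustration and is not taken from the paper: the linear function class, the two-level noise profile `sigma`, the oracle inverse-variance weights, and the helper names `weighted_erm_linear` and `region_risk`.

```python
import numpy as np


def weighted_erm_linear(x, y, weights):
    """Minimize (1/n) * sum_i weights[i] * (a * x[i] + b - y[i])**2 over (a, b)."""
    X = np.column_stack([x, np.ones_like(x)])
    sw = np.sqrt(weights)  # scaling rows by sqrt(w) turns the problem into plain least squares
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return coef  # array [a, b]


def region_risk(coef, x, y_clean, mask):
    """Mean squared error against the noise-free target, restricted to a sub-region."""
    pred = coef[0] * x + coef[1]
    return float(np.mean((pred[mask] - y_clean[mask]) ** 2))


# Hypothetical data-generating process: small noise for x < 0, large noise otherwise.
rng = np.random.default_rng(0)
n = 500
x = rng.uniform(-1.0, 1.0, size=n)
sigma = np.where(x < 0, 0.1, 2.0)
y_clean = 2.0 * x + 1.0
y = y_clean + sigma * rng.normal(size=n)

erm = weighted_erm_linear(x, y, np.ones(n))        # uniform weights: standard ERM
werm = weighted_erm_linear(x, y, 1.0 / sigma**2)   # assumed oracle weights

low_var = x < 0
print("ERM risk on low-variance region:         ", region_risk(erm, x, y_clean, low_var))
print("Weighted ERM risk on low-variance region:", region_risk(werm, x, y_clean, low_var))
```

With uniform weights the routine reduces to standard ERM, so the two fits differ only through the weight function. In practice the weights would have to be estimated from data rather than taken from a known noise profile as they are in this sketch.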

Paper Structure (32 sections, 12 theorems, 124 equations, 2 figures, 1 table)

Introduction
Contribution
Related Work
Empirical risk minimization
Maximum likelihood estimation and weighted ERM
Reject option and selective risk
A Road Map
Notation
Problem setup
Main Results
Classification with/without margin condition
Selective inference in practice
Bounded heteroscedastic regression
General case
Synthetic Data Experiments
...and 17 more sections

Key Result

Theorem 4.1

Suppose that we have $\widehat{\omega}(\cdot) \in {\mathcal{W}}$ such that $\mathbb{E}_{{\boldsymbol x}}[(\widehat{\omega}({\boldsymbol x}) - \omega^*({\boldsymbol x}))^2] \leq \varepsilon$ holds. Let $S_n = \{(\boldsymbol x_i,y_i)\}_{i=1}^{n}$ be i.i.d. samples drawn according to the DGP described, provided that the sample size $n$ satisfies $n \gtrsim \frac{ d_{VC}({\mathcal{F}}) \log(\frac{1}{\

Figures (2)

Figure 1: Regression setting: underlying true data, estimates from ERM and weighted ERM, and the selective risk
Figure 2: Classification setting: underlying true data, estimates from ERM and weighted ERM, and the selective risk

Theorems & Definitions (30)

Definition 1: Empirical risk and the ERM estimator
Definition 2: Weighted empirical risk and the weighted ERM estimator
Theorem 4.1: Risk Bound for the case of Classification
Theorem 4.2
Remark 1: On the bounds established
Theorem 4.3: Risk bound for estimating $\omega^*$
Remark 2
Corollary 1
Theorem 4.4
Theorem 4.5
...and 20 more
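
For orientation, Definitions 1 and 2 can be read against the textbook form of the two objectives. The block below is a sketch of that standard form and not a quotation of the paper's definitions: the loss symbol $\ell$, the generic weight function $\omega$, and the names $\widehat{R}_n$, $\widehat{f}_{\mathrm{ERM}}$ and $\widehat{f}_{\omega}$ are assumed notation, while $\mathcal{F}$ and the sample $S_n = \{(\boldsymbol x_i, y_i)\}_{i=1}^{n}$ follow Theorem 4.1.

```latex
\begin{align*}
\widehat{R}_n(f) &= \frac{1}{n}\sum_{i=1}^{n} \ell\bigl(f(\boldsymbol x_i), y_i\bigr),
& \widehat{f}_{\mathrm{ERM}} &\in \arg\min_{f \in \mathcal{F}} \widehat{R}_n(f), \\
\widehat{R}_n^{\omega}(f) &= \frac{1}{n}\sum_{i=1}^{n} \omega(\boldsymbol x_i)\,\ell\bigl(f(\boldsymbol x_i), y_i\bigr),
& \widehat{f}_{\omega} &\in \arg\min_{f \in \mathcal{F}} \widehat{R}_n^{\omega}(f).
\end{align*}
```

Setting $\omega \equiv 1$ recovers the unweighted objective, so standard ERM is the special case of weighted ERM with a constant weight function.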
