Distributionally Robust Safe Sample Elimination under Covariate Shift

Hiroyuki Hanada, Tatsuya Aoyama, Satoshi Akahane, Tomonari Tanaka, Yoshito Okura, Yu Inatsu, Noriaki Hashimoto, Shion Takeno, Taro Murayama, Hanju Lee, Shinya Kojima, Ichiro Takeuchi

TL;DR

The paper proposes DRSSS, a method that combines distributionally robust (DR) optimization with safe sample screening (SSS) to reduce storage and training costs. It focuses on covariate shift as the type of data distribution change.

Abstract

We consider a machine learning setup where one training dataset is used to train multiple models across slightly different data distributions. This occurs when customized models are needed for various deployment environments. To reduce storage and training costs, we propose the DRSSS method, which combines distributionally robust (DR) optimization and safe sample screening (SSS). The key benefit of this method is that models trained on the reduced dataset will perform the same as those trained on the full dataset for all possible different environments. In this paper, we focus on covariate shift as a type of data distribution change and demonstrate the effectiveness of our method through experiments.
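
As a rough illustration of the safe sample screening idea, the sketch below assumes a linear classifier trained with the hinge loss and a ball (hypothetical `center` and `radius`) that is known to contain the optimal parameter vector. The function name, the data, and the ball are assumptions made for illustration and are not taken from the paper. A sample whose margin stays above 1 for every parameter vector in the ball cannot be a support vector, so removing it does not change the trained model. In DRSSS such a guarantee has to hold for every admissible distribution change; this sketch only shows the per-sample test.

```python
import math

def safe_screen_hinge(X, y, center, radius):
    """Return a list of booleans: True where a sample can be safely removed.

    Assumption (not from the paper): the optimal parameter vector lies in the
    ball {beta : ||beta - center|| <= radius}. The smallest value of
    y_i * <x_i, beta> over that ball is y_i * <x_i, center> - radius * ||x_i||.
    If even this worst-case margin exceeds 1, the hinge loss of sample i is
    zero at the optimum, so the sample is not a support vector.
    """
    mask = []
    for x_i, y_i in zip(X, y):
        margin_at_center = y_i * sum(a * b for a, b in zip(x_i, center))
        norm_x = math.sqrt(sum(a * a for a in x_i))
        worst_case_margin = margin_at_center - radius * norm_x
        mask.append(worst_case_margin > 1.0)
    return mask

# Toy data (hypothetical): four 2-D samples with labels in {-1, +1}.
X = [[3.0, 0.0], [0.5, 0.0], [-3.0, 0.0], [0.0, 0.2]]
y = [1.0, 1.0, -1.0, 1.0]
center = [1.0, 0.0]   # assumed ball center
radius = 0.1          # assumed ball radius

mask = safe_screen_hinge(X, y, center, radius)
reduced_X = [x_i for x_i, drop in zip(X, mask) if not drop]
print(mask)
print(len(reduced_X), "samples kept out of", len(X))
```

A smaller ball screens more samples, which is why the tightness of the bound on the optimal solution matters for the achievable screening rate.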

Paper Structure

This paper contains 34 sections, 15 theorems, 41 equations, 4 figures, and 4 tables.

Key Result

Lemma 3.1

Suppose that $\rho$ in $P_{\bm w}$ (and also $P_{\bm w}$ itself) of eq:primal are $\kappa$-strongly convex. Then, for any $\hat{\bm\beta}\in\mathbb{R}^d$ and $\hat{\bm\alpha}\in\mathbb{R}^n$, we can assure that the following ${\cal B}^{*(\bm w)}\subset\mathbb{R}^d$ must satisfy $\bm\beta^{*(\bm w)}\

Figures (4)

  • Figure 1: Schematic illustration of the proposed DRSSS method on a toy binary classification problem with a sample-sparse classifier such as an SVM. (A) Suppose we have a binary classification dataset (classes indicated by color) and run DRSSS in the initial training phase. It then identifies a set of samples that do not influence the optimal solution in the deployment phase, for any change of the input distribution within a specified range (points drawn in pale colors). (B) Consequently, for any deployment-phase input distribution within that range (depicted by the differing point sizes), every sample screened out by DRSSS is guaranteed to have no influence; the set of samples screened in a specific deployment is always a superset of the set screened by DRSSS.
  • Figure 2:
  • Figure 4: An example of the expression ${\cal T}(\nu)$ (black solid line) in Lemmas \ref{thm:maximize-convex-quadratic} and \ref{lem:find-invsq}. Colored dashed lines denote the terms $(\xi_{e_k}/(\nu - \phi_{e_k}))^2$ of the summation. On each interval $(\phi_{e_k}, \phi_{e_{k+1}})$ ($k\in[N-1]$), the function is convex.
  • Figure 5: Safe sample screening rates for linear- and RBF-kernel SVMs, under the settings described in Section \ref{sec:experiment} and Appendix \ref{app:experimental-setup}.
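
The convexity noted in the Figure 4 caption can be checked numerically. The sketch below assumes, for illustration only, that ${\cal T}(\nu)$ is exactly the sum of the terms $(\xi_{e_k}/(\nu - \phi_{e_k}))^2$; the paper's expression may include further terms, and the values of `xi` and `phi` are made up. Each term has second derivative $6\xi_{e_k}^2/(\nu-\phi_{e_k})^4 \ge 0$ away from its pole, so the sum is convex between consecutive poles.

```python
def T(nu, xi, phi):
    """Sum of (xi_k / (nu - phi_k))**2; assumed form of the plotted expression."""
    return sum((x / (nu - p)) ** 2 for x, p in zip(xi, phi))

def looks_convex(xi, phi, lo, hi, steps=200, tol=1e-9):
    """Check that second finite differences of T are non-negative on a grid
    placed strictly inside the open interval (lo, hi)."""
    h = (hi - lo) / (steps + 1)
    grid = [lo + h * (k + 1) for k in range(steps)]
    vals = [T(v, xi, phi) for v in grid]
    return all(
        vals[k - 1] - 2.0 * vals[k] + vals[k + 1] >= -tol
        for k in range(1, steps - 1)
    )

# Hypothetical values: poles phi sorted in increasing order, weights xi.
phi = [0.0, 1.0, 2.5]
xi = [1.0, 0.5, 2.0]

# Test each interval between consecutive poles.
convex_flags = [looks_convex(xi, phi, phi[k], phi[k + 1]) for k in range(len(phi) - 1)]
print(convex_flags)
print(T(0.5, xi, phi))
```

A finite-difference check of this kind is only a sanity test on a grid, not a proof; the analytic argument is the sign of the second derivative given above.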

Theorems & Definitions (31)

  • Definition 2.1
  • Remark 2.2
  • Lemma 3.1
  • Lemma 3.2
  • Remark 3.3
  • Definition 3.4: DRSSS under covariate shift
  • Theorem 3.5
  • Theorem 4.1
  • Lemma A.1
  • proof
  • ...and 21 more