Distributionally Robust Safe Screening

Hiroyuki Hanada, Satoshi Akahane, Tatsuya Aoyama, Tomonari Tanaka, Yoshito Okura, Yu Inatsu, Noriaki Hashimoto, Taro Murayama, Lee Hanju, Shinya Kojima, Ichiro Takeuchi

TL;DR

This work tackles the problem of identifying redundant samples and features in supervised learning under covariate shift with unknown test distributions. It introduces Distributionally Robust Safe Screening (DRSS), a framework that extends safe screening to weighted empirical risk minimization where weights may change within a predefined set, and provides tight guarantees via duality-gap bounds. The approach yields DRSS rules for both samples (DRSsS) and features (DRSfS), and develops concrete algorithms for typical ML setups (e.g., L1/L2-regularized SVMs) including a method to maximize convex quadratic forms over a hyperball. It also extends to deep learning by applying screening to the last layer, and validates the method through numerical experiments on synthetic and LIBSVM datasets as well as a DL example, demonstrating robust screening under weight perturbations. Overall, DRSS offers a practical tool to reduce computation and storage while maintaining performance under distributional uncertainty, with broad applicability from classical convex models to deep learning settings.

Abstract

In this study, we propose a method, Distributionally Robust Safe Screening (DRSS), for identifying unnecessary samples and features in a distributionally robust (DR) covariate-shift setting. The method combines DR learning, a paradigm for making models robust to variations in the data distribution, with safe screening (SS), a sparse optimization technique that identifies irrelevant samples and features before model training. The core idea is to reformulate the DR covariate-shift problem as a weighted empirical risk minimization problem in which the weights are uncertain within a predetermined range. By extending SS to accommodate this weight uncertainty, DRSS reliably identifies samples and features that are unnecessary under any future distribution within the specified range. We provide a theoretical guarantee for the DRSS method and validate its performance through numerical experiments on both synthetic and real-world datasets.
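
The weighted formulation described in the abstract can be made concrete with a small sketch. It assumes, for illustration only, a hinge loss with an L2 penalty and a weight set given by an L2 ball of radius `radius` around nominal weights `w_center`; the function names and these specific choices are hypothetical and are not taken from the paper's code.

```python
import numpy as np


def weighted_erm_objective(beta, X, y, w, lam):
    """Weighted, L2-regularized hinge-loss objective.

    The hinge loss and the (lam / 2) * ||beta||^2 penalty are illustrative
    assumptions; any convex loss and regularizer could take their place.
    """
    margins = y * (X @ beta)
    hinge = np.maximum(0.0, 1.0 - margins)
    return float(w @ hinge + 0.5 * lam * (beta @ beta))


def sample_weights_in_ball(w_center, radius, rng):
    """Draw nonnegative weights within L2 distance `radius` of `w_center`.

    Assumes `w_center` is itself nonnegative, so clipping at zero cannot
    move the sample farther from `w_center`.
    """
    n = w_center.shape[0]
    direction = rng.standard_normal(n)
    direction /= np.linalg.norm(direction)
    r = radius * rng.uniform() ** (1.0 / n)
    return np.clip(w_center + r * direction, 0.0, None)


rng = np.random.default_rng(0)
X = rng.standard_normal((5, 2))
y = np.array([1.0, -1.0, 1.0, 1.0, -1.0])
w_center = np.ones(5)            # equal weights on the training samples
beta = np.zeros(2)

# Evaluate the same model under several admissible weight vectors.
for _ in range(3):
    w = sample_weights_in_ball(w_center, radius=0.5, rng=rng)
    print(weighted_erm_objective(beta, X, y, w, lam=1.0))
```

Sampling only illustrates the uncertainty: a screening rule in the DRSS sense has to hold for every weight vector in the set, not just for the sampled ones.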

Paper Structure (34 sections, 14 theorems, 57 equations, 6 figures, 2 tables)


Key Result

Lemma 3.1

Suppose that $\rho$ in $P_{\bm w}$ (and also $P_{\bm w}$ itself) of eq:primal are $\kappa$-strongly convex. Then, for any $\hat{\bm\beta}\in\mathbb{R}^d$ and $\hat{\bm\alpha}\in\mathbb{R}^n$, we can assure $\bm\beta^{*(\bm w)}\in{\cal B}^{*(\bm w)}$ by taking

Figures (6)

  • Figure 1: Schematic illustration of the proposed Distributionally Robust Safe Screening (DRSS) method. Panel A displays the training samples, each assigned equal weight, as indicated by the uniform size of the points. Panel B depicts various unknown test distributions, highlighting how the significance of training samples varies with different realizations of the test distribution. Panel C shows the outcomes of safe sample screening (SsS) across multiple realizations of test distributions. Finally, Panel D presents the results of the proposed DRSS method, demonstrating its capability to identify redundant samples regardless of the observed test distribution.
  • Figure 2: An example of the expression ${\cal T}(\nu)$ (black solid line) in Lemmas \ref{lem:maximize-convex-quadratic} and \ref{lem:find-invsq}. Colored dashed lines denote the terms $(\xi_{e_k}/(\nu - \phi_{e_k}))^2$ in the summation. We can see that, on each interval $(\phi_{e_k}, \phi_{e_{k+1}})$ ($k\in[N-1]$), the function is convex.
  • Figure 3: Concept of how to apply SS for deep learning. SS is applied to the last layer for the final prediction.
  • Figure 4: Ratio of screened samples by DRSsS for dataset "sonar".
  • Figure 7: Ratios of screened samples by DRSsS.
  • ...and 1 more figure
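
The convexity noted in the Figure 2 caption can be checked numerically. The sketch below assumes that ${\cal T}(\nu)$ is exactly the sum $\sum_k (\xi_{e_k}/(\nu - \phi_{e_k}))^2$, with distinct poles sorted in increasing order; the values of `xi` and `phi` are made up, and the helper names are hypothetical.

```python
import numpy as np


def T(nu, xi, phi):
    """Sum over k of (xi_k / (nu - phi_k))**2, vectorized over nu."""
    nu = np.asarray(nu, dtype=float)
    return np.sum((xi / (nu[..., None] - phi)) ** 2, axis=-1)


def is_convex_on_grid(values, tol=1e-12):
    """Nonnegative second differences on a uniform grid indicate convexity."""
    second_diff = values[:-2] - 2.0 * values[1:-1] + values[2:]
    return bool(np.all(second_diff >= -tol))


xi = np.array([1.0, 2.0, 0.5])      # made-up numerators
phi = np.array([-1.0, 0.5, 2.0])    # made-up poles, sorted and distinct

# Check each open interval between consecutive poles, staying off the poles.
for lo, hi in zip(phi[:-1], phi[1:]):
    grid = np.linspace(lo + 0.05, hi - 0.05, 200)
    print((lo, hi), is_convex_on_grid(T(grid, xi, phi)))
```

The check agrees with a direct calculation: each term has second derivative $6\xi_{e_k}^2/(\nu - \phi_{e_k})^4 \ge 0$ away from its pole, so the sum is convex on any interval that contains no pole.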

Theorems & Definitions (30)

  • Definition 2.1
  • Remark 2.2
  • Lemma 3.1
  • Lemma 3.2
  • Lemma 3.3
  • Lemma 3.4
  • Definition 3.5: weight-changing safe screening (WCSS)
  • Definition 3.6: Distributionally robust safe screening (DRSS)
  • Theorem 3.7
  • Lemma 4.1
  • ...and 20 more