Data-driven Optimal Cost Selection for Distributionally Robust Optimization

Jose Blanchet, Yang Kang, Fan Zhang, Karthyek Murthy

TL;DR

The paper introduces a data-driven DRO framework in which the transport-cost function is learned from data via metric learning to shape the distributional uncertainty set around the empirical distribution. By integrating Mahalanobis-type metrics (and nonlinear feature mappings) into the optimal-transport discrepancy, the authors connect the learned cost to adaptive regularization, yielding representations that resemble adaptive ridge or logistic penalties. They develop a dual, smoothing-based optimization approach to solve the resulting DRO problems and demonstrate improved generalization on real datasets compared to standard regularized methods. The work provides a principled link between metric learning and adaptive regularization within DRO, with implications for better focusing robustness on regions of practical relevance.

Abstract

Recently, Blanchet, Kang, and Murthy (2016) and Blanchet and Kang (2017) showed that several machine learning algorithms, such as square-root Lasso, Support Vector Machines, and regularized logistic regression, among many others, can be represented exactly as distributionally robust optimization (DRO) problems. The distributional uncertainty is defined as a neighborhood centered at the empirical distribution. We propose a methodology that learns such a neighborhood in a natural, data-driven way. We show rigorously that our framework encompasses adaptive regularization as a particular case. Moreover, we demonstrate empirically that our proposed methodology is able to improve upon a wide range of popular machine learning estimators.

Paper Structure

This paper contains 14 sections, 3 theorems, 38 equations, 1 figure, 1 table, 1 algorithm.

Key Result

Theorem 1

Assume that $\Lambda \in \mathbb{R}^{d\times d}$ in (Cost_CA) is positive definite. Given the data set $\mathcal{D}_{n}$, we obtain the following representation. Moreover, if $Y\in \left\{ -1,+1\right\}$ in the context of adaptive regularized logistic regression, we obtain the following representation.
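
As a rough illustration of why positive definiteness of $\Lambda$ matters, the sketch below assumes the cost is the Mahalanobis-type form $c_{\Lambda}(x, x') = (x - x')^{T}\Lambda(x - x')$ described in the TL;DR. It checks numerically a standard duality fact: the largest value of $\beta^{T}\Delta$ over perturbations with $\Delta^{T}\Lambda\Delta \le 1$ equals $\sqrt{\beta^{T}\Lambda^{-1}\beta}$, which is the kind of $\Lambda^{-1}$-weighted norm an adaptive penalty would use. The function names and the example values of `Lam` and `beta` are hypothetical and do not come from the paper.

```python
import numpy as np

def mahalanobis_cost(x, x_prime, Lam):
    """Assumed cost c(x, x') = (x - x')^T Lam (x - x') for positive definite Lam."""
    diff = np.asarray(x, dtype=float) - np.asarray(x_prime, dtype=float)
    return float(diff @ Lam @ diff)

def dual_norm(beta, Lam):
    """sqrt(beta^T Lam^{-1} beta), computed with a linear solve instead of an inverse."""
    beta = np.asarray(beta, dtype=float)
    return float(np.sqrt(beta @ np.linalg.solve(Lam, beta)))

def worst_case_direction(beta, Lam):
    """Maximizer of beta^T delta subject to delta^T Lam delta <= 1."""
    beta = np.asarray(beta, dtype=float)
    v = np.linalg.solve(Lam, beta)      # v = Lam^{-1} beta
    return v / np.sqrt(beta @ v)        # rescale so the cost of delta is exactly 1

# Illustrative values only: perturbing the first coordinate is four times as costly.
Lam = np.diag([4.0, 1.0])
beta = np.array([1.0, 1.0])
delta = worst_case_direction(beta, Lam)

print("cost of delta:", mahalanobis_cost(delta, np.zeros(2), Lam))
print("beta . delta :", float(beta @ delta))
print("dual norm    :", dual_norm(beta, Lam))
```

In this example the coordinate that is expensive to perturb contributes less to the dual norm, so a penalty of this form would shrink the corresponding coefficient less. This is one way to read the link between a learned cost and adaptive regularization.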

Figures (1)

  • Figure 1: Stylized examples illustrating the need for data-driven cost function.

Theorems & Definitions (6)

  • Theorem 1: DRO Representation for Generalized Adaptive Regularization
  • Theorem 2
  • Lemma 1
  • Proof of Lemma 1
  • Proof of Theorem 1
  • Proof of Theorem 2