Data-driven Optimal Cost Selection for Distributionally Robust Optimization

Jose Blanchet; Yang Kang; Fan Zhang; Karthyek Murthy

TL;DR

The paper introduces a data-driven DRO framework in which the transport-cost function is learned from data via metric learning to shape the distributional uncertainty set around the empirical distribution. By integrating Mahalanobis-type metrics (and nonlinear feature mappings) into the optimal-transport discrepancy, the authors connect the learned cost to adaptive regularization, yielding representations that resemble adaptive ridge or logistic penalties. They develop a dual, smoothing-based optimization approach to solve the resulting DRO problems and demonstrate improved generalization on real datasets compared to standard regularized methods. The work provides a principled link between metric learning and adaptive regularization within DRO, with implications for better focusing robustness on regions of practical relevance.
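To make the idea of a Mahalanobis-type transport cost concrete, here is a minimal sketch. It assumes the squared form (x - y)^T Λ (x - y) with a positive semidefinite weight matrix; the function name `mahalanobis_cost`, the parameter `lam`, and the example matrices are hypothetical and are not taken from the paper. In the paper's setting the weight matrix would be learned from data via metric learning; here it is fixed by hand purely for illustration.

```python
def mahalanobis_cost(x, y, lam):
    """Squared Mahalanobis-type cost (x - y)^T lam (x - y).

    x, y : lists of floats of equal length
    lam  : square list-of-lists, assumed positive semidefinite
           (hypothetical stand-in for a learned metric)
    """
    d = [xi - yi for xi, yi in zip(x, y)]
    n = len(d)
    return sum(d[i] * lam[i][j] * d[j] for i in range(n) for j in range(n))


# With the identity matrix the cost is the ordinary squared Euclidean distance.
identity = [[1.0, 0.0], [0.0, 1.0]]

# A hand-picked diagonal weighting: moving along the first coordinate is
# expensive, moving along the second is cheap.
weighted = [[4.0, 0.0], [0.0, 0.25]]

x, y = [1.0, 1.0], [0.0, 0.0]
cost_euclidean = mahalanobis_cost(x, y, identity)
cost_weighted = mahalanobis_cost(x, y, weighted)
```

Under such a cost, a perturbation along a cheaply weighted direction uses less of the transport budget, so the uncertainty set extends further in that direction. This is the sense in which the choice of cost shapes where robustness is concentrated.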

Abstract

Recently, Blanchet, Kang, and Murthy (2016) and Blanchet and Kang (2017) showed that several machine learning algorithms, such as square-root Lasso, Support Vector Machines, and regularized logistic regression, among many others, can be represented exactly as distributionally robust optimization (DRO) problems. The distributional uncertainty is defined as a neighborhood centered at the empirical distribution. We propose a methodology that learns such a neighborhood in a natural, data-driven way. We show rigorously that our framework encompasses adaptive regularization as a particular case. Moreover, we demonstrate empirically that our proposed methodology is able to improve upon a wide range of popular machine learning estimators.
