
Brenier Isotonic Regression

Han Bao, Amirreza Eshraghi, Yutong Wang

Abstract

Isotonic regression (IR) is a shape-constrained regression method that keeps a univariate fitted curve non-decreasing, with numerous applications including single-index models and probability calibration. For multi-output regression, classical IR is no longer applicable because monotonicity does not readily extend to higher dimensions. We consider a novel multi-output regression problem in which the regression function is cyclically monotone. Roughly speaking, a cyclically monotone function is the gradient of some convex potential. Although enforcing cyclic monotonicity appears challenging, we leverage the fact that Kantorovich's optimal transport (OT) always yields a cyclically monotone coupling as an optimal solution. This perspective naturally lets us interpret the regression function as a link function in generalized linear models and the convex potential as Brenier's potential in OT; hence we call this IR extension Brenier isotonic regression. We present experiments on probability calibration and generalized linear models. In particular, IR robustly outperforms many well-known baselines in probability calibration.
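The claim that an optimal coupling is cyclically monotone can be checked numerically on a toy problem. The sketch below is not the authors' implementation: it assumes a squared Euclidean cost, small integer point clouds in the plane, and a brute-force search over permutations, and the helper names (`optimal_permutation`, `is_cyclically_monotone`) are hypothetical. It tests the standard condition $\sum_k \langle y_k, x_{k+1} - x_k \rangle \le 0$ over every cycle of matched pairs.

```python
import itertools
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def optimal_permutation(xs, ys):
    """Brute-force minimizer of sum_i ||x_i - y_sigma(i)||^2 (tiny n only)."""
    n = len(xs)
    return min(
        itertools.permutations(range(n)),
        key=lambda s: sum(sq_dist(xs[i], ys[s[i]]) for i in range(n)),
    )


def is_cyclically_monotone(pairs):
    """Check sum_k <y_k, x_{k+1} - x_k> <= 0 for every cycle of pairs."""
    n = len(pairs)
    for m in range(2, n + 1):
        for cyc in itertools.permutations(range(n), m):
            total = 0
            for k in range(m):
                x_k, y_k = pairs[cyc[k]]
                x_next, _ = pairs[cyc[(k + 1) % m]]
                total += dot(y_k, [a - b for a, b in zip(x_next, x_k)])
            if total > 0:
                return False
    return True


# Assumed toy data: integer coordinates keep all arithmetic exact.
random.seed(0)
n = 5
xs = [(random.randint(-9, 9), random.randint(-9, 9)) for _ in range(n)]
ys = [(random.randint(-9, 9), random.randint(-9, 9)) for _ in range(n)]

sigma = optimal_permutation(xs, ys)
pairs = [(xs[i], ys[sigma[i]]) for i in range(n)]
print("optimal matching is cyclically monotone:", is_cyclically_monotone(pairs))
```

Brute force is used only to keep the sketch dependency-free; it scales factorially and is meant for illustration, not for real data.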

Paper Structure

This paper contains 20 sections, 8 theorems, 47 equations, 8 figures, 7 tables, 1 algorithm.

Key Result

Proposition 1

There exists an optimal solution $\mathbf{P}^{\sigma^*}$ of the discrete Kantorovich problem that is a permutation matrix associated with an optimal permutation $\sigma^*\in\mathrm{Perm}(n)$ of the discrete Monge problem. Moreover, if …
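The first part of the proposition can be illustrated numerically. The following sketch is an illustration under stated assumptions, not the paper's code: it assumes uniform marginals $1/n$ on both point clouds and a squared Euclidean cost, solves the discrete Kantorovich linear program with `scipy.optimize.linprog`, and compares its optimal value with the best assignment found by `scipy.optimize.linear_sum_assignment`, which plays the role of the discrete Monge problem here.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment, linprog

rng = np.random.default_rng(0)
n = 6
x = rng.normal(size=(n, 2))  # assumed source points
y = rng.normal(size=(n, 2))  # assumed target points

# Squared Euclidean cost matrix C[i, j] = ||x_i - y_j||^2.
C = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)

# Kantorovich LP over P >= 0 (flattened row-major):
# every row and every column of P must sum to 1/n.
A_eq = np.zeros((2 * n, n * n))
for i in range(n):
    A_eq[i, i * n:(i + 1) * n] = 1.0  # i-th row sum
    A_eq[n + i, i::n] = 1.0           # i-th column sum
b_eq = np.full(2 * n, 1.0 / n)

res = linprog(C.ravel(), A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method="highs")
kantorovich_value = res.fun

# Monge side: best permutation, with each matched pair carrying mass 1/n.
rows, cols = linear_sum_assignment(C)
monge_value = C[rows, cols].sum() / n

print("Kantorovich LP value:", kantorovich_value)
print("Best permutation value:", monge_value)
```

The two values agree up to solver tolerance, consistent with a permutation matrix (scaled by $1/n$ under the normalization assumed here) attaining the Kantorovich optimum.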

Figures (8)

  • Figure 1: Comparison between IR and BIR. Observations are generated with $y=1/(1+\exp(-z))+\text{noise}$. The real line of BIR is computed via the Laguerre map over $\mathbb{R}$.
  • Figure 2: scipy implementation of BrenierIR. Full code at https://github.com/levelfour/Brenier_Isotonic_Regression.
  • Figure 3: From left to right, BrenierIR ($k=50$), Binning, and Matrix Scaling. We show their estimated calibration maps as vector fields (each left) and contour plots of their first coordinate (each right).
  • Figure 4: Each base model is recalibrated by BIR with different bin size $k$ (shown with the standard deviation).
  • Figure 5: Calibration maps of nonparametric recalibrators. The base model is an MLP and the balance-scale dataset is used. Each row corresponds to one recalibrator and shows, from left to right, the calibration map (vector field) and its first, second, and third coordinates.
  • ...and 3 more figures
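
For orientation, the univariate setting of Figure 1 can be reproduced in a few lines. This is a minimal sketch of classical IR only (not BIR), using the pool-adjacent-violators algorithm; the noise distribution (Gaussian, standard deviation 0.1), the sample size, and the input range are assumptions, since the caption specifies only $y=1/(1+\exp(-z))+\text{noise}$.

```python
import math
import random


def pava(y):
    """Pool-adjacent-violators: least-squares non-decreasing fit to y."""
    blocks = []  # each block stores [sum of values, number of values]
    for v in y:
        blocks.append([v, 1])
        # Merge backwards while the previous block mean exceeds the last one.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fit = []
    for s, c in blocks:
        fit.extend([s / c] * c)
    return fit


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Assumed data-generating setup; only the sigmoid-plus-noise form is from the caption.
random.seed(0)
zs = sorted(random.uniform(-4.0, 4.0) for _ in range(50))
ys = [sigmoid(z) + random.gauss(0.0, 0.1) for z in zs]

fit = pava(ys)  # fitted values, ordered like the sorted inputs zs
print("fit is non-decreasing:", all(a <= b for a, b in zip(fit, fit[1:])))
```

The fitted values are piecewise constant in the sorted inputs, which is the staircase shape typically seen for classical IR.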

Theorems & Definitions (13)

  • Proposition 1: Panaretos2020
  • Definition 1
  • Definition 2
  • Proposition 2: Rockafellar1966
  • Proposition 3: Villani2008
  • Proposition 4: Brenier1991
  • Theorem 1
  • Theorem 2: Cyclic monotonicity of $T_{\mathbf{P}^*}$
  • Theorem 3: Cyclic monotonicity of Laguerre map
  • Definition 3
  • ...and 3 more