Optimal Regularization Under Uncertainty: Distributional Robustness and Convexity Constraints

Oscar Leong, Eliza O'Reilly, Yong Sheng Soh

TL;DR

The paper develops a distributionally robust optimization (DRO) framework for regularizers, modeling the regularizer as the gauge $\|\cdot\|_K$ of a star body $K$ normalized so that $\mathrm{vol}(K)=1$, and studying $\min_K\{\max_{d_W(Q,P)\leq \epsilon} \mathbb{E}_Q[\|\mathbf{x}\|_K]\}$. It derives a convex-duality reformulation that eliminates the inner maximization, analyzes how the robustness parameter $\epsilon$ and the Wasserstein cost shape the regularizer (including a Lipschitz penalty $\epsilon\,\mathrm{Lip}(K)$ in the Wasserstein-1 case), and proves existence of minimizers for $\epsilon>0$. The authors also address enforcing convexity of the optimal regularizer, providing finite-dimensional convex programs in $\mathbb{R}^2$ and several numerical examples that connect distributional shifts to regularizer geometry. Extensions to critic-based regularizers and alternative proofs illustrate the framework's flexibility. Overall, the work offers both theoretical foundations and practical computational tools for designing regularizers that are reliable under model uncertainty and convexity constraints.
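To make the objective concrete, here is a minimal Python sketch of the nominal (non-robust, $\epsilon = 0$) quantity $\mathbb{E}_P[\|\mathbf{x}\|_K]$. It relies on the standard identity $\|\mathbf{x}\|_K = \|\mathbf{x}\|_2 / \rho_K(\mathbf{x}/\|\mathbf{x}\|_2)$, where $\rho_K$ is the radial function of $K$. The helper names (`gauge`, `l1_ball_radial`, `expected_gauge`), the choice of a scaled $\ell_1$ ball for $K$, and the sample points are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def gauge(x, radial):
    """Gauge ||x||_K = ||x||_2 / rho_K(x / ||x||_2) for a star body K
    described by its radial function `radial` (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    norm = np.linalg.norm(x)
    if norm == 0.0:
        return 0.0
    return norm / radial(x / norm)

def l1_ball_radial(u, scale=1.0):
    """Radial function of scale * (unit l1 ball), evaluated at a unit vector u."""
    return scale / np.sum(np.abs(u))

def expected_gauge(samples, radial):
    """Empirical estimate of E_P[||x||_K] over the rows of `samples`."""
    return float(np.mean([gauge(x, radial) for x in samples]))

# The unit l1 ball in R^2 has area 2, so scaling by 1/sqrt(2) gives vol(K) = 1.
scale = 1.0 / np.sqrt(2.0)
radial = lambda u: l1_ball_radial(u, scale)

# Illustrative data: points on the signed standard basis vectors of R^2.
samples = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
objective = expected_gauge(samples, radial)
```

For this choice of $K$ the gauge is $\sqrt{2}$ times the $\ell_1$ norm, so each sample contributes $\sqrt{2}$ to the average. The robust problem replaces this average by a worst case over distributions $Q$ within Wasserstein distance $\epsilon$ of $P$.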

Abstract

Regularization is a central tool for addressing ill-posedness in inverse problems and statistical estimation, with the choice of a suitable penalty often determining the reliability and interpretability of downstream solutions. While recent work has characterized optimal regularizers for well-specified data distributions, practical deployments are often complicated by distributional uncertainty and the need to enforce structural constraints such as convexity. In this paper, we introduce a framework for distributionally robust optimal regularization, which identifies regularizers that remain effective under perturbations of the data distribution. Our approach leverages convex duality to reformulate the underlying distributionally robust optimization problem, eliminating the inner maximization and yielding formulations that are amenable to numerical computation. We show how the resulting robust regularizers interpolate between memorization of the training distribution and uniform priors, providing insights into their behavior as robustness parameters vary. For example, we show how certain ambiguity sets, such as those based on the Wasserstein-1 distance, naturally induce regularity in the optimal regularizer by promoting regularizers with smaller Lipschitz constants. We further investigate the setting where regularizers are required to be convex, formulating a convex program for their computation and illustrating their stability with respect to distributional shifts. Taken together, our results provide both theoretical and computational foundations for designing regularizers that are reliable under model uncertainty and structurally constrained for robust deployment.
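A short worked bound illustrates why a Wasserstein-1 ambiguity set favors regularizers with small Lipschitz constants. This is a sketch based on standard Kantorovich-Rubinstein duality, under the assumption that the gauge $\|\cdot\|_K$ is Lipschitz with constant $\mathrm{Lip}(K)$; the paper's own statement, and its conditions for equality, may differ. For any $Q$ with $W_1(Q,P)\leq\epsilon$,

$$
\mathbb{E}_Q[\|\mathbf{x}\|_K] - \mathbb{E}_P[\|\mathbf{x}\|_K] \;\leq\; \mathrm{Lip}(K)\, W_1(Q,P) \;\leq\; \epsilon\,\mathrm{Lip}(K),
$$

and therefore

$$
\max_{W_1(Q,P)\leq \epsilon} \mathbb{E}_Q[\|\mathbf{x}\|_K] \;\leq\; \mathbb{E}_P[\|\mathbf{x}\|_K] + \epsilon\,\mathrm{Lip}(K).
$$

Minimizing the right-hand side over $K$ trades fit to $P$ against the Lipschitz constant of the gauge, with $\epsilon$ setting the weight of the penalty.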

Paper Structure

This paper contains 35 sections, 19 theorems, 118 equations, 6 figures.

Key Result

Theorem 2.1

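For star bodies $K,L \in \mathcal{S}^d$, we have the dual mixed volume inequality $\tilde{V}_{-1}(K,L)^d \geq \mathrm{vol}(K)^{d+1}\,\mathrm{vol}(L)^{-1}$, and equality holds if and only if $K$ and $L$ are dilates, i.e., there exists an $\alpha > 0$ such that $K = \alpha L$.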

Figures (6)

  • Figure 1: Distributionally robust optimal regularizer for data supported on standard basis vectors. The choice of $\epsilon$, from left to right, is $0.01$, $0.1$, $0.2$, $0.3$, with the cost given by the absolute distance.
  • Figure 2: Distributionally robust optimal regularizer for data supported on standard basis vectors. The choice of $\epsilon$, from left to right, is $0.01$, $0.1$, $1.0$, $10$, and the cost function is the $\ell_2^2$ distance.
  • Figure 3: Illustration of how convexity for planar sets is enforced: The sum of the areas of the sectors in the red triangles in left sub-figure should exceed the area of the sector in the middle sub-figure. Right sub-figure: When the inequality is violated, the resulting set is no longer convex.
  • Figure 4: Convex planar set expressed via a union of triangle sectors. The volume of this set is expressed as the sum of the areas of each sector.
  • Figure 5: Optimal Convex Regularizers. The underlying data distribution is given in the top row and the (level set of the) corresponding optimal convex regularizer is specified in the bottom row.
  • ...and 1 more figure
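The triangle-sector picture in Figures 3 and 4 can be sketched numerically. The following Python example assumes a planar star body discretized by radii `r[i]` at $n$ equally spaced angles, so that the set is a polygon made of triangles with a common apex at the origin. The discretization, the function names, and the tolerance are illustrative assumptions and are not the paper's formulation. Convexity at vertex $i$ is checked by comparing the two adjacent triangle areas against the area of the triangle spanned by the neighboring vertices $i-1$ and $i+1$.

```python
import numpy as np

def sector_areas(r):
    """Areas of the triangles (O, p_i, p_{i+1}) for radii r at equally spaced angles."""
    r = np.asarray(r, dtype=float)
    delta = 2.0 * np.pi / len(r)
    return 0.5 * r * np.roll(r, -1) * np.sin(delta)

def polygon_area(r):
    """Area of the star-shaped polygon as the sum of its triangle sectors."""
    return float(np.sum(sector_areas(r)))

def is_convex(r, tol=1e-12):
    """Check, at every vertex, that the two adjacent sectors together cover
    at least the area of the triangle spanned by the two neighboring vertices."""
    r = np.asarray(r, dtype=float)
    n = len(r)
    if n < 5:
        raise ValueError("need at least 5 directions so that 2 * delta < pi")
    delta = 2.0 * np.pi / n
    prev, nxt = np.roll(r, 1), np.roll(r, -1)          # r[i-1] and r[i+1]
    two_small = 0.5 * np.sin(delta) * (prev * r + r * nxt)
    one_big = 0.5 * np.sin(2.0 * delta) * prev * nxt
    return bool(np.all(two_small >= one_big - tol))

def normalize_area(r):
    """Rescale the radii so that the polygon has unit area."""
    r = np.asarray(r, dtype=float)
    return r / np.sqrt(polygon_area(r))

octagon = np.ones(8)          # regular octagon: convex
dented = np.ones(8)
dented[0] = 0.5               # pulling one vertex inward breaks convexity
unit_octagon = normalize_area(octagon)
```

Dividing the per-vertex inequality by $r_{i-1} r_i r_{i+1}$ shows that it is linear in the reciprocals $1/r_i$, which is one way a finite-dimensional convexity constraint of this kind can be handled; whether the paper parameterizes its program this way is not stated in this summary.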

Theorems & Definitions (37)

  • Theorem 2.1: Special Case of Theorem 2 in lutwak1975dual
  • Theorem 3.1: Theorem 3 in leong2025optimal
  • Theorem 3.2
  • proof : Proof of Theorem 3.2
  • Proposition 3.3
  • proof : Proof of Proposition 3.3
  • Proposition 3.4
  • proof : Proof of Proposition 3.4
  • Remark
  • Proposition 3.5
  • ...and 27 more