Universal Representation of Generalized Convex Functions and their Gradients

Moeen Nehzati

TL;DR

The paper develops a differentiable parameterization, with a convex parameter space, for generalized convex functions (GCFs) and their gradients, and proves universal approximation properties for both functions and gradients under mild regularity conditions. It provides a neural-network-inspired interpretation via finitely $Y$-convex constructions with max-aggregation, along with smoothing via log-sum-exp to ensure differentiability. The framework converts bilevel or min-max problems in optimal transport with general costs and in multi-item auctions into single-level problems solvable via first-order methods. Experiments on optimal transport and auction design validate the approach and demonstrate practical performance, and an open-source implementation is released for reproducibility. This work lays a foundation for structure-aware, gradient-friendly representations of GCFs, with potential for deeper finitely convex architectures in economics and optimization.

Abstract

A wide range of optimization problems can be written in terms of generalized convex functions (GCFs). When this structure is present, certain nested bilevel objectives can be converted into single-level problems amenable to standard first-order optimization methods. We provide a new differentiable layer with a convex parameter space and show (Theorems 5.1 and 5.2) that it and its gradient are universal approximators for GCFs and their gradients. We demonstrate how this parameterization can be leveraged in practice by (i) learning optimal transport maps with general cost functions and (ii) learning optimal auctions of multiple goods. In both cases, we show how our layer can be used to convert the existing bilevel or min-max formulations into single-level problems that can be solved efficiently with first-order methods.
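
As a rough illustration of the kind of layer described above, the following sketch evaluates a finitely $Y$-convex function by max-aggregation and its log-sum-exp smoothing. It is not the paper's released implementation. The bilinear kernel `phi`, the intercept vector `b`, the temperature `tau`, and all function names are assumptions chosen for illustration; the sign convention $\max_i \Phi(x, y^i) - b_i$ is likewise assumed and may differ from the paper's.

```python
import math

def phi(x, y):
    """Kernel Phi(x, y). Assumption: the bilinear surplus x . y, which
    recovers ordinary convex (max-affine) functions."""
    return sum(xi * yi for xi, yi in zip(x, y))

def grad_x_phi(x, y):
    """Gradient of the bilinear kernel with respect to x, which is y."""
    return list(y)

def hard_value(x, ys, b):
    """Max-aggregation over a finite set of points: max_i Phi(x, y_i) - b_i."""
    return max(phi(x, y) - bi for y, bi in zip(ys, b))

def soft_value_and_grad(x, ys, b, tau=0.1):
    """Log-sum-exp smoothing with temperature tau, and its gradient in x.

    The gradient is a softmax-weighted average of the kernel gradients.
    """
    scores = [(phi(x, y) - bi) / tau for y, bi in zip(ys, b)]
    m = max(scores)                      # shift for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    value = tau * (m + math.log(z))
    weights = [e / z for e in exps]
    grad = [0.0] * len(x)
    for w, y in zip(weights, ys):
        g = grad_x_phi(x, y)
        for k in range(len(x)):
            grad[k] += w * g[k]
    return value, grad

# Hypothetical example: three points in the plane with intercepts b.
ys = [(1.0, 0.0), (0.0, 1.0), (-1.0, -1.0)]
b = [0.0, 0.0, 0.5]
x = (0.3, 0.8)
print(hard_value(x, ys, b))
print(soft_value_and_grad(x, ys, b, tau=0.1))
```

With $m$ points, the smoothed value lies between the max and the max plus $\tau \log m$, so a small `tau` tracks the max closely while keeping the function differentiable in both `x` and `b`.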

Paper Structure

This paper contains 31 sections, 7 theorems, 37 equations, 3 figures, 2 tables.

Key Result

Proposition 5.1

Given any $\epsilon > 0$, there is a finite $\tilde{Y} \subseteq Y$ such that for any $Y$-convex function $f \in \mathcal{C}^Y(X)$, there exists $g \in \mathcal{C}^{\tilde{Y}}(X)$ such that

Figures (3)

  • Figure 1: Comparison between finitely convex functions and neural networks: The left side shows a shallow neural network with an $n$-dimensional input layer, a single $m$-dimensional hidden layer, and a one-dimensional output layer. The right side shows a $\tilde{Y}$-convex function $r^{\tilde{Y}}$ with $\tilde{Y} = \{y^1, y^2, \dots, y^m\}$.
  • Figure 2: Visualization of the transport maps. The left plot shows the samples from the marginals. The middle and right plots show the underlying marginals and $T(X)$, where $T$ is the optimal learned transport map for the costs $\|x-y\|_2^2$ and $-\|x-y\|_2^2$, respectively. The black lines connect $X$ and $T(X)$. As expected, the lines in the middle plot are short, while the lines in the right plot are as long as possible.
  • Figure 3: Auction learned for the two-item case for setting (A). The allocation is neither pure bundling nor separate selling. Similar to SJa, it prices combinations of items and exhibits bunching.
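
The contrast described in the Figure 2 caption (short lines for $\|x-y\|_2^2$, long lines for $-\|x-y\|_2^2$) can be reproduced in a one-dimensional toy setting. This sketch is not the paper's learned map: it assumes equal-size empirical samples on the real line, where the optimal matching for the squared-distance cost pairs points in sorted order and the optimal matching for its negative pairs them in opposite order. The function names and sample values are hypothetical, and a brute-force search over permutations is included only as a check.

```python
from itertools import permutations

def match_sorted(xs, ys, reverse=False):
    """Pair sorted xs with sorted ys; reverse=True pairs them in opposite order."""
    return list(zip(sorted(xs), sorted(ys, reverse=reverse)))

def total_cost(pairs, sign=1.0):
    """Total cost sign * (x - y)^2 summed over matched pairs."""
    return sum(sign * (x - y) ** 2 for x, y in pairs)

def brute_force_min(xs, ys, sign=1.0):
    """Minimum total cost over all one-to-one matchings (small inputs only)."""
    return min(
        sum(sign * (x - y) ** 2 for x, y in zip(xs, perm))
        for perm in permutations(ys)
    )

# Hypothetical samples from the two marginals.
xs = [0.1, 0.9, 0.4]
ys = [1.0, 0.2, 0.5]

monotone = match_sorted(xs, ys)                  # cost  (x - y)^2: short lines
antitone = match_sorted(xs, ys, reverse=True)    # cost -(x - y)^2: long lines

print(total_cost(monotone, sign=1.0), brute_force_min(xs, ys, sign=1.0))
print(total_cost(antitone, sign=-1.0), brute_force_min(xs, ys, sign=-1.0))
```

In each printed pair, the sorted matching attains the brute-force minimum, which mirrors the behavior of the middle and right panels.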

Theorems & Definitions (10)

  • Proposition 5.1: Uniform approximation of $Y$-convex functions
  • Proposition 5.2
  • Theorem 5.1: Density of finitely $Y$-convex functions
  • Definition 5.1: Semiconvexity
  • Proposition 5.3: Stability of gradients under semiconvex convergence
  • Proposition 5.4: Preservation of semiconvexity under $\tilde{Y}$-transform
  • Theorem 5.2: Universal approximation for gradients
  • Theorem 5.3: Smooth Approximation
  • Remark 5.1
  • Remark A.1