Generalization Bounds of Surrogate Policies for Combinatorial Optimization Problems

Pierre-Cyril Aubin-Frankowski; Yohann De Castro; Axel Parmentier; Alessandro Rudi

TL;DR

The paper studies the generalization of surrogate policies for combinatorial optimization. It smooths a linear-optimization surrogate with Gaussian perturbations, which makes the risk differentiable and enables gradient-based learning. It decomposes the excess risk into perturbation bias, estimation error, and optimization error, and introduces the Uniform Weak (UW) moment property to quantify how the statistical model interacts with the normal-cone structure of the feasible polytope. The theory shows that, under mild conditions and with positive regularization $\varepsilon_0>0$, the UW property holds and the excess risk can be controlled, with explicit bias-variance-approximation tradeoffs expressed as functions of the perturbation scale $\lambda$, the sample size $n$, and the optimization complexity $M$ (via Kernel-SoS). The framework applies to contextual stochastic optimization and to industrially relevant problems such as stochastic vehicle scheduling, where smoothing enables tractable training and controlled generalization while inference remains tractable through the linear oracle. Overall, the work provides non-asymptotic guarantees and principled guidance for choosing perturbation levels that balance training efficiency and generalization.
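
The three-term split mentioned above can be read through a standard telescoping identity. The block below is a schematic sketch and not the paper's exact statement: the symbols $R$ (unperturbed risk), $R_\lambda$ (perturbed risk), $R_{\lambda,n}$ (empirical perturbed risk on $n$ samples), $\hat{\bm{w}}$ (output of the optimizer) and ${\bm{w}}^\star$ (a minimizer of $R$ over the parameter set $\mathcal{W}$) are notation assumed here for illustration.

```latex
\begin{align*}
R(\hat{\bm{w}}) - R({\bm{w}}^\star)
  &= \underbrace{\bigl[R(\hat{\bm{w}}) - R_\lambda(\hat{\bm{w}})\bigr]
     + \bigl[R_\lambda({\bm{w}}^\star) - R({\bm{w}}^\star)\bigr]}_{\text{perturbation bias}} \\
  &\quad + \underbrace{\bigl[R_\lambda(\hat{\bm{w}}) - R_{\lambda,n}(\hat{\bm{w}})\bigr]
     + \bigl[R_{\lambda,n}({\bm{w}}^\star) - R_\lambda({\bm{w}}^\star)\bigr]}_{\text{estimation error}} \\
  &\quad + \underbrace{\bigl[R_{\lambda,n}(\hat{\bm{w}}) - R_{\lambda,n}({\bm{w}}^\star)\bigr]}_{\text{optimization error}},
\end{align*}
where the last bracket satisfies
\[
R_{\lambda,n}(\hat{\bm{w}}) - R_{\lambda,n}({\bm{w}}^\star)
  \le R_{\lambda,n}(\hat{\bm{w}}) - \inf_{{\bm{w}}\in\mathcal{W}} R_{\lambda,n}({\bm{w}}).
\]
```

Each bracket can then be bounded separately: the first pair depends on $\lambda$ only, the second pair on $n$ through a uniform deviation over $\mathcal{W}$, and the last term on how well the empirical perturbed risk is minimized.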

Abstract

A recent line of structured learning methods has advanced the practical state-of-the-art for combinatorial optimization problems with complex, application-specific objectives. These approaches learn policies that couple a statistical model with a tractable surrogate combinatorial optimization oracle, so as to exploit the distribution of problem instances instead of solving each instance independently. A core obstacle is that the empirical risk is then piecewise constant in the model parameters. This hinders gradient-based optimization, and few theoretical guarantees have been provided so far. We address this issue by analyzing smoothed (perturbed) policies: adding controlled random perturbations to the direction used by the linear oracle yields a differentiable surrogate risk and improves generalization. Our main contribution is a generalization bound that decomposes the excess risk into perturbation bias, statistical estimation error, and optimization error. The analysis hinges on a new Uniform Weak (UW) property capturing the geometric interaction between the statistical model and the normal fan of the feasible polytope; we show it holds under mild assumptions, and automatically when a minimal baseline perturbation is present. The framework covers, in particular, contextual stochastic optimization. We illustrate the scope of the results on applications such as stochastic vehicle scheduling, highlighting how smoothing enables both tractable training and controlled generalization.
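
To make the smoothing idea concrete, here is a minimal Python sketch of a Gaussian-perturbed linear oracle on a toy instance. It is an illustration under stated assumptions, not the authors' implementation: the vertex set `VERTICES`, the cost vector `COSTS`, the function names, and the argmax convention for the oracle are all hypothetical choices made for this example.

```python
import numpy as np

# Hypothetical toy instance: the feasible polytope is described by its vertices.
VERTICES = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
# Hypothetical true cost of each vertex (assumed values, not from the paper).
COSTS = np.array([0.0, 1.0, 3.0, 2.0])


def linear_oracle(theta):
    """Return the index of a vertex maximizing <theta, y> (assumed convention)."""
    return int(np.argmax(VERTICES @ theta))


def perturbed_probabilities(theta, lam, n_samples=20000, seed=0):
    """Monte Carlo estimate of P(oracle(theta + lam * Z) = y_i) with Z ~ N(0, I)."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal((n_samples, theta.shape[0]))
    scores = (theta + lam * noise) @ VERTICES.T  # shape (n_samples, n_vertices)
    winners = np.argmax(scores, axis=1)
    counts = np.bincount(winners, minlength=len(VERTICES))
    return counts / n_samples


def smoothed_cost(theta, lam, **kwargs):
    """Expected true cost of the perturbed policy at direction theta."""
    return float(perturbed_probabilities(theta, lam, **kwargs) @ COSTS)


theta = np.array([1.0, 0.2])
# The unperturbed policy does not change under a small move of theta ...
print(linear_oracle(theta), linear_oracle(theta + np.array([0.0, 0.05])))
# ... whereas the smoothed cost depends on theta and on the scale lam.
print(smoothed_cost(theta, lam=0.01), smoothed_cost(theta, lam=1.0))
```

With a very small `lam` the perturbed policy almost always returns the same vertex as the plain oracle, so the bias is small but the risk is nearly piecewise constant; a larger `lam` spreads probability mass over neighbouring vertices, which smooths the risk at the price of a larger perturbation bias.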

Paper Structure (36 sections, 16 theorems, 95 equations, 3 figures)

Introduction
Learning policies instead of minimizing separately
Discussion on the methodological choices
Why consider a learning problem?
Why not approximate $f^0$ directly?
Why consider a linear oracle in \ref{eq:ylinearProblem}?
Why not assume access to a training set?
Why focus on the risk on $\mathcal{H}_\mathcal{W}$ rather than on any measurable $h$?
Are there other frameworks for bounding the optimization error?
Related works
Relevant applications
Learning algorithms and generalization bounds
Contributions
Outline and notation
The surrogate policy model and its guarantees
...and 21 more sections

Key Result

Theorem 1

Under the conditions given in Section \ref{sec:conditions}, the following holds true. Let $\varepsilon_0\geq 0$ and $\lambda>0$ be such that $\lambda\geq \varepsilon_0$. Let $\tau\in(0,1)$. There exists a constant $C>0$ that depends only on $\varepsilon_0$, $\tau$ and $f^0$ such that for any ${\bm{w}}\in$ […] where $s> {d_\mathcal{W}}/2$ is some tuning parameter on the order of regularity of the admissible […]

Figures (3)

Figure 1: Illustration of the stochastic vehicle scheduling policy.
Figure 2: Surrogate policy encoded by the statistical model ${\psi}_{\bm{w}}\,:\,{\bm{x}}\in\mathcal{X} \mapsto \bm\theta\in\mathds{R}^{d({\bm{x}})}$ with combinatorial optimization (CO) layer given by a linear program over solutions ${\bm{y}}\in\mathcal{Y}({\bm{x}})$.
Figure 3: Normal cone at point ${\bm{y}}_1$ to the polytope (left) and normal fan with internal radius $\rho$ at point $\bm\theta$ (right).

Theorems & Definitions (46)

Example 1
Example 2
Example 3
Definition 1: Surrogate policy
Remark 2: Generic case and an abuse of notation
Definition 3: Law of the perturbation
Remark 4: On forthcoming Gaussian assumption
Remark 5: Link with the internal radius
Definition 6: Perturbed surrogate policy probabilities
Remark 7
...and 36 more
