A fast algorithm to minimize prediction loss of the optimal solution in inverse optimization problem of MILP

Akira Kitaoka

TL;DR

A projected subgradient method on the suboptimality loss, with a step size of $k^{-1/2}$, efficiently learns the weights of the objective function and solves inverse optimization problems of MILP using fewer than 1/7 of the MILP calls required by known methods.

Abstract

We consider the inverse optimization problem of estimating the weights of the objective function such that a given solution is optimal for a mixed integer linear program (MILP). For this problem, known methods converge inefficiently. Specifically, if $d$ denotes the dimension of the weights and $k$ the number of iterations, then the error of the weights is bounded by $O(k^{-1/(d-1)})$, so convergence slows as $d$ increases. We propose a projected subgradient method with a step size of $k^{-1/2}$ based on the suboptimality loss. We show theoretically and demonstrate experimentally that the proposed method learns the weights efficiently. In particular, we show that there exists a constant $\gamma > 0$ such that either the distance between the learned and true weights is bounded by $O\left(k^{-1/(1+\gamma)} \exp\left(-\frac{\gamma k^{1/2}}{2+\gamma}\right)\right)$ or the optimal solution is exactly recovered. Furthermore, experiments demonstrate that the proposed method solves inverse optimization problems of MILP using fewer than $1/7$ of the MILP calls required by known methods, and converges within a finite number of iterations.
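To make the update in the abstract concrete, here is a minimal sketch of a projected subgradient step with step size $k^{-1/2}$ on a suboptimality loss. It is an illustration under stated assumptions, not the paper's implementation: the weight set is assumed to be the probability simplex, the forward problem is assumed to be a maximization, there is a single observed solution, and the MILP solver is replaced by brute-force enumeration over a tiny feasible set. All function and variable names (`project_to_simplex`, `solve_forward`, `psgd`, `feasible`, `x_obs`) are hypothetical.

```python
import itertools
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto the probability simplex (sort-based)."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    idx = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / idx > 0)[0][-1]
    tau = css[rho] / (rho + 1)
    return np.maximum(v - tau, 0.0)

def solve_forward(phi, feasible):
    """Toy stand-in for one MILP call: maximize phi @ x over enumerated points."""
    return feasible[np.argmax(feasible @ phi)]

def psgd(feasible, x_obs, num_iters=200, tol=1e-12):
    """Projected subgradient descent on the suboptimality loss
    max_x phi @ x - phi @ x_obs, with step size k ** -0.5."""
    d = feasible.shape[1]
    phi = np.full(d, 1.0 / d)            # assumed start: center of the simplex
    for k in range(1, num_iters + 1):
        x_hat = solve_forward(phi, feasible)
        g = x_hat - x_obs                # a subgradient of the loss at phi
        if phi @ g <= tol:               # loss is zero: x_obs is optimal for phi
            break
        phi = project_to_simplex(phi - k ** -0.5 * g)
    return phi

# Hypothetical toy instance: binary x in {0,1}^3 with 2*x1 + x2 + x3 <= 2.
feasible = np.array([x for x in itertools.product([0, 1], repeat=3)
                     if 2 * x[0] + x[1] + x[2] <= 2], dtype=float)
x_obs = np.array([1.0, 0.0, 0.0])        # observed solution to be explained
phi_learned = psgd(feasible, x_obs)
```

Each iteration makes exactly one call to the forward solver, which is the quantity the abstract counts as MILP calls. The loop stops as soon as the observed solution is optimal for the current weights, which does not require recovering the true weights themselves.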

Paper Structure (34 sections, 16 theorems, 105 equations, 5 figures, 2 tables, 1 algorithm)

Introduction
Related work
Case where the objective function is strongly convex and $L$-smooth and the constraints are convex
Method to smooth the prediction loss of the optimal solution
Wasserstein inverse reinforcement learning
Preparation
Algorithm to minimize the suboptimality loss
Convergence analysis of the algorithm
Main theorems
Proof of Theorem \ref{theo:intention_learning_complete_Polyhedral}
Proof of Lemma \ref{lem:Psi_set_is_almost_Phi}
Proof of Lemma \ref{lemma:intention_learning_complete_Psi}
Proof of Lemma \ref{lem:SPO_bound}
Proof of Theorem \ref{theo:intention_learning_universal_complete_Polyhedral}
Experiment
...and 19 more sections

Key Result

Proposition 4.1

(Barmann-2018-online, Proposition 3.1; Kitaoka-2023-convergence-IRL) Assume that Assumption \ref{assu:WIRL} holds. Then we have the following:

Figures (5)

Figure 1: Behavior of $\ell_{\mathrm{pres}} (\phi_k (\mathcal{D})) +0.1$ in the worst case and LPs
Figure 2: Behavior of $\ell_{\mathrm{pres}} (\phi_k (\mathcal{D})) +0.1$ in the worst case and machine schedulings
Figure 3: Behavior of $\ell_{\mathrm{sub}} (\phi_k (\mathcal{D})) +0.001$ in the worst case and LPs
Figure 4: Behavior of $\ell_{\mathrm{sub}} (\phi_k (\mathcal{D})) +0.001$ in the worst case and machine schedulings
Figure : Minimization of suboptimality loss (PSGD) (Kitaoka-2023-convergence-IRL, Algorithm 1)

Theorems & Definitions (42)

Proposition 4.1
Remark 4.2
Proposition 5.1
Example 5.3
Remark 5.4
Theorem 5.5
Remark 5.6
Remark 5.7
Theorem 5.8
Remark 5.9
...and 32 more
