Scalable Determination of Penalization Weights for Constrained Optimizations on Approximate Solvers

Edoardo Alessandroni, Sergi Ramos-Calderer, Michel Krispin, Fritz Schinkel, Stefan Walter, Martin Kliesch, Leandro Aolita, Ingo Roth

Abstract

Quadratic unconstrained binary optimization (QUBO) provides problem formulations for various computational problems that can be solved with dedicated QUBO solvers, which can be based on classical or quantum computation. A common approach to constrained combinatorial optimization problems is to enforce the constraints in the QUBO formulation by adding penalization terms. Penalization introduces an additional hyperparameter that significantly affects the solver's efficacy: the relative weight between the objective terms and the penalization terms. We develop a pre-computation strategy for determining penalization weights with provable guarantees for Gibbs solvers and polynomial complexity for broad problem classes. Experiments across diverse problems and solver architectures, including large-scale instances on Fujitsu's Digital Annealer, show robust performance and order-of-magnitude speedups over existing heuristics.
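The penalization mechanism described in the abstract can be made concrete on a toy instance. In the sketch below, the costs and the one-hot constraint are invented for illustration only; the penalized QUBO energy has the form $E(\mathbf{x}) = E^{(o)}(\mathbf{x}) + M \, E^{(p)}(\mathbf{x})$, and a too-small weight $M$ lets an infeasible string minimize the energy:

```python
from itertools import product

# Hypothetical toy instance: pick exactly one of three items, minimizing cost.
costs = [2.0, 5.0, 1.0]

def objective(x):            # E^(o)(x): linear cost term
    return sum(c * xi for c, xi in zip(costs, x))

def penalty(x):              # E^(p)(x): squared violation of the one-hot constraint
    return (sum(x) - 1) ** 2

def penalized_energy(x, M):  # QUBO energy E(x) = E^(o)(x) + M * E^(p)(x)
    return objective(x) + M * penalty(x)

def brute_force_minimum(M):
    return min(product([0, 1], repeat=3), key=lambda x: penalized_energy(x, M))

# With M too small, the all-zeros (infeasible) string has the lowest energy;
# a larger M makes the cheapest feasible string x = (0, 0, 1) optimal.
print(brute_force_minimum(0.5))   # -> (0, 0, 0), infeasible
print(brute_force_minimum(2.0))   # -> (0, 0, 1), feasible
```

This is exactly the trade-off the paper targets: $M$ must be large enough to make feasible strings dominate, but, as the experiments show, not so large that solution quality degrades.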

Paper Structure

This paper contains 18 sections, 4 theorems, 46 equations, 8 figures, and 1 algorithm.

Key Result

Theorem 2

In the limit $N_s \to \infty$ and for $v_\mathrm{cut} = \max_{\mathbf{x}} E^{(p)}(\mathbf{x})$, the following holds: if Algorithm \ref{alg:M} returns $M^\ast \neq \{\}$, then the QUBO with $M = M^\ast$ is an $\eta$-reformulation with guaranteed energy threshold $E_f$ for a Gibbs sampler at inverse temperature $\beta$.
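The Gibbs-sampler setting of Theorem 2 can be illustrated on a small example. The toy QUBO below (its costs and the choice $\beta = 1$ are invented placeholders, not from the paper) shows the key monotonicity: for a fixed inverse temperature, increasing $M$ suppresses the Gibbs weight of infeasible strings while leaving feasible weights unchanged, so the feasible probability mass $\eta_\mathrm{eff}$ grows with $M$:

```python
import math
from itertools import product

costs = [2.0, 5.0, 1.0]  # hypothetical one-hot selection problem
beta = 1.0               # inverse temperature of the (assumed) Gibbs sampler

def energy(x, M):
    obj = sum(c * xi for c, xi in zip(costs, x))
    pen = (sum(x) - 1) ** 2
    return obj + M * pen

def feasible_probability(M):
    """P(feasible) under the Gibbs distribution p(x) proportional to exp(-beta * E(x))."""
    weights = {x: math.exp(-beta * energy(x, M)) for x in product([0, 1], repeat=3)}
    Z = sum(weights.values())
    return sum(w for x, w in weights.items() if sum(x) == 1) / Z

# Increasing M strictly raises the feasible mass, so a target eta can be met
# by choosing M large enough at a given temperature.
probs = [feasible_probability(M) for M in (0.0, 1.0, 5.0, 20.0)]
assert all(a < b for a, b in zip(probs, probs[1:]))
```

This monotonicity is what makes a smallest admissible $M$ a well-posed target for the algorithm.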

Figures (8)

  • Figure 1: The big-$M$ problem for approximate solvers is to ensure, by the choice of a penalization weight $M$, that an approximate solver samples feasible solutions with probability at least $\eta$ when given a QUBO reformulation of a constrained optimization problem. Optionally, one can additionally enforce the solutions to be below a certain energy threshold $E_f$. We assume that the output distribution of a solver is qualitatively approximated by a Gibbs distribution at known inverse temperature, illustrated to the left in terms of the probability $p(e)$ of sampling a solution with energy $e$ conditioned on the solution being feasible or infeasible. The density of infeasible solutions is naturally grouped into families ('humps'), each characterized by $E^{(p)}(\mathbf{x})$ taking a certain value. Our method to determine $M$, summarized to the right, calculates (1.) a lower bound $B^{<}_{\mathcal{F}}$ and (2.) an upper bound $B^{>}_{\mathcal{F}}$ on the probabilities of sampling feasible points with objective below and exceeding $E_f$, respectively. Together with (3.) an upper bound $B_{\bar{\mathcal{F}}}$ on the probability of infeasible events, (4.) the penalization weight $M$ is determined as the unique root of a scalar function $g(M)$, which depends on the targeted success probability $\eta$. We argue that this method is efficient for large classes of problems, prove theoretical guarantees on its performance, and demonstrate its practical applicability numerically.
  • Figure 2: Proportion of feasible solutions observed $\eta_\mathrm{eff}$ (top) and mean objective energy $E^{(o)}$ of sampled feasible solutions (bottom) on the DA solver (version 3) for different benchmarked problems (from left to right: MNPP, TSP instances from the library TSPbenchmarks, and TSP with cities placed on a circle) with different values of $M$ and problem size. The gray areas lack mean energy points because only infeasible bitstrings were sampled for those values of $M$. In the top panels, we observe a phase transition from infeasible to feasible solutions as $M$ increases, indicating that selecting optimized values of $M$ is needed. However, in the bottom panels, we notice a degradation in the quality of the sampled bitstrings: the mean energy $E^{(o)}$ of the sampler's outputs increases for larger values of $M$, beyond a seemingly sweet spot located around the transition. For reference, the $M$ values suggested by the trivial choice in Eq. \ref{eq:M_l1} are several orders of magnitude larger than the shown scales: around $10^9$ for MNPP, $10^8$ for benchmark TSP, and $10^{10}$ for circle TSP. Such extreme overshooting implies that the mean energy sampled by the solver would lie far from the desired minimum, undermining the optimization's effectiveness.
  • Figure 3: Effective success probability $\eta_\mathrm{eff}$ of an ideal Gibbs sampler (top rows) and SA (bottom rows) for sampling feasible points, using a QUBO reformulation with penalization weight $M^\ast$ calculated by Alg. \ref{alg:M} for different constrained optimization problems (columns, see App. \ref{app:benchmark_problems} for details) and system sizes. For each solver, in the first row we only require feasibility ($E_f=\infty$), while for the second row we further require solutions with objective smaller than a finite, problem-dependent $E_f$. Different colors denote target success probabilities $\eta \in \{0.25, 0.5, 0.75\}$, with horizontal reference lines at these values. Marker shapes indicate sampler temperature $T=\beta^{-1}$. For SA, temperatures are obtained by rescaling Digital Annealer schedules as $T=\phi T_{\mathrm{DA}}$, with $\phi \in \{1, 10, 100\}$; for PO, schedules are approximated using instances of the same size from other benchmarks. Solid lines and markers show averages over $100$ (ideal Gibbs sampler) or $4$ (SA) instances, with shaded standard deviation; circle TSP and benchmark TSP only define a single instance per system size. Per instance, $10^3$ (ideal Gibbs) or $128$ (SA) samples are drawn; PO uses $10^5$. We generally observe that $\eta_\mathrm{eff}$ is larger than $\eta$, showing that Alg. \ref{alg:M} yields admissible $\eta$-reformulations. For finite $E_f$, some combinations of $T$, $\eta$, and $E_f$ make the target $\eta$ unattainable (see App. \ref{app:eta_exist}); in these cases, $\eta$ is reduced. Such instances are indicated in the optimality-focused MNPP panels (bottom left) by short horizontal bars marking the reduced target below the achieved $\eta_\mathrm{eff}$.
  • Figure 4: Effective success probability $\eta_\mathrm{eff}$ of the Fujitsu Digital Annealer (version 3) as a function of the system size, using penalization weights $M^\ast$ determined by Alg. \ref{alg:M}. The structure of the figure is identical to that of Fig. \ref{fig:eta_eff_GSA}; refer to its caption for details. The temperatures of the annealing process here have been automatically selected internally by the Digital Annealer. For MNPP, solid lines and markers are averages over $4$ instances, with shaded standard deviation. For TSP and circle TSP, only one instance was considered per system size. For each instance, $512$ solutions were sampled for all problems.
  • Figure 5: Multiplicative speedup compared to a binary search from the direct bound, computed as $\log_2( M_{\ell_1} / M^*)$, as a function of the system size for the different benchmarked problems. Here $M^*$ is the output of Alg. \ref{alg:M} and $M_{\ell_1}$ is a direct bound for $M$ (see Lemma \ref{lem:Ml1}). Colors indicate different target probabilities $\eta \in \{0.25, 0.5, 0.75\}$, and marker shapes different temperatures $T = \beta^{-1}$. For MNPP, random TSP, and PO, lines and markers are averages over $10$ instances and the standard deviation is shaded. Circle TSP defines one instance per system size. As shown in Fig. \ref{fig:bigM_prob}, directly using $M_{\ell_1}$ substantially degrades the solution quality. A binary search reducing $M_{\ell_1}$ to $M^\ast$ requires $\log_2( M_{\ell_1} / M^*)$ iterations with repeated calls to the QUBO solver. Thus, the advantage of using $M^*$ is a reduction in overhead proportional to this factor.
  • ...and 3 more figures
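Step (4.) of the method summarized in Figure 1 determines $M$ as the unique root of a scalar function $g(M)$. The sketch below illustrates only this root-finding step by bisection; the closed-form `infeasible_upper_bound` and the value of $\eta$ are invented stand-ins for the paper's actual problem-dependent bounds:

```python
import math

eta = 0.75   # target success probability (hypothetical value)
beta = 1.0   # inverse temperature (hypothetical value)

def infeasible_upper_bound(M):
    # Stand-in for the upper bound B on the probability of sampling an
    # infeasible point; a simple model decaying monotonically in M.
    return math.exp(-beta * M) / (1.0 + math.exp(-beta * M))

def g(M):
    # g crosses zero at the smallest M whose guaranteed feasible mass
    # reaches the target eta.
    return (1.0 - infeasible_upper_bound(M)) - eta

def bisect_root(f, lo, hi, tol=1e-9):
    assert f(lo) < 0 < f(hi)  # root must be bracketed
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if f(mid) < 0 else (lo, mid)
    return 0.5 * (lo + hi)

M_star = bisect_root(g, 0.0, 50.0)
```

Because $g$ is monotone in $M$ (larger penalization can only reduce the infeasible mass), bisection converges to the unique root; for this toy bound the root is $M^\ast = \ln 3 \approx 1.099$.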

Theorems & Definitions (9)

  • Definition 1
  • Theorem 2
  • Proof
  • Lemma 3
  • Proof
  • Theorem 4
  • Proof
  • Lemma 5
  • Proof