A practical, fast method for solving sum-of-squares problems for very large polynomials

Daniel Keren, Margarita Osadchy, Roi Poranne

Abstract

Sum of squares (SOS) optimization is a powerful technique for solving problems in which the positivity of a polynomial must be enforced. The common approach to solving an SOS problem is relaxation to a semidefinite program (SDP). The main advantage of this transformation is that an SDP is a convex problem for which efficient solvers are readily available. However, despite considerable progress in recent years, the standard approaches for solving SDPs are still known to scale poorly. Our goal is to devise an approach that can handle larger, more complex problems than is currently possible. The challenge lies in how SDPs are commonly solved: state-of-the-art approaches rely on the interior point method, which requires the factorization of large matrices. We instead propose an approach inspired by polynomial neural networks, which exhibit excellent performance when optimized using techniques from the deep learning toolbox. Somewhat counter-intuitively, we replace the convex SDP formulation with a non-convex, unconstrained, and over-parameterized formulation, and solve it using a first-order optimization method. It turns out that this approach can handle very large problems, with polynomials having over four million coefficients, well beyond the range of current SDP-based approaches. Furthermore, we highlight theoretical and practical results supporting the experimental success of our approach in avoiding spurious local minima, which makes it amenable to simple and fast solutions based on gradient descent. In all experiments, our approach converged to a correct global minimum on general (non-sparse) polynomials, with running time only slightly higher than linear in the number of polynomial coefficients, compared to higher than quadratic in the number of coefficients for SDP-based methods.
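
The reformulation described in the abstract can be illustrated on a toy problem. The sketch below is not the authors' network or code; it assumes a univariate polynomial, the monomial basis $z(x) = (1, x, \dots, x^d)$, and a plain factorization $Q = B^\top B$ of the Gram matrix, so that $Q$ is positive semidefinite by construction and no SDP constraint is needed. The names `gram_to_coeffs` and `fit_sos`, the step size, the iteration count, and the choice of $k$ rows for $B$ are all illustrative assumptions.

```python
import numpy as np

def gram_to_coeffs(Q):
    """Coefficients (lowest degree first) of z(x)^T Q z(x), z(x) = [1, x, ..., x^d]."""
    n = Q.shape[0]
    coeffs = np.zeros(2 * n - 1)
    for i in range(n):
        for j in range(n):
            coeffs[i + j] += Q[i, j]  # entry (i, j) multiplies x^(i+j)
    return coeffs

def fit_sos(target, k, steps=20000, lr=0.005, seed=0):
    """Gradient descent on B (k x n) so that z^T (B^T B) z matches `target`.

    Minimizes 0.5 * ||coeffs(B^T B) - target||^2 with no constraints;
    the step size and iteration count are illustrative, not tuned.
    """
    n = (len(target) + 1) // 2
    rng = np.random.default_rng(seed)
    B = 0.5 * rng.standard_normal((k, n))
    idx = np.add.outer(np.arange(n), np.arange(n))  # idx[i, j] = i + j
    for _ in range(steps):
        r = gram_to_coeffs(B.T @ B) - target  # coefficient residual
        G = r[idx]                            # gradient w.r.t. Q (a Hankel matrix)
        B -= lr * 2.0 * B @ G                 # chain rule through Q = B^T B
    return B

# Toy target: p(x) = (x^2 + 1)^2 + (x + 1)^2 = x^4 + 3x^2 + 2x + 2.
target = np.array([2.0, 2.0, 3.0, 0.0, 1.0])
B = fit_sos(target, k=6)  # k = 6 > 3 rows: an over-parameterized factor
Q = B.T @ B
residual = np.max(np.abs(gram_to_coeffs(Q) - target))
```

Each row $b_i$ of `B` defines one polynomial $b_i^\top z(x)$, and the squares of these polynomials sum to $z(x)^\top Q z(x)$, so a small `residual` indicates an approximate SOS decomposition of the target. Taking `k` larger than the basis size mirrors, in a very reduced form, the over-parameterization the abstract refers to.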

Paper Structure

This paper contains 12 sections, 15 equations, 5 figures.

Figures (5)

  • Figure 1: A schematic diagram of the network for generating quartic polynomials in $n$ variables. Note that the network is symbolic, i.e., its output is not a number but a symbolic polynomial. The network weights are encoded in the matrices $A$ ($m \times n$) and $B$ ($k \times nm$), where $m \geq n$.
  • Figure 2: Top left: log of the error (vertical axis) vs. iteration number (horizontal axis) for $B$ with rank 110 (the upper bound on the Pythagoras number, Section \ref{sec:over}). Top right: the same for $B$ with rank 465 (the number of second-degree monomials in 30 variables). Bottom: convergence for 100 variables. Note that the number of required iterations is of the same order of magnitude as for 30 variables.
  • Figure 3: Left: error vs. number of iterations with $A=I$ and one coefficient set to $10^4$. Right: same, for general $A$.
  • Figure 4: Top left: convergence in the case of many perturbed coefficients, $A=I$. Top right: same, with general $A$. Bottom: same, with a larger $A$ ($2n \times n$).
  • Figure 5: Convergence for a polynomial with many small random coefficients.

Theorems & Definitions (1)

  • Definition 1