Theory and applications of the Sum-Of-Squares technique
Francis Bach, Elisabetta Cornacchia, Luca Pesce, Giovanni Piccioli
TL;DR
This work surveys the Sum-of-Squares (SOS) framework, which turns nonconvex global optimization problems into tractable semidefinite programs (SDPs) by certifying nonnegativity through SOS representations. It extends SOS to infinite-dimensional feature spaces via reproducing-kernel methods (k-SOS) and the Representer Theorem, yielding practical relaxations that are expressed through kernel matrices on subsampled points and therefore scale with the sample size. The notes then connect SOS to information theory, showing how the log-partition function and KL-type divergences can be bounded and estimated using kernel-based moment matrices and SDP relaxations. Taken together, the framework provides principled operator- and kernel-based strategies for bounding and approximating hard problems in optimization and information theory, with provable guarantees on the resulting surrogates. These tools are relevant to domains that need tractable bounds on nonconvex objectives, including control, learning, and probabilistic inference, where kernelized SOS relaxations offer scalable, data-driven methods.
Abstract
The Sum-of-Squares (SOS) approximation method is a technique for deriving lower bounds on the optimal value of an objective function in optimization problems. By representing the gap between the objective function and a candidate lower bound as a sum of squares in a feature space, the SOS method transforms non-convex global optimization problems into solvable semidefinite programs. This note presents an overview of the SOS method. We start with its application in finite-dimensional feature spaces and then extend it to infinite-dimensional feature spaces using reproducing kernels (k-SOS). Additionally, we highlight the use of SOS for estimating some relevant quantities in information theory, including the log-partition function.
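
To make the lower-bound idea concrete, the sketch below checks a hand-built SOS certificate rather than solving a semidefinite program. It relies on the standard fact that if f(x) - c = phi(x)^T A phi(x) for all x, with A positive semidefinite, then c is a lower bound on f. The polynomial f, the monomial feature map phi, the vector v, and all function names are assumptions chosen for illustration; none of them come from the note itself.

```python
# Minimal sketch: verify an SOS certificate for a toy polynomial.
# Assumed for illustration (not from the source): f, phi, c, and v.

def f(x):
    # Toy nonconvex objective with global minimum 1 at x^2 = 2.
    return x**4 - 4 * x**2 + 5


def phi(x):
    # Finite-dimensional feature map: monomials up to degree 2.
    return [1.0, x, x**2]


# Candidate lower bound c and matrix A = v v^T.
# A rank-one outer product is positive semidefinite by construction.
c = 1.0
v = [-2.0, 0.0, 1.0]
A = [[vi * vj for vj in v] for vi in v]


def quad_form(matrix, u):
    # Computes u^T matrix u with plain Python loops.
    n = len(u)
    return sum(u[i] * matrix[i][j] * u[j] for i in range(n) for j in range(n))


def max_certificate_error(points):
    # Largest gap between f(x) - c and phi(x)^T A phi(x) over the points.
    return max(abs(f(x) - c - quad_form(A, phi(x))) for x in points)


# Both sides are degree-4 polynomials, so agreement on more than
# four distinct points implies they are identical.
grid = [i / 10 for i in range(-30, 31)]
err = max_certificate_error(grid)
grid_min = min(f(x) for x in grid)
```

Here the certificate amounts to f(x) - 1 = (x^2 - 2)^2, so c = 1 is a valid lower bound, and in this toy case it is tight. In the SOS method proper, c and A are not guessed by hand: one maximizes c over positive semidefinite matrices A subject to the same identity, which is a semidefinite program.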
