
EM algorithms for optimization problems with polynomial objectives

Kensuke Asai, Jun-ya Gotoh

TL;DR

This paper expands the EM/MM viewpoint to general optimization by casting objective functions as $f(\bm{\theta}) = -\ln \mathbb{E}_{p_{\bm{x}}(\bm{\theta})}[G(\bm{X})]$ and deriving EM schemes that produce monotone improvements. For exponential-family distributions, it shows that EM updates can be interpreted as natural-gradient steps with fixed step sizes, enabling efficient optimization of polynomial objectives over polyhedral regions. The authors present three concrete examples—unconstrained quadratic, a polynomial over a rectangle, and a polynomial over a simplex—yielding explicit update rules and showing global convergence under suitable conditions. They further extend the framework to broader problems including polytopes, quadratic and linear programs, and discuss GEM variants, connecting information geometry insights with proximal/interior-point concepts. The work lays a foundation for a versatile, distribution-based approach to non-statistical optimization with polynomial objectives and polyhedral constraints.
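The monotone improvement mentioned above follows from the standard Jensen-type majorization behind EM. The display below is a sketch of that argument, not the paper's own statement: it assumes $G \ge 0$, writes $p_{\bm{x}}(\bm{x}\mid\bm{\theta})$ for the density of $p_{\bm{x}}(\bm{\theta})$, and introduces an auxiliary density $q$ (positive wherever $G\,p_{\bm{x}}$ is positive) and a surrogate $u$; these symbols are notation chosen here for illustration.

```latex
\begin{align*}
f(\bm{\theta})
  &= -\ln \mathbb{E}_{p_{\bm{x}}(\bm{\theta})}\bigl[G(\bm{X})\bigr]
   = -\ln \mathbb{E}_{q}\!\left[\frac{G(\bm{X})\,p_{\bm{x}}(\bm{X}\mid\bm{\theta})}{q(\bm{X})}\right] \\
  &\le -\,\mathbb{E}_{q}\!\left[\ln \frac{G(\bm{X})\,p_{\bm{x}}(\bm{X}\mid\bm{\theta})}{q(\bm{X})}\right]
   =: u(\bm{\theta}\mid q)
   \qquad \text{(Jensen's inequality)} .
\end{align*}
% E-step: q^{(t)}(x) \propto G(x)\, p_x(x \mid \theta^{(t)}) makes the bound tight at \theta^{(t)}.
% M-step: \theta^{(t+1)} \in \arg\min_{\theta} u(\theta \mid q^{(t)}).
\begin{equation*}
f(\bm{\theta}^{(t+1)}) \;\le\; u(\bm{\theta}^{(t+1)}\mid q^{(t)})
  \;\le\; u(\bm{\theta}^{(t)}\mid q^{(t)}) \;=\; f(\bm{\theta}^{(t)}) .
\end{equation*}
```

With the E-step choice of $q^{(t)}$ the ratio inside the logarithm is constant in $\bm{x}$ at $\bm{\theta} = \bm{\theta}^{(t)}$, which is why the bound is tight there and each iteration cannot increase $f$.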

Abstract

The EM (Expectation-Maximization) algorithm is regarded as an MM (Majorization-Minimization) algorithm for maximum likelihood estimation of statistical models. Expanding this view, this paper demonstrates that, by choosing an appropriate probability distribution, even a nonstatistical optimization problem can be cast as a negative log-likelihood-like minimization problem, which can be approached by an EM (or MM) algorithm. When a polynomial objective is optimized over a simple polyhedral feasible set and an exponential family distribution is employed, the EM algorithm reduces to natural gradient descent for the employed distribution with a constant step size. This is demonstrated through three examples. We establish, in a general form, the global convergence of specific cases with some exponential family distributions. When the feasible set is not sufficiently simple, an MM algorithm can nevertheless be described. When the objective is a convex quadratic function to be minimized and the constraints are polyhedral, global convergence can also be established based on existing results for an entropy-like proximal point algorithm.
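The reduction of EM to a natural gradient step can be checked numerically on a small toy problem. The sketch below is an illustration under stated assumptions and is not one of the paper's three examples: it takes $X \sim \mathrm{Binomial}(n, \theta)$, a hypothetical nonnegative weight vector `G`, and the objective $f(\theta) = -\ln \sum_k G(k)\binom{n}{k}\theta^k(1-\theta)^{n-k}$, which is the negative logarithm of a polynomial on $[0, 1]$. For this family the M-step is mean matching, and the Fisher information in $\theta$ is $n/(\theta(1-\theta))$, so the EM update should agree with $\theta - I(\theta)^{-1} f'(\theta)$ (step size 1) up to finite-difference error. All function names are chosen here for illustration.

```python
import math


def binom_pmf(n, theta):
    """Binomial(n, theta) probabilities for k = 0..n."""
    return [math.comb(n, k) * theta**k * (1.0 - theta) ** (n - k) for k in range(n + 1)]


def objective(G, theta):
    """f(theta) = -ln E[G(X)] with X ~ Binomial(n, theta), n = len(G) - 1."""
    n = len(G) - 1
    return -math.log(sum(g * p for g, p in zip(G, binom_pmf(n, theta))))


def em_step(G, theta):
    """E-step: weights q(k) proportional to G(k) p(k | theta). M-step: match the mean."""
    n = len(G) - 1
    w = [g * p for g, p in zip(G, binom_pmf(n, theta))]
    mean_q = sum(k * wk for k, wk in enumerate(w)) / sum(w)
    return mean_q / n


def natural_gradient_step(G, theta, h=1e-6):
    """theta - I(theta)^{-1} f'(theta), using a central difference for f'."""
    n = len(G) - 1
    grad = (objective(G, theta + h) - objective(G, theta - h)) / (2.0 * h)
    inverse_fisher = theta * (1.0 - theta) / n  # 1 / I(theta) for Binomial(n, theta)
    return theta - inverse_fisher * grad


# Hypothetical nonnegative weights G(0..4); any such vector with a positive entry works.
G = [1.0, 0.2, 0.5, 3.0, 2.0]
theta = 0.3
values = [objective(G, theta)]
max_gap = 0.0
for _ in range(50):
    ng_theta = natural_gradient_step(G, theta)
    theta = em_step(G, theta)
    max_gap = max(max_gap, abs(theta - ng_theta))  # EM vs. natural gradient update
    values.append(objective(G, theta))

print("final theta:", theta)
print("objective never increased:", all(b <= a + 1e-12 for a, b in zip(values, values[1:])))
print("largest EM / natural-gradient gap:", max_gap)
```

The agreement here is an exact identity for the binomial family, since $f'(\theta) = -n(\theta_{\text{EM}} - \theta)/(\theta(1-\theta))$; the finite-difference gradient is used only so that the comparison does not assume the identity it is checking.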

Paper Structure

This paper contains 39 sections, 16 theorems, 112 equations, 1 table, and 8 algorithms.

Key Result

Proposition 3.1

Suppose that any probability distribution $p_{\bm{x}}(\bm{\theta})$ of the exponential family eq:expdist is used for Algorithm alg:prototypeEM and let $\{\bm{\theta}^{(t)}\}_{t\geq 0}$ denote the sequence of solutions generated by the algorithm. Then we have the following statements. (a) The probabi... where ... (b) The E-step and M-step (lines 3 to 4) of Algorithm alg:prototypeEM can be merged into the ...

Theorems & Definitions (19)

  • Remark 2.1
  • Proposition 3.1
  • Corollary 3.2
  • Theorem 3.3
  • Remark 3.4
  • Proposition 4.1
  • Corollary 4.2
  • Corollary 4.3
  • Proposition 4.4
  • Corollary 4.5
  • ...and 9 more