Penalty decomposition derivative free method for the minimization of partially separable functions over a convex feasible set

Francesco Cecere, Matteo Lapucci, Davide Pucci, Marco Sciandrone

TL;DR

The paper addresses the problem of minimizing a finite-sum objective $f(x)=\sum_{j=1}^m f_j(x)$ over a convex set $X$ when first-order information is unavailable or only partially accessible. It introduces a penalty-decomposition framework that reformulates the problem with auxiliary variables and a penalty term $P_\tau$, coupled with a derivative-free line search to solve the resulting subproblems; a derivative-free alternating minimization (DFAM) handles the inner steps, with a stopping rule that guarantees compatibility with the outer penalty scheme. The authors prove global convergence to stationary points under mild coercivity and Lipschitz conditions, and tailor the approach to coordinate partially separable (CPS) structures to enable parallel computation. Computational experiments on CPS-structured problems show competitive performance against derivative-free coordinate descent and mesh-adaptive methods, with clear gains from parallelization, highlighting the method's suitability for large-scale, structured black-box optimization with partial gradient access.

Abstract

In this paper, we consider the problem of minimizing a smooth function, given as a finite sum of black-box functions, over a convex set. To exploit the structure of the problem, for instance when the terms of the objective function are partially separable, noisy, costly, or have only partially accessible first-order information, we propose a framework in which the penalty decomposition approach is combined with a derivative-free line-search-based method. Under standard assumptions, we state theoretical results showing that the proposed algorithm is well-defined and globally convergent to stationary points. The results of preliminary numerical experiments, performed on test problems with up to thousands of variables, show the validity of the proposed method compared with a standard derivative-free line-search algorithm. Moreover, the method is shown to be easily parallelizable and hence capable of taking advantage of parallel computation when available.
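
To make the notion of a derivative-free line search concrete, the following is a minimal Python sketch of a coordinate search with a sufficient-decrease test and step expansion. It is a generic textbook-style illustration, not the authors' algorithm: the function name `df_coordinate_search`, the parameters `alpha0`, `gamma`, `theta`, `delta`, `max_expand`, and the unconstrained setting are all assumptions made for this example.

```python
def df_coordinate_search(f, x0, alpha0=1.0, gamma=1e-6, theta=0.5,
                         delta=2.0, tol=1e-6, max_iter=10_000, max_expand=50):
    """Hypothetical sketch: minimize f using only function values.

    Each coordinate i keeps its own tentative step alphas[i]. A step along
    +e_i or -e_i is accepted when it gives the sufficient decrease
    f(trial) <= f(x) - gamma * step**2; otherwise the step is shrunk.
    """
    x = [float(v) for v in x0]
    n = len(x)
    alphas = [alpha0] * n
    fx = f(x)

    def moved(point, i, step):
        # Return a copy of `point` shifted by `step` along coordinate i.
        trial = list(point)
        trial[i] += step
        return trial

    for _outer in range(max_iter):
        if max(alphas) <= tol:
            break  # every step is tiny: stop (approximate stationarity)
        for i in range(n):
            a = alphas[i]
            for sign in (1.0, -1.0):
                f_trial = f(moved(x, i, sign * a))
                if f_trial > fx - gamma * a * a:
                    continue  # no sufficient decrease in this direction
                # Expansion: enlarge the step while sufficient decrease holds.
                for _k in range(max_expand):
                    f_next = f(moved(x, i, sign * delta * a))
                    if f_next > fx - gamma * (delta * a) ** 2:
                        break
                    a, f_trial = delta * a, f_next
                x, fx = moved(x, i, sign * a), f_trial
                alphas[i] = a
                break
            else:
                alphas[i] = theta * a  # both directions failed: shrink
    return x, fx


if __name__ == "__main__":
    # Toy separable quadratic with minimizer (1, -2).
    quad = lambda x: (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2
    print(df_coordinate_search(quad, [0.0, 0.0]))
```

Keeping a separate step size per coordinate lets the search adapt to differently scaled variables; the cap `max_expand` is only a safeguard for objectives that are unbounded below along a direction.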

Paper Structure

This paper contains 9 sections, 7 theorems, 22 equations, 3 figures, 2 tables, 3 algorithms.

Key Result

Proposition 2.1

For any $\tau>0$, the penalty function $P_\tau(x,y,z)$ is coercive, i.e., $P_\tau(x^t,y^t,z^t)\to+\infty$ if $\|(x^t,y^t,z^t)\|\to\infty$.
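
A standard consequence of coercivity, given here as a general fact and not as the paper's own argument: every sublevel set of $P_\tau$ is bounded. The level $c$ and the set name $\mathcal{L}_c$ are notation introduced only for this illustration.

$$
\mathcal{L}_c = \{(x,y,z) : P_\tau(x,y,z) \le c\} \quad \text{is bounded for every } c \in \mathbb{R}.
$$

Indeed, if some $\mathcal{L}_c$ contained a sequence with $\|(x^t,y^t,z^t)\|\to\infty$, coercivity would give $P_\tau(x^t,y^t,z^t)\to+\infty$, contradicting $P_\tau(x^t,y^t,z^t)\le c$. If $P_\tau$ is additionally continuous (an assumption here), each $\mathcal{L}_c$ is also closed, hence compact, so $P_\tau$ attains its minimum over any nonempty closed set.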

Figures (3)

  • Figure 1: Data profiles and performance profiles in the problems derived from CUTEst.
  • Figure 2: Performance profile of the wall-clock time required to reach the termination condition in the problems derived from CUTEst.
  • Figure 3: Evolution of the objective function value with respect to the total number of sub-function evaluations for the compared optimization algorithms in the two versions of the Lockwood problem considered.

Theorems & Definitions (9)

  • Remark 1
  • Proposition 2.1
  • Corollary 2.2
  • Proposition 2.3
  • Proposition 2.4
  • Remark 2
  • Proposition 3.1
  • Proposition 3.2
  • Proposition 3.3