Structured Nonsmooth Optimization Using Functional Encoding and Branching Information

Fengqiao Luo

TL;DR

The paper tackles nonsmooth nonconvex optimization in which the nonsmoothness arises from explicit nonsmooth operators in the objective. It introduces a functional encoding of the active branches of these operators and a branch-informed gradient descent method (BIGD). By maintaining a record of discovered branches and solving a joint-gradient quadratic program, BIGD converges to Clarke-stationary points, and an enhanced variant (EBIGD) achieves local linear convergence under structural conditions. Numerical experiments on standard test problems show substantial reductions in function and gradient evaluations and in the complexity of the quadratic-programming subproblems, and the method outperforms several existing nonsmooth methods in higher dimensions. The approach offers a versatile way to inject branch-structure information into a wide range of nonsmooth optimization algorithms, with possible extensions to stochastic settings and other problem classes.

Abstract

We develop a novel gradient-based algorithm for optimizing nonsmooth nonconvex functions where nonsmoothness arises from explicit nonsmooth operators in the objective's analytical form. Our key innovation is to encode the active smooth branches of these operators, which enables both extraction of the branch function at arbitrary points and detection of branch transitions through branch tracking. This approach yields a Branch-Information-Driven Gradient Descent (BIGD) method for encodable piecewise-differentiable functions, with an enhanced version achieving local linear convergence under appropriate conditions. The encoding mechanism is computationally efficient and straightforward to implement. The benefit of using branch information is demonstrated through extensive numerical experiments against several existing nonsmooth optimization methods on standard test problems. Most importantly, for piecewise-smooth problems with given analytical expressions, functional encoding can be integrated into a wide range of existing nonsmooth optimization methods to improve bundle-point management, reduce the complexity of the quadratic programming subproblems, and improve the efficiency of line search.
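To make the encoding idea concrete, here is a minimal Python sketch. It is not the paper's implementation: the toy objective $f(x) = \max(x_0^2, x_1) + |x_0 - 1|$ and the helper names `evaluate_with_code`, `branch_function`, `branch_gradient` and `branch_changed` are assumptions chosen for illustration. The sketch records which smooth branch of each nonsmooth operator is active, rebuilds that branch function so it can be evaluated at any point, and detects a branch transition by comparing codes.

```python
from typing import Callable, List, Tuple

Vector = List[float]
Code = Tuple[int, int]  # (active branch of max, active branch of abs)


def evaluate_with_code(x: Vector) -> Tuple[float, Code]:
    """Evaluate the toy objective and record the active branch of each operator."""
    a, b = x[0] ** 2, x[1]
    max_branch = 0 if a >= b else 1      # 0: x0**2 is active, 1: x1 is active
    r = x[0] - 1.0
    abs_branch = 0 if r >= 0 else 1      # 0: +r is active, 1: -r is active
    value = (a if max_branch == 0 else b) + (r if abs_branch == 0 else -r)
    return value, (max_branch, abs_branch)


def branch_function(code: Code) -> Callable[[Vector], float]:
    """Return the smooth branch function selected by `code`, usable at any point."""
    max_branch, abs_branch = code

    def f_branch(x: Vector) -> float:
        first = x[0] ** 2 if max_branch == 0 else x[1]
        r = x[0] - 1.0
        return first + (r if abs_branch == 0 else -r)

    return f_branch


def branch_gradient(code: Code, x: Vector) -> Vector:
    """Gradient of the smooth branch function selected by `code` at x."""
    max_branch, abs_branch = code
    g_first = [2.0 * x[0], 0.0] if max_branch == 0 else [0.0, 1.0]
    sign = 1.0 if abs_branch == 0 else -1.0
    return [g_first[0] + sign, g_first[1]]


def branch_changed(x_old: Vector, x_new: Vector) -> bool:
    """Detect a branch transition by comparing the codes at two points."""
    return evaluate_with_code(x_old)[1] != evaluate_with_code(x_new)[1]
```

Because a code identifies a smooth function rather than a point, `branch_function(code)` can be evaluated away from the region where that branch is active. This is the property that extraction of the branch function at arbitrary points relies on.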

Paper Structure

This paper contains 9 sections, 12 theorems, 62 equations, 5 figures, 6 tables, 3 algorithms.

Introduction
Motivation
Review of related works
Encoding of piecewise-differentiable functions
A gradient descent method with branch information
An enhanced algorithm and analysis of the convergence rate
Numerical investigation
Conclusion
Auxiliary results

Key Result

Proposition 2.1

For a piecewise-differentiable function $f$, the Clarke differential has the following simplified representation:
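The formula itself is not reproduced in this summary. For orientation, the standard representation for piecewise-differentiable functions is given here in assumed notation, which may differ from the paper's: $f_i$ denotes the smooth selection (branch) functions of $f$, and $I_e(x)$ denotes the set of indices that are essentially active at $x$.

$$
\partial f(x) \;=\; \operatorname{conv}\bigl\{\, \nabla f_i(x) \;:\; i \in I_e(x) \,\bigr\}
$$

In words, the Clarke differential at $x$ is the convex hull of the gradients of the branches that are essentially active at $x$, so only finitely many gradients need to be tracked.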

Figures (5)

Figure 1: Nonsmooth operator branching scheme of the function in Example \ref{exp:branching}.
Figure 2: A rule-based piecewise-differentiable function that is encodable.
Figure 3: An illustration of Algorithm \ref{alg:trust-region-nonconvex}. Panel (a) shows four branches $\theta_1$, $\theta_2$, $\theta_3$ and $\theta_4$ near a local minimizer $O$. Their representative points are $z_{\theta_1}$, $z_{\theta_2}$, $z_{\theta_3}$ and $z_{\theta_4}$, respectively, which are visited by the algorithm. Suppose $x=z_{\theta_1}$ is the current point, and branches $\theta_1$ and $\theta_2$ are selected to compute the descent direction. After the line search along this direction (red arrow), the point moves from $z_{\theta_1}$ to a point in $\widetilde{\mathcal{D}}_{\theta_4}$. Panel (b) shows that the representative point of $\theta_4$ has been updated after the move.
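The step in which two branches are selected to compute a descent direction can be sketched for the simplest case. The code below is a generic minimum-norm construction, not the paper's joint-gradient quadratic program, which may involve more branches and additional terms; the function name `min_norm_direction` is hypothetical. For two branch gradients $g_1$ and $g_2$, it takes the negative of the shortest vector in their convex hull, which has a closed form.

```python
from typing import List

Vector = List[float]


def min_norm_direction(g1: Vector, g2: Vector) -> Vector:
    """Negative of the minimum-norm point on the segment between g1 and g2.

    Minimizes ||t*g1 + (1 - t)*g2||^2 over t in [0, 1]. Setting the
    derivative to zero gives t = -(g2 . (g1 - g2)) / ||g1 - g2||^2,
    which is then clipped to [0, 1].
    """
    diff = [a - b for a, b in zip(g1, g2)]
    denom = sum(d * d for d in diff)
    if denom == 0.0:
        t = 1.0  # identical gradients: any t gives the same point
    else:
        t = -sum(b * d for b, d in zip(g2, diff)) / denom
        t = min(1.0, max(0.0, t))
    return [-(t * a + (1.0 - t) * b) for a, b in zip(g1, g2)]
```

If the returned direction is the zero vector, the origin lies in the convex hull of the two gradients, which is the Clarke-stationarity condition restricted to these two branches.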
Figure 4: Objective value versus computational time for 8 problems with $n=200$ and random initial points
Figure 5: Number of effective branches and visited branches for 7 problems with $n=200$ and random initial points

Theorems & Definitions (33)

Definition 2.1
Definition 2.2: encodability
Definition 2.3: Clarke differential
Example 2.1
Example 2.2: haarala2004-mem-bundle-nonsmooth-opt
Example 2.3
Proposition 2.1
proof
Corollary 2.1
Proposition 3.1
...and 23 more
