Mixed Newton Method for Optimization in Complex Spaces

Nikita Yudin; Roland Hildebrand; Sergey Bakhurin; Alexander Degtyarev; Anna Lisachenko; Ilya Kuruzov; Andrei Semenov; Mohammad Alkousa

TL;DR

The recently introduced Mixed Newton Method is modified and applied to the minimization of real-valued functions of real variables by extending the functions to complex space, and it is shown that arbitrary regularizations preserve the favorable local convergence properties of the method.

Abstract

In this paper, we modify the recently introduced Mixed Newton Method, which was originally designed for minimizing real-valued functions of complex variables, and apply it to the minimization of real-valued functions of real variables by extending the functions to complex space. We show that arbitrary regularizations preserve the favorable local convergence properties of the method, and we construct a special type of regularization that prevents convergence to complex minima. We compare several variants of the method applied to training neural networks with real and complex parameters.

Paper Structure (10 sections, 2 theorems, 45 equations, 7 figures, 6 tables, 1 algorithm)

Introduction
Mixed Newton method with regularization
Minimization of real analytic functions
Numerical experiments
Conclusions
Appendix
Applications
Derivation of the iterations of LM-MNM and CMNM
Digital Pre-Distortion
Simulations for the Abalone task

Key Result

Theorem 1

The mixed derivative used in the MNM can be computed by the formula and is hence positive semi-definite. Let $\hat{z}$ be a critical point of the function $f$, i.e., $\frac{\partial f(\hat{z})}{\partial\bar{z}} = 0_n$. In the non-degenerate case (when the full Hessian $\frac{\partial^2f(\hat{z})}{\partial(z,\bar{z})^2}$ is invertible) the iterates behave for $z$ nea
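As a rough illustration of the mixed derivative mentioned in the theorem, the sketch below assumes an objective of the form $f(z) = \|g(z)\|^2$ with $g$ holomorphic. This form, the test map `g`, and the helper names `mixed_newton_step` and `run` are assumptions made for the example and are not taken from the text above. Under that assumption the mixed derivative is $J^H J$, where $J$ is the complex Jacobian of $g$, which is positive semi-definite by construction, and one unregularized step reads $z \leftarrow z - (J^H J)^{-1} J^H g(z)$.

```python
import numpy as np

def mixed_newton_step(g, jac, z):
    """One unregularized step z - (J^H J)^{-1} J^H g(z) for f(z) = ||g(z)||^2.

    Assumes g is holomorphic, so that df/d(conj z) = J^H g and the mixed
    derivative d^2 f / (d conj(z) dz) = J^H J (illustrative assumption).
    """
    r = g(z)
    J = jac(z)
    grad_conj = J.conj().T @ r      # derivative of f with respect to conj(z)
    mixed_hess = J.conj().T @ J     # Hermitian, positive semi-definite
    return z - np.linalg.solve(mixed_hess, grad_conj)

# Hypothetical toy map: the real restriction of f has no zero,
# while the complex extension vanishes at z = (i, i) and z = (-i, -i).
def g(z):
    return np.array([z[0] ** 2 + 1.0, z[1] - z[0]])

def jac(z):
    return np.array([[2.0 * z[0], 0.0], [-1.0, 1.0]], dtype=complex)

def run(z0, iters=20):
    z = np.asarray(z0, dtype=complex)
    for _ in range(iters):
        z = mixed_newton_step(g, jac, z)
    return z

z_final = run([0.5 + 0.5j, 0.0])            # start off the real axis
f_final = np.linalg.norm(g(z_final)) ** 2   # objective value at the result
```

In this toy problem the iterates started off the real axis approach the complex minimum $(i, i)$, whereas iterates started at a real point remain real. The regularization that the paper constructs to prevent convergence to complex minima is not reproduced here.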

Figures (7)

Figure 1: Learning curves for the tested optimization algorithms. CV-CNN model. Initial parameters in the complex plane.
Figure 2: Learning curves for the tested optimization algorithms. CV-CNN model. Initial parameters on the real axis.
Figure 3: Learning curves for the tested optimization algorithms. CV-CNN model. Initial parameters on the imaginary axis.
Figure 4: Learning curves for the tested optimization algorithms. RV-CNN model.
Figure 5: Comparison of learning curves for RV-CNN trained by LM-NM against CV-CNN trained by LM-MNM and CMNM.
...and 2 more figures

Theorems & Definitions (6)

Theorem 1
Theorem 2
Example 1
Example 2
Example 3
Remark 1
