Mixed Newton Method for Optimization in Complex Spaces

Nikita Yudin, Roland Hildebrand, Sergey Bakhurin, Alexander Degtyarev, Anna Lisachenko, Ilya Kuruzov, Andrei Semenov, Mohammad Alkousa

TL;DR

The recently introduced Mixed Newton Method is modified and applied to the minimization of real-valued functions of real variables by extending the functions to complex space, and it is shown that arbitrary regularizations preserve the favorable local convergence properties of the method.

Abstract

In this paper, we modify the recently introduced Mixed Newton Method, which was originally designed for minimizing real-valued functions of complex variables, and apply it to the minimization of real-valued functions of real variables by extending the functions to complex space. We show that arbitrary regularizations preserve the favorable local convergence properties of the method, and we construct a special type of regularization that prevents convergence to complex minima. We compare several variants of the method on the training of neural networks with real and complex parameters.
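
To make the idea in the abstract concrete, here is a minimal sketch of one mixed-Newton-style iteration. It is not the authors' implementation. It assumes the objective has the form f(z) = ||g(z)||^2 with g holomorphic, so that the gradient with respect to the conjugate variable is J^H g and the mixed derivative is J^H J, where J is the holomorphic Jacobian of g. The names `mnm_step`, `g`, `jac`, and the damping parameter `reg` are hypothetical and chosen for illustration only; `reg` adds a simple multiple of the identity and merely stands in for the regularizations discussed in the paper.

```python
import numpy as np

def mnm_step(z, g, jac, reg=0.0):
    """One mixed-Newton-style step for f(z) = ||g(z)||^2, g holomorphic (assumed form)."""
    r = g(z)                               # residual vector g(z), shape (m,)
    J = jac(z)                             # holomorphic Jacobian dg/dz, shape (m, n)
    grad = J.conj().T @ r                  # df/d(conj z) = J^H g
    mixed = J.conj().T @ J                 # mixed derivative J^H J (Hermitian, PSD)
    mixed = mixed + reg * np.eye(z.size)   # hypothetical identity damping
    return z - np.linalg.solve(mixed, grad)

# Toy problem (an assumption for illustration): g(z) = z^2 + 1.
# On the real axis f(x) = (x^2 + 1)^2 >= 1, but the complex extension
# attains f = 0 at z = +i and z = -i.
g = lambda z: np.array([z[0] ** 2 + 1.0])
jac = lambda z: np.array([[2.0 * z[0]]])

z = np.array([0.5 + 0.5j])                 # start off the real axis
for _ in range(20):
    z = mnm_step(z, g, jac)

z_real = mnm_step(np.array([0.5 + 0.0j]), g, jac)  # start on the real axis
```

In this toy problem the iterates started in the upper half-plane approach the complex minimum z = i, which is the kind of complex minimum the abstract refers to. An iterate started on the real axis stays on the real axis here, because every quantity in the step is then real.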

Paper Structure

This paper contains 10 sections, 2 theorems, 45 equations, 7 figures, 6 tables, 1 algorithm.

Key Result

Theorem 1

The mixed derivative used in the MNM can be computed by the formula and is hence positive semi-definite. Let $\hat{z}$ be a critical point of the function $f$, i.e., $\frac{\partial f(\hat{z})}{\partial\bar{z}} = 0_n$. In the non-degenerate case (when the full Hessian $\frac{\partial^2f(\hat{z})}{\partial(z,\bar{z})^2}$ is invertible) the iterates behave for $z$ nea

Figures (7)

  • Figure 1: Learning curves for the tested optimization algorithms. CV-CNN model. Initial parameters in the complex plane.
  • Figure 2: Learning curves for the tested optimization algorithms. CV-CNN model. Initial parameters on the real axis.
  • Figure 3: Learning curves for the tested optimization algorithms. CV-CNN model. Initial parameters on the imaginary axis.
  • Figure 4: Learning curves for the tested optimization algorithms. RV-CNN model.
  • Figure 5: Comparison of learning curves for RV-CNN trained by LM-NM against CV-CNN trained by LM-MNM and CMNM.
  • ...and 2 more figures

Theorems & Definitions (6)

  • Theorem 1
  • Theorem 2
  • Example 1
  • Example 2
  • Example 3
  • Remark 1