Regularized second-order optimization of tensor-network Born machines

Matan Ben-Dov, Jing Chen

TL;DR

This paper presents an improved second-order optimization technique for training tensor-network Born machines (TNBMs) that significantly enhances convergence rates and the quality of the optimized model. The method applies a modified Newton's method on the manifold of normalized states and regularizes the loss landscape to mitigate local minima.

Abstract

Tensor-network Born machines (TNBMs) are quantum-inspired generative models for learning data distributions. Using tensor-network contraction and optimization techniques, the model learns an efficient representation of the target distribution, capable of capturing complex correlations with a compact parameterization. Despite their promise, the optimization of TNBMs presents several challenges. A key bottleneck of TNBMs is the logarithmic nature of the loss function commonly used for this problem. The single-tensor logarithmic optimization problem cannot be solved analytically, necessitating an iterative approach that slows down convergence and increases the risk of getting trapped in one of many non-optimal local minima. In this paper, we present an improved second-order optimization technique for TNBM training, which significantly enhances convergence rates and the quality of the optimized model. Our method employs a modified Newton's method on the manifold of normalized states, incorporating regularization of the loss landscape to mitigate local minima issues. We demonstrate the effectiveness of our approach by training a one-dimensional matrix product state (MPS) on both discrete and continuous datasets, showcasing its advantages in terms of stability and efficiency, and demonstrating its potential as a robust and scalable approach for optimizing quantum-inspired generative models.

Paper Structure

This paper contains 14 sections, 25 equations, 5 figures, 1 table.

Figures (5)

  • Figure 1: Illustration of the matrix product state (MPS) and its application to TNBMs. Panel (a) shows the probability of a sample, represented as the square of the overlap between the binary bit-string of the sample and the MPS. The example illustrates a 5-site MPS. Panel (b) depicts the contraction of the MPS into left-environment ($L$) and right-environment ($R$) tensors. These environment tensors, along with the bitstring sample, are further contracted to form the full environment tensor, as shown in panel (c). The full environment tensor plays a crucial role in the formulation of the gradient with respect to a single tensor ($T_2$).
  • Figure 2: An illustration of the effect of regularization on the NLL loss landscapes and its relation to sample overlaps. In panel (a), the overlaps between the MPS and different samples are plotted as a function of step size along a specified direction. For small step sizes, the overlap shifts approximately linearly. In panel (b), the NLL loss landscapes are shown for varying regularization constants, plotted against the same step sizes. The loss curves are shifted vertically to align their values at the zero point for clarity. Without regularization ($\epsilon = 0$), the loss landscape exhibits barriers due to vanishing overlaps, creating challenges for optimization. The introduction of regularization smooths the landscape by introducing a cutoff, dissolving these barriers and enabling the optimizer to converge to the global minimum.
  • Figure 3: An illustration of the regularization as an imaginary shift in the complex plane. In panel (a), the negative logarithm function is extended to the complex plane, accounting for overlaps with complex phases. Introducing a constant imaginary shift to the overlap avoids the singularity. Panel (b) plots five cross sections for different imaginary shifts, showing how the shift smooths out the singularity at the zero-overlap point.
  • Figure 4: A comparison of loss curves for training a TNBM on the bars and stripes (BAS) and MNIST datasets using different optimization algorithms. We compare the regularized Newton method (green) against steepest descent (blue) and the vanilla Newton's method (red). The figure presents the loss as a function of the number of iterations, while the bands represent the standard deviation of the loss curves across five randomly initialized realizations. The bottom row presents a zoomed-in view of the region of lower loss values reached after a single forward and backward sweep.
  • Figure 5: Loss curves of the continuous-embedding MPS for various optimization methods: simple gradient descent, Newton's method, and the two variants of regularized Newton optimization on the manifold, namely the smoothing regularization in Eq. \ref{eq:reg} and the bias regularization in Eq. \ref{eq:reg_bias}.
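
The quantities described in the captions of Figures 1 to 3 can be made concrete with a small NumPy sketch. It computes the overlap between a bit-string and a real-valued MPS, the corresponding Born probability, and a negative log-likelihood (NLL) with an optional regularization constant. All function names here (`random_mps`, `amplitude`, `norm_squared`, `born_probability`, `nll`) are hypothetical, and the regularized form $-\log(\psi^2 + \epsilon^2)$ is only an assumption inferred from the "imaginary shift" description in Figure 3; it is not taken from the paper's equations and may differ from the authors' definition.

```python
import numpy as np


def random_mps(n_sites, bond_dim, phys_dim=2, seed=0):
    """Random real MPS; each tensor has shape (left bond, physical, right bond)."""
    rng = np.random.default_rng(seed)
    dims = [1] + [bond_dim] * (n_sites - 1) + [1]
    return [rng.normal(size=(dims[i], phys_dim, dims[i + 1]))
            for i in range(n_sites)]


def amplitude(mps, bits):
    """Overlap between a bit-string and the MPS (left-to-right contraction)."""
    vec = np.ones(1)
    for tensor, b in zip(mps, bits):
        # Fix the physical index to the sample bit, then absorb the matrix.
        vec = vec @ tensor[:, b, :]
    return vec[0]


def norm_squared(mps):
    """<psi|psi>, computed by contracting the MPS with itself site by site."""
    env = np.ones((1, 1))
    for tensor in mps:
        env = np.einsum("ab,asc,bsd->cd", env, tensor, tensor)
    return env[0, 0]


def born_probability(mps, bits):
    """Born rule: squared overlap divided by the squared norm of the state."""
    return amplitude(mps, bits) ** 2 / norm_squared(mps)


def nll(mps, samples, eps=0.0):
    """Mean NLL over samples.

    ASSUMPTION: for eps > 0 the normalized overlap psi is shifted to
    psi + i*eps, so |psi + i*eps|^2 = psi^2 + eps^2 for real psi.
    This is an illustrative guess, not the paper's stated regularizer.
    """
    z = norm_squared(mps)
    psi = np.array([amplitude(mps, s) for s in samples]) / np.sqrt(z)
    return -np.mean(np.log(psi ** 2 + eps ** 2))
```

With `eps=0`, a sample whose overlap vanishes produces an infinite loss, which is the kind of barrier Figure 2 attributes to vanishing overlaps. Under the assumed form, any `eps > 0` caps that sample's contribution at $-\log \epsilon^2$.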