Efficient Generative Modeling with Unitary Matrix Product States Using Riemannian Optimization

Haotong Duan, Zhongming Chen, Ngai Wong

Abstract

Tensor networks, originally developed to characterize complex quantum many-body systems, have recently emerged as a powerful framework for capturing high-dimensional probability distributions with strong physical interpretability. This paper systematically studies matrix product states (MPS) for generative modeling and shows that unitary MPS, a tensor-network architecture that is both simple and expressive, offers clear benefits for unsupervised learning: it reduces ambiguity in parameter updates and improves efficiency. To overcome the inefficiency of standard gradient-based MPS training, we develop a Riemannian optimization approach that casts probabilistic modeling as an optimization problem with manifold constraints, and we further derive an efficient space-decoupling algorithm. Experiments on the Bars-and-Stripes and EMNIST datasets demonstrate fast adaptation to data structure, stable updates, and strong performance while maintaining the efficiency and expressive power of MPS.

Paper Structure

This paper contains 19 sections, 4 theorems, 56 equations, 9 figures, 4 tables, 1 algorithm.

Key Result

Proposition 1

For any tensor $\mathcal{A} \in \mathbb{R}^{n_1 \times n_2 \times \cdots \times n_d}$, there always exist TT-cores $A^{(k)} \in \mathbb{R}^{r_{k-1} \times n_k \times r_k}$ ($k=1,2,\dots,d$) satisfying (eq:TT_general) with TT-ranks
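
The existence claim in Proposition 1 is usually shown constructively with sequential SVDs of reshaped unfoldings (the TT-SVD procedure). The following is a minimal NumPy sketch of that standard construction, not the paper's own code. The function names `tt_svd` and `tt_to_full` and the tolerance `tol` are hypothetical choices made for illustration.

```python
import numpy as np

def tt_svd(tensor, tol=1e-12):
    """Split `tensor` into TT-cores of shape (r_{k-1}, n_k, r_k) via sequential SVDs."""
    dims = tensor.shape
    d = len(dims)
    cores = []
    r_prev = 1
    # Start from the first unfolding: rows index (r_0, n_1), columns the rest.
    mat = tensor.reshape(r_prev * dims[0], -1)
    for k in range(d - 1):
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        # Numerical rank: singular values above a relative tolerance (assumed cutoff).
        r = max(1, int(np.sum(s > tol * s[0])))
        cores.append(u[:, :r].reshape(r_prev, dims[k], r))
        # Push the remaining factor to the right and expose the next mode.
        mat = (s[:r, None] * vt[:r]).reshape(r * dims[k + 1], -1)
        r_prev = r
    cores.append(mat.reshape(r_prev, dims[-1], 1))
    return cores

def tt_to_full(cores):
    """Contract TT-cores back into the full tensor."""
    full = cores[0]
    for core in cores[1:]:
        full = np.tensordot(full, core, axes=([-1], [0]))
    return full.reshape([c.shape[1] for c in cores])

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4, 5, 2))
cores = tt_svd(A)
print([c.shape for c in cores])
print(np.allclose(tt_to_full(cores), A))
```

In this construction every core except the last comes from the left singular vectors of an SVD, so its $(r_{k-1} n_k) \times r_k$ reshaping has orthonormal columns. Each rank $r_k$ is also bounded by the number of rows and columns of the matrix being factored at step $k$.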

Figures (9)

  • Figure 1: Optimization flowchart of the space-decoupling method, where dashed lines indicate the optimization process.
  • Figure 2: Schematic of sequential sample generation from a UMPS. All tensors except $A^{(d)}$ are gauged to be left-canonical. Each tensor $A^{(i)}$ generates one bit $v_i$ conditionally on the bits to its right.
  • Figure 3: (a) Illustration of column-major flattening of a $16\times16$ image into a 256-dimensional vector $v = (p_1, p_2, \dots, p_{16},p_{17}, \dots , p_{256})^\top$. (b) A subset of the Bars-and-Stripes dataset, with images of size $16 \times 16$.
  • Figure 4: (a) The blue curve shows the average NLL as a function of the number of loops, while the orange bars show the cumulative computation time required to reach the corresponding loop. (b) Images generated by the model at loops 1 to 5, as indicated in (a). In these experiments, we set $r_{\rm max}=500$, $|\mathcal{T}|=400$, and the learning rate to $0.007$.
  • Figure 5: (a) Average NLL as a function of the dataset size under different $r_{\rm max}$. For each curve, the circular markers from left to right correspond to the NLL values obtained at $|\mathcal{T}| = 100, 200, 300, 400,$ and $500$, respectively. (b) The average computation time per experiment shown in (a).
  • ...and 4 more figures
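
The sampling scheme described in the caption of Figure 2 can be sketched as follows, assuming a Born-rule model in which $P(v) \propto \psi(v)^2$ with $\psi(v) = A^{(1)}[v_1] \cdots A^{(d)}[v_d]$ and binary variables. That probability model, the helper names `random_left_canonical_mps` and `sample_mps`, and the random initialization are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def random_left_canonical_mps(d, r, rng):
    """Random cores of shape (r_{k-1}, 2, r_k); all but the last are left-canonical."""
    cores, r_prev = [], 1
    for k in range(d):
        # Cap the rank so the (2*r_prev, r_next) unfolding is tall or square.
        r_next = 1 if k == d - 1 else min(r, 2 * r_prev)
        core = rng.standard_normal((r_prev, 2, r_next))
        if k < d - 1:
            # Reduced QR gives an unfolding with orthonormal columns.
            q, _ = np.linalg.qr(core.reshape(r_prev * 2, r_next))
            core = q.reshape(r_prev, 2, r_next)
        cores.append(core)
        r_prev = r_next
    return cores

def sample_mps(cores, rng):
    """Draw one bit string, generating v_d first and moving left to v_1."""
    d = len(cores)
    bits = np.empty(d, dtype=int)
    x = np.ones(1)  # right boundary vector (r_d = 1)
    for k in range(d - 1, -1, -1):
        # cand[:, v] = A^{(k)}[:, v, :] @ x for each value v of the current bit.
        cand = np.einsum("avb,b->av", cores[k], x)
        # Left-canonical cores to the left contract to the identity, so the
        # conditional weights are just squared norms of the candidates.
        weights = np.sum(cand ** 2, axis=0)
        v = rng.choice(len(weights), p=weights / weights.sum())
        bits[k] = v
        x = cand[:, v]
    return bits

rng = np.random.default_rng(0)
cores = random_left_canonical_mps(d=8, r=4, rng=rng)
print(sample_mps(cores, rng))
```

The left-canonical gauge is what makes each step local: without it, the conditional weights would also need the contracted environment of all tensors to the left of the current site.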

Theorems & Definitions (12)

  • Definition 1: Frobenius norm
  • Definition 2: Outer product
  • Definition 3: Tensor train decomposition
  • Definition 4: $k$-unfolding [liu2022tensor]
  • Proposition 1: Theorem 2.1 of [oseledets2011tensor]
  • Definition 5: Left- and right-canonical
  • Proposition 2: Mixed-canonical form of MPS [oseledets2011tensor]
  • Definition 6: Tangent space
  • Definition 7: Riemannian gradient
  • Proposition 3: Computation of Riemannian gradient [absil2008optimization]
  • ...and 2 more
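
The list above names the tangent space and Riemannian gradient without stating them here. For an embedded submanifold with the induced metric, the textbook result is that the Riemannian gradient is the orthogonal projection of the Euclidean gradient onto the tangent space. The sketch below illustrates this on the Stiefel manifold $\{X : X^\top X = I\}$. The choice of the Stiefel manifold, the QR retraction, and all function names are assumptions for illustration, and the manifold used in the paper may differ.

```python
import numpy as np

def sym(m):
    """Symmetric part of a square matrix."""
    return 0.5 * (m + m.T)

def stiefel_riemannian_grad(x, egrad):
    """Project the Euclidean gradient onto the tangent space at x (x.T @ x = I)."""
    return egrad - x @ sym(x.T @ egrad)

def qr_retraction(x, step):
    """Map x + step back onto the manifold with a sign-fixed QR factorization."""
    q, r = np.linalg.qr(x + step)
    signs = np.sign(np.diag(r))
    signs[signs == 0] = 1.0
    return q * signs  # flip columns so that diag(r) is non-negative

rng = np.random.default_rng(0)
x, _ = np.linalg.qr(rng.standard_normal((6, 3)))  # a point on the manifold
egrad = rng.standard_normal((6, 3))               # stand-in Euclidean gradient
rgrad = stiefel_riemannian_grad(x, egrad)
x_new = qr_retraction(x, -0.1 * rgrad)            # one descent step
print(np.allclose(x.T @ rgrad + rgrad.T @ x, 0.0))
print(np.allclose(x_new.T @ x_new, np.eye(3)))
```

The first check confirms that the projected gradient satisfies the tangent-space condition $X^\top \xi + \xi^\top X = 0$, and the second confirms that the retracted point still has orthonormal columns.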