No-Free-Lunch Theories for Tensor-Network Machine Learning Models

Jing-Chuan Wu, Qi Ye, Dong-Ling Deng, Li-Wei Yu

TL;DR

This work establishes rigorous no-free-lunch theorems for tensor-network (TN) based machine learning models by deriving analytical lower bounds on the average generalization risk for learning arbitrary target unitaries from TN inputs. It treats both 1D matrix product states (MPS) and 2D projected entangled-pair states (PEPS), using unitary 2-design properties to map moment calculations to classical partition-function problems and a novel polyomino-based combinatorial method for the 2D case. The main results show that the average risk bounds depend not only on the training-set size but also on intrinsic TN factors such as the bond and physical dimensions, which gives quantitative limits on TN-based learning performance and can guide model design. Numerical simulations corroborate the analytical bounds, and the framework opens avenues for future analytical and experimental exploration of quantum-inspired TN learning under realistic hardware conditions.

Abstract

Tensor network machine learning models have shown remarkable versatility in tackling complex data-driven tasks, ranging from quantum many-body problems to classical pattern recognition. Despite their promising performance, a comprehensive understanding of the underlying assumptions and limitations of these models is still lacking. In this work, we focus on the rigorous formulation of their no-free-lunch theorem -- essential yet notoriously challenging to formalize for specific tensor network machine learning models. In particular, we rigorously analyze the generalization risks of learning target output functions from input data encoded in tensor network states. We first prove a no-free-lunch theorem for machine learning models based on matrix product states, i.e., the one-dimensional tensor network states. Furthermore, we circumvent the challenging issue of calculating the partition function for the two-dimensional Ising model, and prove the no-free-lunch theorem for the case of two-dimensional projected entangled-pair states, by introducing a combinatorial method associated with the "puzzle of polyominoes". Our findings reveal the intrinsic limitations of tensor network-based learning models in a rigorous fashion, and open up an avenue for future analytical exploration of both the strengths and limitations of quantum-inspired machine learning frameworks.

Paper Structure

This paper contains 10 sections, 7 theorems, 94 equations, 7 figures.

Key Result

Theorem 1

Define the risk function $R_M(P_{\mathcal{S}})$ in Eq. (Risk_error) for learning a target $n$-qubit unitary $M$ based on the input of MPSs, where $P_{\mathcal{S}}$ represents the hypothesis unitary learned from the training set $\mathcal{S}$. Given a linearly independent training set of size $t_k$ [...] where $A=\frac{D+1}{Dd+1}$, $B=\frac{D-1}{Dd-1}$, $d$ is the physical dimension, and $D$ is the bond dimension of the MPS.
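As a small worked example of the two constants in Theorem 1, the sketch below evaluates $A=\frac{D+1}{Dd+1}$ and $B=\frac{D-1}{Dd-1}$ exactly. The function name `mps_constants` is hypothetical, and the sample values $D=2$, $d=2$ are taken from the Figure 2 caption. The sketch computes only these two factors, not the risk bound itself.

```python
from fractions import Fraction


def mps_constants(D: int, d: int) -> tuple[Fraction, Fraction]:
    """Return (A, B) as exact rationals.

    D is the MPS bond dimension and d the physical dimension.
    The function name is illustrative and not taken from the paper.
    """
    A = Fraction(D + 1, D * d + 1)
    B = Fraction(D - 1, D * d - 1)
    return A, B


# Parameters used in the Figure 2 caption: d = 2, D = 2.
A, B = mps_constants(D=2, d=2)
print(A, B)  # 3/5 1/3
```

For $D=1$ (product-state inputs) the same formulas give $B=0$ and $A=\frac{2}{d+1}$, so only the $A$ factor survives in that limit.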

Figures (7)

  • Figure 1: Schematic illustration of supervised learning of unitaries based on tensor network states. Upper panel: The encoding strategy. Classical (quantum) data samples with labels are encoded into the local tensors $U^{(i)}$ of the unitary-embedded tensor network states. Lower panel: The learning strategy. Given the training set $\mathcal{S}$ of samples with labeled outputs (left), the goal is to minimize the average distance between the learned output states and the ground-truth states (obtained by acting with the unitary $M^\dag$ on the encoding states) over all training samples. The unitary circuit $P_{\mathcal{S}}$ stores the variational parameters.
  • Figure 2: Average risk of the trained MPS-based machine learning models with respect to the training set size $t_k = 2^n-2^{n-k}$, where the number of qubits $n$ varies from four to five. The physical dimension is $d=2$ and the bond dimension is $D=2$. The solid lines represent the analytical lower bounds on the average risk predicted by Theorem 1, and the dotted lines denote the average risk of the trained MPS-based machine learning models for predicting target unitaries.
  • Figure S1: Illustrations of the general polyomino and the directed polyomino. (a) Illustration of the various types of general polyominoes with area ranging from 1 to 4. Here we regard the different rotations of a polyomino as being of the same type. (b) Illustration of a directed polyomino rooted at the gray site with area $m = 6$, perimeter $p = 14$, and upper perimeter $n = 3$ (depicted as the black lines).
  • Figure S2: Directed graphs consisting of $\uparrow$ sites. Solid and hollow points represent the sites of $\uparrow$ and $\downarrow$, respectively, while the site in the grey square represents the root of the graph. (a) An example of an ESS, which has only one root. (b) An example of a cycle ESS, which has more than one root: all sites on the cycle are its roots.
  • Figure S3: Illustrations of the zone $A$ of trained sites and configurations outside $A$. (a) The $t$ training samples are arranged at $k$ sites labeled by $i=1,2,\ldots,k$ in counterclockwise order, forming a region whose upper and left boundaries are both no more than $l$. The sites within the dashed box are the candidate sites for the roots of ESSs outside $A$. (b) Illustration of the contribution of the $i$-th ESS rooted at the left boundary of $A$ and at the upper boundary of $A$; the two configurations are equivalent under diagonal reflection. (c) Examples of two categories of configurations, distinguished by whether they contain cycle ESSs. These differ from the configurations in (b), which contain only an ESS rooted at a site adjacent to the boundary of $A$.
  • ...and 2 more figures
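
The training-set sizes quoted in the Figure 2 caption, $t_k = 2^n - 2^{n-k}$, can be tabulated directly. The sketch below assumes that $k$ runs over the integers $0,\ldots,n$, which the text above does not state, and the function name `training_set_sizes` is hypothetical.

```python
def training_set_sizes(n: int) -> list[int]:
    """List t_k = 2**n - 2**(n - k) for k = 0, ..., n.

    The range of k is an assumption made for illustration.
    """
    return [2**n - 2 ** (n - k) for k in range(n + 1)]


# Qubit numbers used in the Figure 2 caption.
for n in (4, 5):
    print(n, training_set_sizes(n))
# 4 [0, 8, 12, 14, 15]
# 5 [0, 16, 24, 28, 30, 31]
```

Under this assumption, each step in $k$ halves the gap $2^{n-k}$ between $t_k$ and the full Hilbert-space dimension $2^n$.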

Theorems & Definitions (14)

  • Theorem 1
  • proof
  • Theorem 2
  • proof
  • proof
  • Theorem 1: NFL theorem for MPS
  • Theorem 2: NFL theorem for PEPS
  • Definition 1
  • Definition 2
  • Lemma 1
  • ...and 4 more