Tensor Networks Meet Neural Networks: A Survey and Future Perspectives

Maolin Wang, Yu Pan, Zenglin Xu, Guangxi Li, Xiangli Yang, Danilo Mandic, Andrzej Cichocki

TL;DR

This survey articulates a unifying framework called tensorial neural networks (TNNs) that merges tensor networks (TNs) with neural networks (NNs) to address data efficiency and model compression. It systematically surveys data-processing capabilities (multi-source fusion, multimodal pooling, data compression, multitask learning, quantum data representations) and model architectures (TCNNs, tensorized RNNs, tensorial Transformers, TGNNs, tensorial QNNs, LLM integration), along with training strategies and toolboxes. The work highlights how TNs reduce the curse of dimensionality, enable compact representations, and offer pathways for sustainable AI, while also discussing training stability, rank selection, hardware accelerators, and quantum applications. It further outlines challenges and future directions, including hardware-aware contractions, MERA-based architectures, and integration with large language models, pointing to impactful applications in quantum simulation and scalable AI systems.

Abstract

Tensor networks (TNs) and neural networks (NNs) are two fundamental data modeling approaches. TNs were introduced to solve the curse of dimensionality in large-scale tensors by converting an exponential number of dimensions to polynomial complexity. As a result, they have attracted significant attention in the fields of quantum physics and machine learning. Meanwhile, NNs have displayed exceptional performance in various applications, e.g., computer vision, natural language processing, and robotics research. Interestingly, although these two types of networks originate from different observations, they are inherently linked through the typical multilinearity structure underlying both TNs and NNs, thereby motivating a significant number of developments regarding combinations of TNs and NNs. In this paper, we refer to these combinations as tensorial neural networks (TNNs) and present an introduction to TNNs from both data processing and model architecture perspectives. From the data perspective, we explore the capabilities of TNNs in multi-source fusion, multimodal pooling, data compression, multi-task training, and quantum data processing. From the model perspective, we examine TNNs' integration with various architectures, including Convolutional Neural Networks, Recurrent Neural Networks, Graph Neural Networks, Transformers, Large Language Models, and Quantum Neural Networks. Furthermore, this survey also explores methods for improving TNNs, examines flexible toolboxes for implementing TNNs, and documents TNN development while highlighting potential future directions. To the best of our knowledge, this is the first comprehensive survey that bridges the connections among NNs and TNs. We provide a curated list of TNNs at https://github.com/tnbar/awesome-tensorial-neural-networks.

Paper Structure

This paper contains 49 sections, 35 equations, 13 figures, 3 tables.

Figures (13)

  • Figure 1: Basic symbols for TN diagrams. For more details about TNs, refer to biamonte2017tensor and cichocki2016tensor.
  • Figure 2: TN diagrams of some popular TN decompositions. (a) The CP format decomposes a tensor ${\bm{\mathcal{X}}}$ into a sum of several rank-1 tensors $\bm a^{(1)}_{:,r}\circ \bm a^{(2)}_{:,r}\circ\cdots \circ \bm a^{(N)}_{:,r}$. (b) Tucker decomposition decomposes a tensor $\bm{\mathcal{X}}$ into a core tensor $\bm{\mathcal{G}}$ multiplied by a matrix $\bm{A}^{(n)}$ along the $n$th mode. (c) Block term decomposition decomposes a tensor ${\bm{\mathcal{X}}}$ into a sum of several Tucker decompositions (on the right) with low Tucker ranks. (d) TT decomposition decomposes a tensor ${\bm{\mathcal{X}}}$ into a chain of contractions among a set of 3rd-order core tensors $\bm{\mathcal{G}}^{(2)},\ldots,\bm{\mathcal{G}}^{(N-1)}$ and two matrices $\bm{\mathcal{G}}^{(1)}$ and $\bm{\mathcal{G}}^{(N)}$ at the ends. (e) TR decomposition decomposes a tensor ${\bm{\mathcal{X}}}$ into a set of 3rd-order core tensors and contracts them into a ring structure. (f) HT decomposition represents a tensor ${\bm{\mathcal{X}}}$ as a tree-like diagram. (g) Tensor Grid Decomposition (a.k.a. PEPS) represents a high-dimensional tensor as a two-dimensional grid of interconnected lower-rank tensors, where each node connects to its neighbors to efficiently capture spatial correlations in systems with local interactions. For more basic knowledge about TNs, we refer to biamonte2017tensor and cichocki2016tensor.
  • Figure 3: Correspondence between TN diagrams and convolutional procedures. In each subfigure, the left part is a TN diagram, and the right part is the associated commonly used feature representation.
  • Figure 4: Illustration of the tensor fusion process in Eq. \ref{eq:tensorfusion}. Unlike in a TN diagram, each circle here corresponds to a value.
  • Figure 5: Illustration of polynomial tensor pooling (PTP) hou2019deep. PTP first concatenates all feature vectors $\mathbf{z}_1,\mathbf{z}_2,\mathbf{z}_3$ into a longer feature vector $\mathbf{z}_{123}^{\top}=\left[1, \mathbf{z}_{1}^{\top}, \mathbf{z}_{2}^{\top}, \mathbf{z}_{3}^{\top}\right]$. It then derives a polynomial feature tensor by repeatedly taking the outer product of the feature vector $\mathbf{z}_{123}$ with itself, and finally applies a tensorial layer (e.g., a TR layer) to merge the polynomial feature tensor into a vector $\mathbf{h}$.
  • ...and 8 more figures
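
The three PTP steps described in the Figure 5 caption (concatenate, take outer products, merge with a tensorial layer) can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the implementation from hou2019deep: the polynomial order is fixed at 2, the tensorial layer is a weight tensor stored as three tensor-ring (TR) cores, and the feature sizes, TR rank, output size, and function names are all hypothetical choices made for the example.

```python
import numpy as np


def ptp_dense(features, cores):
    """Order-2 PTP sketch that builds the polynomial feature tensor explicitly."""
    # Step 1: concatenate all feature vectors behind a constant 1.
    z = np.concatenate([[1.0], *features])
    # Step 2: order-2 polynomial feature tensor, the outer product of z with itself.
    Z = np.einsum("i,j->ij", z, z)
    # Step 3: rebuild the weight tensor from its ring of cores, then contract with Z.
    g1, g2, g3 = cores
    W = np.einsum("aib,bjc,cka->ijk", g1, g2, g3)
    return np.einsum("ij,ijk->k", Z, W)


def ptp_factored(features, cores):
    """Same result, but z is contracted with each core so Z and W are never formed."""
    z = np.concatenate([[1.0], *features])
    g1, g2, g3 = cores
    a = np.einsum("i,aib->ab", z, g1)  # first input mode absorbed into core 1
    b = np.einsum("j,bjc->bc", z, g2)  # second input mode absorbed into core 2
    return np.einsum("ab,bc,cka->k", a, b, g3)  # close the ring, keep the output mode


# Hypothetical sizes: three feature vectors, TR rank 2, output dimension 5.
rng = np.random.default_rng(0)
features = [rng.standard_normal(n) for n in (3, 4, 2)]
D = 1 + sum(f.size for f in features)  # length of the concatenated vector z_123
R, H = 2, 5
cores = (
    rng.standard_normal((R, D, R)),
    rng.standard_normal((R, D, R)),
    rng.standard_normal((R, H, R)),
)

h_dense = ptp_dense(features, cores)
h_factored = ptp_factored(features, cores)
```

Both functions compute the same vector $\mathbf{h}$. The factored version shows why a TN-format layer is attractive here: the polynomial feature tensor grows as $D^P$ with the polynomial order $P$, whereas contracting $\mathbf{z}_{123}$ against one core at a time keeps the intermediate results at the size of the TR rank.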