A Tensor Residual Circuit Neural Network Factorized with Matrix Product Operation

Andi Chen

TL;DR

This paper tackles the challenge of reducing neural network complexity without sacrificing generalization and robustness. It introduces a Tensor Circuit Neural Network (TCNN) that blends Matrix Product Operator (MPO) factorization with a residually connected, complex-valued circuit architecture, augmented by an information fusion layer that merges real and imaginary features. The approach achieves competitive parameter efficiency while delivering improved generalization and robustness on standard image datasets, including resilience to noise and parameter attacks, and is supported by ablation studies validating the architecture. The work suggests TCNN as a practical, hardware-friendly option for robust, efficient image recognition in noisy industrial settings.

Abstract

It is challenging to reduce the complexity of neural networks while maintaining their generalization ability and robustness, especially for practical applications. Conventional solutions to this problem include quantum-inspired neural networks with Kronecker products and hybrid tensor neural networks that combine MPO factorization with fully-connected layers. Nonetheless, the generalization power and robustness of fully-connected layers fall short of those of circuit models in quantum computing. In this paper, we propose a novel tensor circuit neural network (TCNN) that combines the characteristics of tensor neural networks and residual circuit models to achieve generalization ability and robustness with low complexity. The proposed activation operation and the parallelism of the circuit in the complex number field improve its non-linearity and efficiency for feature learning. Moreover, since feature information in TCNN is stored in both the real and imaginary parts of the parameters, an information fusion layer is proposed to merge the features stored in those parameters and enhance the generalization capability. Experimental results confirm that TCNN generalizes better and is more robust, with average accuracies on various datasets 2%-3% higher than those of the state-of-the-art models it is compared with. More significantly, while the other models fail to learn features under noise parameter attacks, TCNN retains its learning capability owing to its ability to prevent gradient explosion. Furthermore, it is comparable to the compared models in the number of trainable parameters and CPU running time. An ablation study also indicates the advantage of the activation operation, the parallel architecture, and the information fusion layer.
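To make the complexity argument concrete, the sketch below counts the parameters of a dense weight matrix against those of an MPO factorization of the same matrix. It relies only on the standard MPO form, in which the $k$-th local tensor has shape $(d_{k-1}, i_k, j_k, d_k)$ with $d_0 = d_n = 1$. The function names, the 784-to-256 layer size, its factorization into four sites, and the bond dimension of 4 are illustrative assumptions, not values taken from the paper.

```python
from math import prod


def dense_params(in_dims, out_dims):
    """Parameter count of a dense matrix of shape (prod(in_dims), prod(out_dims))."""
    return prod(in_dims) * prod(out_dims)


def mpo_params(in_dims, out_dims, bond_dims):
    """Parameter count of an MPO with one local tensor per site.

    Site k holds a tensor of shape (d_{k-1}, i_k, j_k, d_k); the outer
    bond dimensions d_0 and d_n are fixed to 1.
    """
    n = len(in_dims)
    if len(out_dims) != n or len(bond_dims) != n - 1:
        raise ValueError("need n input dims, n output dims and n-1 bond dims")
    bonds = [1, *bond_dims, 1]
    return sum(
        bonds[k] * in_dims[k] * out_dims[k] * bonds[k + 1] for k in range(n)
    )


# Hypothetical layer: 784 = 4*7*7*4 inputs, 256 = 4*4*4*4 outputs, bond dimension 4.
in_dims, out_dims, bond_dims = [4, 7, 7, 4], [4, 4, 4, 4], [4, 4, 4]
print(dense_params(in_dims, out_dims))           # 200704
print(mpo_params(in_dims, out_dims, bond_dims))  # 1024
```

Larger bond dimensions raise the MPO count and the expressiveness of the factorized layer, so the bond dimension acts as the knob that trades accuracy against parameter budget.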

Paper Structure

This paper contains 6 sections, 20 equations, 6 figures, 7 tables.

Figures (6)

  • Figure 1: Residual connections. In the residual circuit model proposed in this paper, the residual mapping is a continuous unitary transformation with activations.
  • Figure 2: Architecture of TCNN and the process of feeding data into TCNN. TCNN combines a tensor neural network architecture with two tensor parts and a parallel circuit framework.
  • Figure 3: One layer of the residual circuit model with activations. $O_k$ and $O_{k+1}$ denote the input and output of the $k^{th}$ circuit layer, respectively. $\sigma$ represents any activation function. $\phi^k_r$ and $\theta^k_r$ represent the trainable parameters in the real and imaginary parts, which are updated by backpropagation.
  • Figure 4: Process of merging the information in the real and imaginary parts of the parallel circuit in the information fusion layer.
  • Figure 5: Accuracy curves of the four models on clean MNIST data. All the curves fluctuate first, and then reach convergence. The accuracy of TCNN is the highest.
  • ...and 1 more figures
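
The captions of Figures 1, 3 and 4 describe a residual layer whose mapping is a unitary transformation followed by an activation, and a fusion step that merges real and imaginary parts. A minimal sketch of that idea on a two-component complex state follows. Everything specific here is an assumption made for illustration: the 2x2 unitary and its use of `phi` as a rotation angle and `theta` as a phase, the split `tanh` activation applied separately to real and imaginary parts, and the weighted-sum fusion with hypothetical weights `w_re` and `w_im`. The paper's own parameterization and fusion rule may differ.

```python
import cmath
import math


def unitary(phi, theta):
    """Assumed 2x2 unitary: rotation by phi with relative phase theta."""
    c, s = math.cos(phi), math.sin(phi)
    return [
        [c, -cmath.exp(1j * theta) * s],
        [cmath.exp(-1j * theta) * s, c],
    ]


def apply(u, v):
    """Multiply a 2x2 matrix by a length-2 complex vector."""
    return [u[0][0] * v[0] + u[0][1] * v[1], u[1][0] * v[0] + u[1][1] * v[1]]


def split_tanh(z):
    """Assumed activation: tanh on the real and imaginary parts separately."""
    return complex(math.tanh(z.real), math.tanh(z.imag))


def residual_layer(o_k, phi, theta, sigma=split_tanh):
    """O_{k+1} = O_k + sigma(U(phi, theta) O_k), with a skip connection."""
    mapped = apply(unitary(phi, theta), o_k)
    return [x + sigma(m) for x, m in zip(o_k, mapped)]


def fuse(o, w_re=0.5, w_im=0.5):
    """Assumed fusion: weighted sum of real and imaginary parts per component."""
    return [w_re * z.real + w_im * z.imag for z in o]


state = [1 + 0j, 0j]
for _ in range(3):  # three stacked layers sharing the same illustrative angles
    state = residual_layer(state, phi=0.3, theta=0.7)
features = fuse(state)  # real-valued features for a downstream classifier
```

Because the unitary preserves the norm of its input and `tanh` is bounded, each layer adds at most a bounded correction to the skip path, which is one plausible reading of why such a design would keep signal magnitudes under control.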