Tetra-AML: Automatic Machine Learning via Tensor Networks

A. Naumov, Ar. Melnikov, V. Abronin, F. Oxanichenko, K. Izmailov, M. Pflitsch, A. Melnikov, M. Perelshtein

TL;DR

This work proposes the TetraOpt algorithm, evaluates it against other optimization algorithms for hyperparameter optimization, and introduces a novel iterative method that combines CP, SVD, and Tucker tensor decompositions for model compression.

Abstract

Neural networks have revolutionized many aspects of society, but in the era of huge models with billions of parameters, optimizing and deploying them for commercial applications can require significant computational and financial resources. To address these challenges, we introduce the Tetra-AML toolbox, which automates neural architecture search and hyperparameter optimization via a custom-developed black-box Tensor train Optimization algorithm, TetraOpt. The toolbox also provides model compression through quantization and pruning, augmented by compression using tensor networks. Here, we analyze a unified benchmark for optimizing neural networks in computer vision tasks and show the superior performance of our approach compared to Bayesian optimization on the CIFAR-10 dataset. We also demonstrate the compression of ResNet-18 neural networks, where we use 14.5 times less memory while losing just 3.2% of accuracy. The presented framework is generic, is not limited to computer vision problems, supports hardware acceleration (such as with GPUs and TPUs), and can be further extended to quantum hardware and to hybrid quantum machine learning models.

Paper Structure (5 sections, 3 figures)

This paper contains 5 sections, 3 figures.

Figures (3)

  • Figure 1: General scheme of Tetra-AML. Both Neural Architecture Search (NAS) and Hyperparameter Optimization (HPO) are performed via TetraOpt (Terra Quantum's black-box optimizer based on Tensor Trains). Then the model is compressed via tensor network methods, quantization and pruning.
  • Figure 2: Validation accuracy as a function of the number of neural network model (architecture) runs for the TetraOpt, Bayesian, and Random Search algorithms. TetraOpt achieves 93.7% accuracy, while Bayesian optimization and Random Search find architectures with only 93.5% and 92.8% accuracy, respectively, with the same number of model runs. The experiments were carried out on the standard NAS benchmark NATS, which uses the CIFAR-10 dataset.
  • Figure 3: (A) General scheme of neural network compression. (top) General neural network scheme for image recognition. (bottom) Compressed convolution layer: the 4D convolution kernel is represented as a sum of tensor products of small tensors (Canonical Decomposition). The initial layer has $C_{in} \times C_{out} \times D^2$ parameters, while only $C_{in} \times R + D^2 \times R^2 + R \times C_{out}$ remain after compression. For small $R$, this significantly reduces the occupied memory. (B) Compression of state-of-the-art ResNet-18 on the CIFAR-10 dataset via tensor networks. The diagram shows the achieved accuracy depending on the compression of the model. Bottom bar: Uncompressed Base Model. Middle bar: TN Compressed Model (Compression coefficient: 4.5). Top bar: TN Compressed Model (Compression coefficient: 14.5).
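
The parameter counts quoted in the Figure 3 caption can be checked with a few lines of arithmetic. The sketch below is only an illustration of those two formulas, not the toolbox's implementation: the function names are hypothetical, and the layer sizes (64 input channels, 64 output channels, a 3 x 3 kernel) are assumed values chosen for the example rather than taken from the paper.

```python
def conv_params(c_in: int, c_out: int, d: int) -> int:
    """Weights in a dense D x D convolution kernel: C_in * C_out * D^2."""
    return c_in * c_out * d * d


def compressed_params(c_in: int, c_out: int, d: int, r: int) -> int:
    """Weights after factorization, following the caption's formula:
    C_in * R + D^2 * R^2 + R * C_out."""
    return c_in * r + d * d * r * r + r * c_out


def compression_ratio(c_in: int, c_out: int, d: int, r: int) -> float:
    """Dense parameter count divided by the factorized parameter count."""
    return conv_params(c_in, c_out, d) / compressed_params(c_in, c_out, d, r)


if __name__ == "__main__":
    # Assumed layer sizes for illustration only (not from the paper).
    c_in, c_out, d = 64, 64, 3
    for r in (4, 8, 16, 64):
        dense = conv_params(c_in, c_out, d)
        small = compressed_params(c_in, c_out, d, r)
        print(f"R={r:3d}: {dense} -> {small} parameters "
              f"(ratio {compression_ratio(c_in, c_out, d, r):.2f})")
```

Under these assumed sizes, $R = 8$ reduces one layer from 36864 to 1600 weights, while $R = 64$ yields more weights than the dense kernel, which is why the caption restricts the claim to small $R$. These are single-layer figures and are not comparable to the whole-model compression coefficients of 4.5 and 14.5 reported in panel (B).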