Ultra-low latency quantum-inspired machine learning predictors implemented on FPGA

Lorenzo Borella, Alberto Coppi, Jacopo Pazzini, Andrea Stanco, Marco Trenti, Andrea Triossi, Marco Zanetti

TL;DR

This paper investigates ultra-low-latency, quantum-inspired machine learning using Tree Tensor Networks (TTNs) deployed on FPGA hardware. It analyzes TTN architectures and training strategies, introduces two hardware contraction schemes (Full Parallel and Partial Parallel), and provides detailed resource, latency, and quantization assessments. Through Iris, Titanic, and LHCb datasets, including an LHCb predictor implemented in hardware, the work demonstrates sub-microsecond inference with exact software–hardware reproducibility and shows how correlation/entropy insights enable network compression for FPGA deployment. The results highlight the practical viability of TTN-based classifiers in real-time High Energy Physics triggering pipelines and offer concrete hardware designs and guidelines for quantized, resource-efficient TTN inference.

Abstract

Tensor Networks (TNs) are a computational paradigm used for representing quantum many-body systems. Recent works have shown that TNs can also be applied to Machine Learning (ML) tasks, yielding results comparable to standard supervised learning techniques. In this work, we study the use of Tree Tensor Networks (TTNs) in high-frequency real-time applications by exploiting the low-latency hardware of Field-Programmable Gate Array (FPGA) technology. We present different implementations of TTN classifiers, capable of performing inference on classical ML datasets as well as on complex physics data. A preparatory analysis of bond dimensions and weight quantization is carried out in the training phase, together with entanglement entropy and correlation measurements, which help guide the choice of the TTN architecture. The generated TTNs are then deployed on a hardware accelerator; using an FPGA integrated into a server, the inference of the TTN is completely offloaded. Finally, a classifier for High Energy Physics (HEP) applications is implemented and executed fully pipelined with sub-microsecond latency.

Paper Structure (15 sections, 7 equations, 17 figures, 1 table)

Figures (17)

  • Figure 1: Tree Tensor Network $b/\bar{b}$ jet classifier implemented on FPGA with sub-microsecond prediction latency.
  • Figure 2: Example architecture with $N=16$ input features and dimensions $\chi_l=[D,\chi_1,\chi_2,\chi_{L-1},O]$.
  • Figure 3: Two-site $\sigma_z$ correlations between features of the Titanic dataset as learned by the model.
  • Figure 4: The bipartite entanglement entropy of the features of the Titanic dataset, measured by cutting the physical bonds of the TTN. The red line is the maximum entropy allowed on those bonds.
  • Figure 5: Example of full network contraction for a TTN with $N=8$: the final vector on the right (scalar for $O=1$) is the result of inference.
  • ...and 12 more figures
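
The full contraction described in the Figure 5 caption can be illustrated with a minimal software sketch. It assumes a binary tree in which each node contracts two child vectors with a rank-3 weight tensor, using $N=8$ features and bond dimensions $[D,\chi_1,\chi_2,O]=[2,4,4,1]$. The `feature_map` function, the random weights, and all function names are hypothetical choices made for illustration; they are not the authors' trained model or their FPGA implementation, and the sketch uses floating point rather than quantized arithmetic.

```python
import math
import random


def feature_map(x):
    """Map a feature x in [0, 1] to a D=2 vector (an assumed spin-like map)."""
    return [math.cos(math.pi * x / 2), math.sin(math.pi * x / 2)]


def contract_node(tensor, left, right):
    """Contract one tree node: out[k] = sum_{i,j} tensor[i][j][k] * left[i] * right[j]."""
    n_out = len(tensor[0][0])
    out = [0.0] * n_out
    for i, l in enumerate(left):
        for j, r in enumerate(right):
            for k in range(n_out):
                out[k] += tensor[i][j][k] * l * r
    return out


def random_tensor(rng, d_left, d_right, d_out):
    """Placeholder weights; a real classifier would use trained values."""
    return [[[rng.uniform(-1, 1) for _ in range(d_out)]
             for _ in range(d_right)]
            for _ in range(d_left)]


def build_ttn(rng, n_features, dims):
    """dims = [D, chi_1, ..., O]; each layer halves the number of vectors."""
    layers = []
    width = n_features
    for d_in, d_out in zip(dims[:-1], dims[1:]):
        width //= 2
        layers.append([random_tensor(rng, d_in, d_in, d_out) for _ in range(width)])
    return layers


def ttn_infer(layers, features):
    """Contract the tree from the leaves to the root and return the output vector."""
    vectors = [feature_map(x) for x in features]
    for layer in layers:
        vectors = [contract_node(t, vectors[2 * n], vectors[2 * n + 1])
                   for n, t in enumerate(layer)]
    return vectors[0]


rng = random.Random(0)
layers = build_ttn(rng, n_features=8, dims=[2, 4, 4, 1])
output = ttn_infer(layers, [0.1, 0.5, 0.9, 0.3, 0.7, 0.2, 0.6, 0.4])
print(output)  # a single-element list, since O = 1
```

Within a layer, every call to `contract_node` is independent of the others, which is what makes the layer-by-layer contraction a natural candidate for parallel hardware.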