Ultra-low latency quantum-inspired machine learning predictors implemented on FPGA
Lorenzo Borella, Alberto Coppi, Jacopo Pazzini, Andrea Stanco, Marco Trenti, Andrea Triossi, Marco Zanetti
TL;DR
This paper investigates ultra-low-latency, quantum-inspired machine learning using Tree Tensor Networks (TTNs) deployed on FPGA hardware. It analyzes TTN architectures and training strategies, introduces two hardware contraction schemes (Full Parallel and Partial Parallel), and provides detailed resource, latency, and quantization assessments. Using the Iris, Titanic, and LHCb datasets, with the LHCb predictor implemented in hardware, the work demonstrates sub-microsecond inference with exact software–hardware reproducibility and shows how correlation and entanglement-entropy measurements enable network compression for FPGA deployment. The results highlight the practical viability of TTN-based classifiers in real-time High Energy Physics triggering pipelines and offer concrete hardware designs and guidelines for quantized, resource-efficient TTN inference.
Abstract
Tensor Networks (TNs) are a computational paradigm used for representing quantum many-body systems. Recent works have shown that TNs can also be applied to Machine Learning (ML) tasks, yielding results comparable to standard supervised learning techniques. In this work, we study the use of Tree Tensor Networks (TTNs) in high-frequency real-time applications by exploiting the low-latency hardware of Field-Programmable Gate Arrays (FPGAs). We present different implementations of TTN classifiers, capable of performing inference on classical ML datasets as well as on complex physics data. A preparatory analysis of bond dimensions and weight quantization is carried out in the training phase, together with entanglement entropy and correlation measurements, which help guide the choice of the TTN architecture. The generated TTNs are then deployed on a hardware accelerator: using an FPGA integrated into a server, the inference of the TTN is completely offloaded. Finally, a classifier for High Energy Physics (HEP) applications is implemented and executed fully pipelined with sub-microsecond latency.
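To make the idea of TTN inference concrete, the following is a minimal software sketch in plain Python, not the authors' implementation. It assumes a binary tree over four input features, a trigonometric local feature map (a common choice in tensor-network ML, assumed here), random untrained weights, and a bond dimension of 3. All function names are hypothetical. The `quantize` helper illustrates fixed-point rounding of weights in the spirit of the quantization study mentioned above.

```python
import math
import random

def feature_map(x):
    """Map a feature x in [0, 1] to a 2-component vector (assumed trigonometric map)."""
    return [math.cos(math.pi * x / 2), math.sin(math.pi * x / 2)]

def contract_node(tensor, left, right):
    """Contract T[i][j][k] with two child vectors: out[k] = sum_ij left[i] * right[j] * T[i][j][k]."""
    out = [0.0] * len(tensor[0][0])
    for i, li in enumerate(left):
        for j, rj in enumerate(right):
            for k in range(len(out)):
                out[k] += li * rj * tensor[i][j][k]
    return out

def random_tensor(d_left, d_right, d_out, rng):
    """Build a random 3-index tensor; a stand-in for trained weights."""
    return [[[rng.uniform(-1, 1) for _ in range(d_out)]
             for _ in range(d_right)] for _ in range(d_left)]

def quantize(value, frac_bits):
    """Round a value onto a fixed-point grid with step 2**-frac_bits (uses Python's round)."""
    scale = 1 << frac_bits
    return round(value * scale) / scale

def quantize_tensor(tensor, frac_bits):
    """Apply fixed-point rounding to every entry of a 3-index tensor."""
    return [[[quantize(v, frac_bits) for v in row] for row in mat] for mat in tensor]

def ttn_predict(layers, features):
    """Contract the tree bottom-up; each layer halves the number of vectors."""
    vectors = [feature_map(x) for x in features]
    for layer in layers:
        vectors = [contract_node(t, vectors[2 * n], vectors[2 * n + 1])
                   for n, t in enumerate(layer)]
    return vectors[0]  # one score per class

# Example: 4 features, bond dimension 3 (assumed), 2 output classes.
rng = random.Random(0)
layers = [
    [random_tensor(2, 2, 3, rng), random_tensor(2, 2, 3, rng)],  # leaf layer
    [random_tensor(3, 3, 2, rng)],                               # root node
]
# Quantize all weights to 8 fractional bits (an illustrative choice).
layers = [[quantize_tensor(t, 8) for t in layer] for layer in layers]
scores = ttn_predict(layers, [0.1, 0.5, 0.9, 0.3])
predicted_class = max(range(len(scores)), key=lambda c: scores[c])
```

In this sketch the nodes within a layer are independent of one another, which is the property a parallel hardware contraction can exploit. The bond dimension sets the size of each tensor and therefore the number of multiply-accumulate operations per node.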
