Knowledge is Overrated: A zero-knowledge machine learning and cryptographic hashing-based framework for verifiable, low latency inference at the LHC
Pratik Jawahar, Caterina Doglioni, Maurizio Pierini
TL;DR
The paper addresses the challenge of delivering verifiable, ultra-low-latency inference for LHC triggers beyond small-footprint models by proposing PHAZE, a two-stage framework that combines probabilistic hashing with zero-knowledge machine learning (zkML) and an early-exit mechanism. In the offline build phase, PHAZE compresses the activations of a large baseline model into injective polynomials, generates Rabin fingerprints, and constructs a Verifiable Decision Map (VDM) that is certified by zk-STARK proofs. In the online phase, it performs on-the-fly hashing and VDM lookups to yield decisions with nanosecond-scale latency while maintaining verifiability. The key contributions include the formal integration of Rabin fingerprinting, polynomial interpolation, and zk-STARK-based computational-integrity (CI) proofs; a practically bounded online latency estimate ($T_{\text{online}}$ on the order of $10^{-7}$–$10^{-6}$ s); a workflow for anomaly detection via map-misses; and a roadmap for dynamic, updatable VDMs. The framework aims to enable dynamic, ML-based trigger decisions at the LHC with built-in data-quality verification and potential for distributed hardware deployment, addressing both the performance and the reproducibility needs of future high-energy physics experiments.
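The build-then-lookup idea summarised above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the quantisation step, the toy threshold "baseline model", the function names, and the choice of modulus $x^{64} + x^4 + x^3 + x + 1$ are all assumptions made for illustration, and the polynomial-interpolation and zk-STARK certification steps are omitted entirely. The sketch shows only a Rabin-style fingerprint (the input bits read as a polynomial over GF(2), reduced modulo a fixed irreducible polynomial), an offline map from fingerprints to decisions, and an online lookup in which a map-miss returns `None`.

```python
import struct

# Assumed modulus: x^64 + x^4 + x^3 + x + 1, irreducible over GF(2).
DEGREE = 64
IRREDUCIBLE_POLY = (1 << DEGREE) | 0x1B


def rabin_fingerprint(data: bytes) -> int:
    """Read `data` as a polynomial over GF(2) and reduce it mod IRREDUCIBLE_POLY."""
    fp = 0
    for byte in data:
        for bit in range(7, -1, -1):
            fp = (fp << 1) | ((byte >> bit) & 1)  # multiply by x, add next bit
            if fp >> DEGREE:                      # degree reached 64: reduce
                fp ^= IRREDUCIBLE_POLY
    return fp


def quantize(features, step=0.5):
    """Hypothetical discretisation: snap each feature to a grid, pack as int16."""
    ints = [int(round(x / step)) for x in features]
    return struct.pack(f"<{len(ints)}h", *ints)


def toy_baseline_model(features):
    """Stand-in for the large baseline model: accept if the feature sum is large."""
    return sum(features) > 100.0


def build_vdm(reference_events, model, step=0.5):
    """Offline phase: map the fingerprint of each reference event to a decision."""
    return {rabin_fingerprint(quantize(ev, step)): model(ev)
            for ev in reference_events}


def online_decision(event, vdm, step=0.5):
    """Online phase: hash and look up; None signals a map-miss."""
    return vdm.get(rabin_fingerprint(quantize(event, step)))


vdm = build_vdm([[60.0, 50.0], [10.0, 5.0]], toy_baseline_model)
near_hit = online_decision([60.1, 49.9], vdm)  # same grid cell as [60.0, 50.0]
miss = online_decision([300.0, 1.0], vdm)      # unseen cell: anomaly candidate
```

In this toy setting, an event that falls into a grid cell seen during the build phase inherits the stored decision without the model being evaluated, while an unseen cell produces a miss that could be routed to an anomaly stream. A dictionary lookup in Python says nothing about the latency achievable on trigger hardware.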
Abstract
Low-latency event-selection (trigger) algorithms are essential components of Large Hadron Collider (LHC) operation. Modern machine learning (ML) models have shown great offline performance as classifiers and could improve trigger performance, thereby improving downstream physics analyses. However, inference with such large models cannot meet the online latency constraint imposed by the $40\,\text{MHz}$ collision rate at the LHC. In this work, we propose \texttt{PHAZE}, a novel framework built on cryptographic techniques such as hashing and zero-knowledge machine learning (zkML) to achieve low-latency inference via a certifiable early-exit mechanism from an arbitrarily large baseline model. We lay the foundations for such a framework to achieve nanosecond-order latency and discuss its inherent advantages, such as built-in anomaly detection, within the scope of LHC triggers, as well as its potential to enable a dynamic low-level trigger in the future.
