Knowledge is Overrated: A zero-knowledge machine learning and cryptographic hashing-based framework for verifiable, low latency inference at the LHC

Pratik Jawahar, Caterina Doglioni, Maurizio Pierini

TL;DR

The paper addresses the challenge of delivering verifiable, ultra-low-latency inference for LHC triggers beyond small-footprint models by proposing PHAZE, a two-stage framework that combines probabilistic hashing with zero-knowledge machine learning (zkML) and an early-exit mechanism. In the build phase, PHAZE compresses large baseline model activations into injective polynomials, generates Rabin fingerprints, and constructs a Verifiable Decision Map (VDM) that is certified by zk-STARK proofs; online, it performs on-the-fly hashing and VDM lookups to yield decisions with nanosecond-scale latency while maintaining verifiability. The key contributions include the formal integration of Rabin fingerprinting, polynomial interpolation, zk-STARK-based CI proofs, and a practically bounded online latency estimate ($T_{\text{online}}$ on the order of $10^{-7}$–$10^{-6}$ s), along with a workflow for anomaly detection via map-misses and a roadmap for dynamic, updatable VDMs. The framework aims to enable dynamic, ML-based trigger decisions at the LHC with built-in data-quality verification and potential for distributed hardware deployment, addressing both performance and reproducibility needs for future high-energy physics experiments.
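The fingerprint-then-lookup idea summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`rabin_fingerprint`, `build_decision_map`, `online_lookup`), the byte encoding of an event, and the fixed degree-64 modulus $x^{64}+x^{4}+x^{3}+x+1$ are all assumptions made for illustration (Rabin's scheme draws the irreducible polynomial at random), and the polynomial compression of activations and the zk-STARK certification of the map are left out entirely.

```python
from typing import Callable, Dict, Iterable, Optional

DEGREE = 64
# Assumed fixed modulus x^64 + x^4 + x^3 + x + 1 over GF(2).
# Rabin's scheme would pick a random irreducible polynomial instead.
MODULUS = (1 << DEGREE) | 0b11011


def rabin_fingerprint(data: bytes) -> int:
    """Read `data` as a polynomial over GF(2) and reduce it modulo MODULUS."""
    fp = 1  # implicit leading 1 bit so that leading zero bytes still matter
    for byte in data:
        for shift in range(7, -1, -1):
            # Multiply by x and add the next message bit.
            fp = (fp << 1) | ((byte >> shift) & 1)
            # Degree reached 64: subtract (XOR) the modulus once.
            if fp >> DEGREE:
                fp ^= MODULUS
    return fp


def build_decision_map(
    events: Iterable[bytes], decide: Callable[[bytes], bool]
) -> Dict[int, bool]:
    """Build phase (hypothetical): fingerprint each event and store the
    decision of the expensive baseline model `decide` under that key."""
    return {rabin_fingerprint(event): decide(event) for event in events}


def online_lookup(vdm: Dict[int, bool], event: bytes) -> Optional[bool]:
    """Online phase (hypothetical): hash on the fly and look up the decision.
    A return value of None is a map-miss, i.e. an anomaly candidate."""
    return vdm.get(rabin_fingerprint(event))


if __name__ == "__main__":
    # Toy stand-in for the baseline model: accept events whose bytes sum to > 10.
    baseline = lambda event: sum(event) > 10
    vdm = build_decision_map([b"\x01\x02\x03", b"\x09\x09\x09"], baseline)
    print(online_lookup(vdm, b"\x09\x09\x09"))  # stored decision for a seen event
    print(online_lookup(vdm, b"\xff\xff\xff"))  # unseen event: map-miss
```

In this sketch the online cost is one pass over the event bits plus a dictionary lookup, and unseen inputs surface naturally as map-misses; a hardware deployment would replace both with fixed-latency logic.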

Abstract

Low-latency event-selection (trigger) algorithms are essential components of Large Hadron Collider (LHC) operation. Modern machine learning (ML) models have shown great offline performance as classifiers and could improve trigger performance, thereby improving downstream physics analyses. However, inference on such large models does not satisfy the $40\,\text{MHz}$ online latency constraint at the LHC. In this work, we propose PHAZE, a novel framework built on cryptographic techniques like hashing and zero-knowledge machine learning (zkML) to achieve low-latency inference via a certifiable, early-exit mechanism from an arbitrarily large baseline model. We lay the foundations for such a framework to achieve nanosecond-order latency and discuss its inherent advantages, such as built-in anomaly detection, within the scope of LHC triggers, as well as its potential to enable a dynamic low-level trigger in the future.

Paper Structure

This paper contains 24 sections, 5 equations, 1 figure, 1 table.

Figures (1)

  • Figure 1: Early build-phase feasibility plots showing time complexity scaling for ezkl proof generation (top-left), verification (top-right) and fingerprinting (middle-left); net fingerprinting throughput (middle-right); peak memory consumption for ezkl proof generation (bottom-left) and fingerprinting (bottom-right). All values are reported per event, across $10$ independent trials. Note that for Shamir Secret Sharing to act as a fingerprinting algorithm, the timing information is roughly the sum of the Share and Reconstruct stages, while the peak memory consumption is the greater of the two at any given operation point.