Categorical Framework for Quantum-Resistant Zero-Trust AI Security

I. Cherkaoui, C. Clarke, J. Horgan, I. Dey

TL;DR

This work proposes a category-theoretic framework that unifies post-quantum cryptography with zero-trust architecture to secure AI at the edge. It centers on LWE augmented by Engel expansion-based deterministic randomness and provides formal categorical models (morphisms, functors, Yoneda embeddings, Kan extensions) that enable compositional security guarantees and crypto-agility. Empirically, the ESP32 implementation demonstrates sub-millisecond unauthorized access rejection, favorable memory footprints, and end-to-end latency where AI inference and network delays dominate, validating practicality for IoT/edge scenarios. The approach delivers theoretical reductions in key-sampling and computation, and presents an ITS-inspired, information-theoretic perspective via wire-tap channel modeling, highlighting the framework's potential for scalable, quantum-resistant security in real-time systems.

Abstract

The rapid deployment of AI models necessitates robust, quantum-resistant security, particularly against adversarial threats. Here, we present a novel integration of post-quantum cryptography (PQC) and zero trust architecture (ZTA), formally grounded in category theory, to secure AI model access. Our framework uniquely models cryptographic workflows as morphisms and trust policies as functors, enabling fine-grained, adaptive trust and micro-segmentation for lattice-based PQC primitives. This approach offers enhanced protection against adversarial AI threats. We demonstrate its efficacy through a concrete ESP32-based implementation, validating a crypto-agile transition with quantifiable performance and security improvements, underpinned by categorical proofs for AI security. The implementation achieves significant memory efficiency on ESP32, with the agent utilizing 91.86% and the broker 97.88% of free heap after cryptographic operations, and successfully rejects 100% of unauthorized access attempts with sub-millisecond average latency.
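
To make the idea of LWE with deterministic noise more concrete, the following is a minimal sketch and not the authors' implementation. It computes the Engel expansion of a rational seed with the standard greedy algorithm and then, purely as an assumption for illustration, folds the resulting digits into small centered error terms for a textbook Regev-style single-bit LWE scheme. The names `engel_digits` and `engel_noise`, the seed denominator, and the toy parameters are all hypothetical and far too small to be secure.

```python
import math
import random
from fractions import Fraction


def engel_digits(x, count):
    """Greedy Engel expansion: x = 1/a1 + 1/(a1*a2) + ... for 0 < x <= 1."""
    digits = []
    u = Fraction(x)
    while u > 0 and len(digits) < count:
        a = math.ceil(1 / u)   # smallest integer a with a * u >= 1
        digits.append(a)
        u = u * a - 1          # remainder carried into the next term
    return digits


def engel_noise(m, bound, denom=10007):
    """ASSUMPTION: hypothetical mapping of Engel digits to errors in [-bound, bound]."""
    noise = []
    for i in range(1, m + 1):
        digits = engel_digits(Fraction(i, denom), 4)
        noise.append(sum(digits) % (2 * bound + 1) - bound)
    return noise


def keygen(n, m, q, noise, rng):
    """Secret s and public key (A, b = A*s + e mod q)."""
    s = [rng.randrange(q) for _ in range(n)]
    A = [[rng.randrange(q) for _ in range(n)] for _ in range(m)]
    b = [(sum(a * si for a, si in zip(row, s)) + e) % q
         for row, e in zip(A, noise)]
    return s, (A, b)


def encrypt(pub, bit, q, rng):
    """Sum a random subset of public samples and embed the bit at q // 2."""
    A, b = pub
    subset = [i for i in range(len(A)) if rng.random() < 0.5]
    u = [sum(A[i][j] for i in subset) % q for j in range(len(A[0]))]
    v = (sum(b[i] for i in subset) + bit * (q // 2)) % q
    return u, v


def decrypt(s, ct, q):
    """Recover the bit by checking whether v - <u, s> is closer to q/2 than to 0."""
    u, v = ct
    d = (v - sum(ui * si for ui, si in zip(u, s))) % q
    return 1 if q // 4 < d < 3 * q // 4 else 0


# Toy round trip with illustrative parameters.
rng = random.Random(0)
n, m, q = 16, 64, 4096
s, pub = keygen(n, m, q, engel_noise(m, bound=2), rng)
assert all(decrypt(s, encrypt(pub, bit, q, rng), q) == bit for bit in (0, 1))
```

Decryption succeeds here because the accumulated error is at most `m * bound = 128`, well below `q / 4 = 1024`. How the paper actually derives its error terms from Engel expansions is not specified in this summary, so the mapping in `engel_noise` should be read only as a placeholder.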

Paper Structure

This paper contains 16 sections, 1 theorem, 13 equations, 14 figures, 20 tables.

Key Result

Theorem 1

The complexity is reduced to $O(n)$ with the category model through the Yoneda Embedding representing the vectors as $\hom(-, \mathbb{Z}_q)$ hom-functors, the inner products as coends $\langle a, s \rangle = \int^{k} \hom(k, a) \times \hom(k, s)$ (The coend generalizes the tensor product for bifunct

Figures (14)

  • Figure 1: ZTA design for signed, time-stamped requests using quantum-resistant SSL encryption, with a privileged LAN zone that is accessible only upon authentication and only for pre-approved resources.
  • Figure 2: Heatmaps comparing the statistical divergence between ciphertext distributions of an LWE cryptosystem under standard Gaussian noise versus deterministic noise obtained from Engel expansions, for different parameters. The top row (a to c) shows the Wasserstein distance and the bottom row (d to f) the Kullback-Leibler (KL) divergence, each computed over 300 encryptions of the message $m=0$ per configuration. Subfigures (a) and (d) vary the noise standard deviation $\sigma$ and modulus $q$ with the dimension fixed at $n=256$. Subfigures (b) and (e) fix $q=1024$ and vary $\sigma$ and $n$, while (c) and (f) fix $\sigma=8.0$ and vary both $n$ and $q$.
  • Figure 3: Density distributions of ciphertext values under varying parameters $n$ (dimension), $\sigma$ (error standard deviation), and $q$ (modulus), comparing standard Gaussian noise with deterministic randomness based on Engel expansions. Each subplot fixes two parameters and varies the third to isolate its effect. The top-left and bottom-right plots vary $q$ with $n=512$ and $\sigma=8.0$ fixed, showing a shift in peak concentration and spread. The top-middle and bottom-left plots vary $\sigma$ with $n=512$ and $q=16384$ fixed, showing that increasing noise smooths the distribution. The top-right and bottom-middle plots vary $n$ with $\sigma=8.0$ and $q=16384$ fixed, showing the effect of dimensionality on how flat or concentrated the distribution is.
  • Figure 4: Memory consumption timelines for LWE cryptographic operations under parameter variations. Left: dimension $n \in \{128,192,256,320,384\}$, showing quadratic growth in memory consumption that peaks at 225 kB during encryption. Center: modulus $q \in \{1024,2048,4096,8192,16384\}$, showing logarithmic scaling with distinct plateaus. Right: noise $\sigma \in \{2.0,3.2,4.0,5.0,6.0\}$, showing a near-linear relation while preserving the timing of peak memory allocation. All plots share the same timing characteristics: the encryption phases (red-shaded intervals between 1 and 2 seconds) add a memory overhead of 12-18 percent over the baseline. The $y$-axis scale differs between plots to reveal the trends specific to each parameter while keeping the time patterns comparable.
  • Figure 5: Comparative power consumption traces for AI model encryption on ESP32. Each subplot displays the oscilloscope capture ($10$ mV/div, $1$ ms/div) of ESP32 power draw during LWE encryption of responses from different AI models. (a) Llama model encryption: baseline $60.0$ mA ($300$ mW), average $60.7$ mA ($303.4$ mW), and peak $95.8$ mA ($479.1$ mW) ($33.9$% above average). Distinct computational phases are visible at $3$ ms (key initialization), $5.5$ ms (intermediate arithmetic), and $8$ ms (intensive lattice operations). Baseline fluctuations (a dip around $2$ ms) indicate background memory management. The spaced spike intervals hint at batched cryptographic processing. (b) GPT model encryption: baseline $60.0$ mA ($300$ mW), average $60.5$ mA ($300.5$ mW), and peak $93.9$ mA ($469.3$ mW) ($55.1$% above average). A cluster of spikes at $5.5$ ms (parallelizable operations), $6.8$ ms (potential cache miss), and $11$ ms (multi-round encryption) marks pipelined LWE computations. A higher peak-to-average ratio compared to Llama suggests more encryption rounds for longer messages. (c) Mistral model encryption: baseline $60.0$ mA ($300$ mW), average $60.9$ mA ($304.5$ mW), and peak $95.3$ mA ($476.7$ mW) ($56.6$% above average). Key events occur at $5.8$ ms (slightly delayed relative to GPT due to pipeline stalls) and $11$ ms (dominant lattice operations). Peak timing and magnitude are nearly identical to GPT ($1.1$% difference), which points to shared computational patterns, likely from shared message structures or model architectures.
  • ...and 9 more figures
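
The divergence measurements described for Figure 2 can be illustrated with a small, self-contained sketch. It estimates the one-dimensional Wasserstein-1 distance (as the mean gap between sorted samples) and a histogram-based KL divergence between two equal-sized samples of values in $[0, q)$. The binning, the additive smoothing constant `eps`, and all function names are assumptions of this example; the summary does not state how the authors computed either quantity, and the random samples below merely stand in for ciphertext values.

```python
import math
import random


def wasserstein_1d(xs, ys):
    """Wasserstein-1 distance between two equal-size empirical samples."""
    if len(xs) != len(ys):
        raise ValueError("samples must have the same size")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def histogram(samples, q, bins, eps=1e-9):
    """Smoothed probability histogram of integer values in [0, q)."""
    counts = [eps] * bins          # eps avoids log(0) for empty bins
    for v in samples:
        counts[min(v * bins // q, bins - 1)] += 1
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(p, r):
    """KL(P || R) in nats for two discrete distributions on the same bins."""
    return sum(pi * math.log(pi / ri) for pi, ri in zip(p, r))


# Placeholder data: 300 values per sample, matching the 300 encryptions
# per configuration mentioned in the Figure 2 caption.
rng = random.Random(1)
q = 1024
sample_a = [rng.randrange(q) for _ in range(300)]
sample_b = [rng.randrange(q) for _ in range(300)]

w1 = wasserstein_1d(sample_a, sample_b)
kl = kl_divergence(histogram(sample_a, q, 16), histogram(sample_b, q, 16))
print(f"Wasserstein-1: {w1:.3f}, KL: {kl:.5f}")
```

Sorting both samples works because, on the real line, the optimal transport plan between two empirical distributions of equal size pairs their order statistics. Note that this treats values in $[0, q)$ as points on a line rather than on a cycle modulo $q$, which is a further simplification.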

Theorems & Definitions (2)

  • Theorem 1
  • Proof 2