The Derivation Penalty in Premise-Erasure Caching: Capacity, Strong Converse, and Dispersion Dichotomy
Jianfeng Xu
TL;DR
An information-theoretic framework for caching in derivation-based reasoning engines under independent premise erasure, and the ratio of the derivation-constrained cache to the coded cache converges to the reciprocal of the erasure rate, universally across query counts, overlap structures, and reliability targets.
Abstract
We introduce an information-theoretic framework for caching in derivation-based reasoning engines under independent premise erasure. Two decoder models are compared: a coded scheme using an arbitrary bit-string cache with a general-purpose decoder, and a derivation-constrained scheme where the cache consists of logical facts and the decoder must produce a valid proof. Four coding theorems are established. The first proves that each derivation step carries a universal per-step information content determined by the base size. The second reveals an exponential capacity separation between linear-chain and balanced-merge Datalog architectures at equal depth. The third identifies a critical access frequency separating the regimes where caching and on-demand derivation are optimal. The fourth determines the minimum derivation-constrained cache under erasure, decomposing query information into reliable cache and noisy channel capacity. The central result is the derivation penalty: the ratio of the derivation-constrained cache to the coded cache converges to the reciprocal of the erasure rate, universally across query counts, overlap structures, and reliability targets. This penalty originates from a structural caching rigidity theorem showing that only cache facts within the target query's derivation DAG contribute to resilience, precluding cross-coordinate error correction. Beyond capacity, we prove a strong converse at the KL-divergence rate with Bahadur--Rao prefactors, a dispersion dichotomy (positive coded dispersion versus zero derivation-constrained dispersion), and a complete eight-regime phase diagram. The architecture-dependent depth-to-dependency mapping yields exponentially sharper phase transitions for the merge architecture. All results transfer across synonymous representations.
