Table of Contents
Fetching ...

Canonical Byte-String Encoding for Finite-Ring Cryptosystems

Kyrylo Riabov, Serhii Kryvyi

Abstract

Ring-mapping protocols need a canonical byte-to-residue layer before any algebraic encryption step can begin. This paper isolates that layer and presents the base-m length codec, a canonical map from byte strings of length less than 2^64 to lists of residues modulo m. The encoder builds on and adapts an rANS-based system proposed by Duda. Decoding is exact for all moduli satisfying the paper's parameter bounds. Because the encoding carries the byte length in its fixed-width header, decoding is also tolerant to appended valid suffix digits. The paper is accompanied by a Rust implementation of the described protocol, a Lean 4 formalization of the abstract codec with machine-checked proofs, and performance benchmarks. The Lean 4 formalization establishes fixed-width prefix inversion and payload-state bounds below 2^64, stream-level roundtrip correctness, and that every emitted symbol is a valid residue modulo m. We conclude with a complexity analysis and a discussion of practical considerations arising in real-world use of the codec.

Canonical Byte-String Encoding for Finite-Ring Cryptosystems

Abstract

Ring-mapping protocols need a canonical byte-to-residue layer before any algebraic encryption step can begin. This paper isolates that layer and presents the base-m length codec, a canonical map from byte strings of length less than 2^64 to lists of residues modulo m. The encoder builds on and adapts an rANS-based system proposed by Duda. Decoding is exact for all moduli satisfying the paper's parameter bounds. Because the encoding carries the byte length in its fixed-width header, decoding is also tolerant to appended valid suffix digits. The paper is accompanied by a Rust implementation of the described protocol, a Lean 4 formalization of the abstract codec with machine-checked proofs, and performance benchmarks. The Lean 4 formalization establishes fixed-width prefix inversion and payload-state bounds below 2^64, stream-level roundtrip correctness, and that every emitted symbol is a valid residue modulo m. We conclude with a complexity analysis and a discussion of practical considerations arising in real-world use of the codec.
Paper Structure (19 sections, 3 theorems, 14 equations, 1 figure, 2 tables, 2 algorithms)

This paper contains 19 sections, 3 theorems, 14 equations, 1 figure, 2 tables, 2 algorithms.

Key Result

Theorem 4.1

For every supported modulus $m$ and every admissible byte string $\mathit{bytes} \in \mathsf{ByteString}_{64}$, Moreover, for every valid suffix $t \in \mathsf{List}(\mathsf{Fin}(m))$,

Figures (1)

  • Figure 5.1: Representation cost as a function of the modulus $m$ over the range $25 \le m \le 257$. The upper panel shows the smooth payload rate $\log_m 256$, while the lower panel shows the stepwise fixed header cost $2k$.

Theorems & Definitions (8)

  • Definition 2.1: Supported modulus
  • Definition 2.2: Admissible byte string
  • Definition 3.1: Prefix width
  • Definition 3.2: Wire format
  • Example 3.1: Worked example for "Hi" at $m=50$
  • Theorem 4.1: Stream correctness
  • Theorem 4.2: Header correctness
  • Proposition 4.3: Payload-state bounds