CertainSync: Rateless Set Reconciliation with Certainty

Tomer Keniagin, Eitan Yaakobi, Ori Rottenstreich

TL;DR

CertainSync introduces a parameter-free, rateless framework for set reconciliation that guarantees exact recovery of the symmetric difference without estimating its size $|\Delta|$, by employing $d$-decodable rateless matrices mapped to IBLTs. It provides three concrete constructions (EGH, OLS, Extended Hamming) with provable certainty guarantees and demonstrates competitive performance against Rateless IBLT and Graphene. It also adds UniverseReduceSync, an extension for large universes such as blockchain TxHash data. The approach enables robust reconciliation under unknown $|\Delta|$ with overhead bounds tied to $|\Delta|$ and the universe size $n$, and offers a practical pathway for blockchain synchronization through universe-size reduction. The work also outlines future directions: new combinatorial constructions, extension of the framework to multi-party reconciliation, and detailed runtime analyses in real-world networks.

Abstract

Set reconciliation is a fundamental task in distributed systems, particularly in blockchain networks, where it enables synchronization of transaction pools among peers and facilitates block dissemination. Traditional set reconciliation schemes are either statistical, offering success probability as a function of communication overhead and symmetric difference size, or require parametrization and estimation of that size, which can be error-prone. We present CertainSync, a novel reconciliation framework that, to the best of our knowledge, is the first to guarantee successful set reconciliation without any parametrization or estimators. The framework is rateless and adapts to the unknown symmetric difference size. Reconciliation is guaranteed whenever the communication overhead reaches a lower bound derived from the symmetric difference size and universe size. Our framework builds on recent constructions of Invertible Bloom Lookup Tables (IBLTs), ensuring successful element listing as long as the number of elements is bounded. We provide a theoretical analysis proving the certainty of reconciliation for multiple constructions. Our approach is validated by simulations, showing the ability to synchronize sets with efficient communication costs while maintaining guarantees compared to baseline schemes. To further reduce overhead in large universes such as blockchain networks, CertainSync is extended with a universe reduction technique. We compare and validate this extension, UniverseReduceSync, against the basic framework using real Ethereum transaction hash data. Results show a trade-off between lower communication costs and maintaining guarantees, offering a comprehensive solution for diverse reconciliation scenarios.

Paper Structure

This paper contains 42 sections, 9 theorems, 17 equations, 32 figures, 5 tables, 5 algorithms.

Key Result

Theorem 1

For $n$ and $d$, the EGH matrix $M^{I}_{n,d}$ is a $(d{+}1)$-decodable rateless matrix. For $2\leq i\leq d+1$, the decodability profile is $m_i = \sum_{j=1}^{k_i} p_j,$ where $k_i$ is the smallest integer such that $\Pi_{k_i}\geq n^{i}$. Furthermore, $m_1=m_2$.
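The decodability profile in Theorem 1 can be evaluated numerically. The sketch below is a minimal illustration, not the authors' implementation. It assumes that $p_j$ denotes the $j$-th prime and that $\Pi_k$ is the product of the first $k$ primes, which this excerpt does not state explicitly. The names `first_primes` and `egh_profile` are hypothetical.

```python
from itertools import count


def first_primes():
    """Yield the primes 2, 3, 5, ... using simple trial division."""
    found = []
    for candidate in count(2):
        if all(candidate % p for p in found):
            found.append(candidate)
            yield candidate


def egh_profile(n, d):
    """Return [m_1, ..., m_{d+1}] as stated in Theorem 1.

    Assumption: p_j is the j-th prime and Pi_k = p_1 * ... * p_k.
    For 2 <= i <= d+1, m_i is the sum of the first k_i primes, where
    k_i is the smallest k with Pi_k >= n**i. The theorem sets m_1 = m_2.
    """
    profile = {}
    for i in range(2, d + 2):
        target = n ** i
        product, total = 1, 0
        for p in first_primes():
            product *= p
            total += p
            if product >= target:
                break
        profile[i] = total
    profile[1] = profile[2]
    return [profile[i] for i in range(1, d + 2)]


if __name__ == "__main__":
    # n = 10, d = 2: 2*3*5*7 = 210 >= 10**2 and 210*11 = 2310 >= 10**3
    print(egh_profile(10, 2))  # [17, 17, 28]
```

Under these assumptions, a universe of size $n = 10$ with $d = 2$ gives $m_1 = m_2 = 2+3+5+7 = 17$ and $m_3 = 17 + 11 = 28$.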

Figures (32)

  • Figure 1: Comparison of set reconciliation schemes based on three metrics: (i) overhead of parametrization tuning and/or symmetric difference size estimation (lower is better), (ii) certainty of success (higher is better), and (iii) rateless adaptability (higher is better), which indicates how efficiently a scheme adapts dynamically to varying set differences while minimizing communication overhead and retransmissions. A detailed explanation of the parameters and estimators used for each scheme is presented in Table \ref{tab:parameterization} in Appendix \ref{appx:set_reconciliation_solutions}.
  • Figure 3: Blockchain network modeled as a peer-to-peer (P2P) system where each peer maintains a local ledger (immutable chain of blocks) and a transaction pool (TxPool). Peers synchronize transactions and mined blocks to achieve consensus across the network.
  • Figure : (a) Two Latin squares of order 5.
  • Figure : (a) Traditional Set Reconciliation
  • Figure : (a) Simplified case where $S_2$ is a superset of $S_1$ ($S_1 \subseteq S_2$).
  • ...and 27 more figures

Theorems & Definitions (25)

  • definition 1: Symmetric Difference
  • definition 2: Extended Hamming Code
  • definition 3: Stopping Set
  • definition 4: Rateless Coding
  • definition 5: $(n,d)$-LFFZ
  • definition 6: Binary Mapping Matrix $M$
  • definition 7: $d$-decodable matrix
  • definition 8: Set Reconciliation with Certainty
  • definition 9: $d$-decodable Rateless Matrix
  • definition 10
  • ...and 15 more