Robust Distributed Arrays: Provably Secure Networking for Data Availability Sampling

Dankrad Feist, Gottfried Herold, Mark Simkin, Benedikt Wagner

TL;DR

This work introduces Robust Distributed Arrays (RDAs) as a provably secure networking layer for Data Availability Sampling (DAS) in open permissionless networks. By organizing storage as a $k_1 \times k_2$ grid and employing a subnet-discovery mechanism, the construction achieves low per-node data storage, low read latency, and robustness against arbitrary adversaries without requiring an honest majority. The authors formalize a comprehensive security model, define a robust data-structure specification with two interfaces ($\iStore$ and $\iGet$) and a position-binding predicate, and prove that their construction satisfies this robustness under admissible join-leave schedules and a robust subnet protocol. They also provide efficiency analyses, extensions for practical deployment, and benchmarks illustrating performance and trade-offs. Overall, this work closes a key gap in DAS by supplying a secure, scalable networking substrate suitable for real-world deployment and broader distributed-system applications.

Abstract

Data Availability Sampling (DAS), a central component of Ethereum's roadmap, enables clients to verify data availability without requiring any single client to download the entire dataset. DAS operates by having clients randomly retrieve individual symbols of erasure-encoded data from a peer-to-peer network. While the cryptographic and encoding aspects of DAS have recently undergone formal analysis, the peer-to-peer networking layer remains underexplored, with a lack of security definitions and efficient, provably secure constructions. In this work, we address this gap by introducing a novel distributed data structure that can serve as the networking layer for DAS, which we call robust distributed arrays. That is, we rigorously define a robustness property of a distributed data structure in an open permissionless network, that mimics a collection of arrays. Then, we give a simple and efficient construction and formally prove its robustness. Notably, every individual node is required to store only small portions of the data, and accessing array positions incurs minimal latency. The robustness of our construction relies solely on the presence of a minimal absolute number of honest nodes in the network. In particular, we avoid any honest majority assumption. Beyond DAS, we anticipate that robust distributed arrays can have wider applications in distributed systems.

Paper Structure

This paper contains 28 sections, 27 theorems, 54 equations, 10 figures.

Key Result

Proposition 1

Let $\party$ be an honest party and $0\leq \tau_0 \leq \tau_1$ be points in time. Then, for any point in time $\tau$ with $\tau_0 \leq \tau \leq \tau_1$, we have Consequently, we have

Figures (10)

  • Figure 1: Visualization of our protocol $\ourprot$ with $k_1 = 4$ rows and $k_2 = 5$ columns. Each row and column represents a subnet (i.e., clique) for the subnet discovery protocol $\subnetprot$ as per \ref{def:subnetdiscovery}. Parties in column $j \in [k_2]$ store the $j$th chunk of the data. For simplicity, we assume that the number of symbols is $\FileLength = k_2$ in this visualization. Left: node in column subnet $2$ and row subnet $3$ storing a symbol that must be stored in column subnet $4$. First, the node sends it to all nodes in cell $(3,4)$, then every such node forwards it to all nodes in that column. Right: node in column subnet $5$ and row subnet $1$ querying a symbol that must be stored in column subnet $4$. First, the node sends a query to all nodes in cell $(1,4)$. Then, all nodes respond with the symbol.
  • Figure 2: Initialization and helper algorithms for our distributed array protocol $\ourprot$. It makes use of a subnet discovery protocol $\subnetprot$ for $\numberofsubnets = k_1 + k_2$ and a random oracle $\Hash\colon\boolstar\rightarrow[k_1]\times[k_2]$. We present code for joining in \ref{fig:ourprotocol:joining} and interfaces $\iStore$ and $\iGet$ in \ref{fig:ourprotocol:storeget}.
  • Figure 3: Joining instructions for our distributed array protocol $\ourprot$ with file space $\SymbolSpace^\FileLength$ and handle space $\HandleSpace$, for a predicate $\rdspred\colon \HandleSpace \times [\FileLength] \times \SymbolSpace \to\bool$. It makes use of a subnet discovery protocol $\subnetprot$ for $\numberofsubnets = k_1 + k_2$ and a random oracle $\Hash\colon\boolstar\to[k_1]\times[k_2]$. The party that is executing the instructions is referred to as $\self$. We present the initialization code in \ref{fig:ourprotocol:initialization}, interfaces $\iStore$ and $\iGet$ in \ref{fig:ourprotocol:storeget}.
  • Figure 4: Interfaces $\iStore$ and $\iGet$ of $\ourprot$ with file space $\SymbolSpace^\FileLength$ and handle space $\HandleSpace$, for a predicate $\rdspred\colon \HandleSpace \times [\FileLength] \times \SymbolSpace \rightarrow \bool$. It makes use of a subnet discovery protocol $\subnetprot$ for $\numberofsubnets = k_1 + k_2$ and a random oracle $\Hash\colon\boolstar\rightarrow[k_1]\times[k_2]$. The party that is executing the instructions is referred to as $\self$. Initialization and helper algorithms are defined in \ref{fig:ourprotocol:initialization}, and code for joining in \ref{fig:ourprotocol:joining}.
  • Figure 5: Trade-off between $k_1$ (number of rows) and $k_2$ (number of columns) and resulting complexities for $\RelPosCorrThresh = 0.1$. The complexities show the expected total message complexity when an honest party calls the respective interfaces. All plots assume $\RobustnessError \leq 10^{-9}$, and the given $N$. Scales are logarithmic.
  • ...and 5 more figures
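
The store and query flows described in the Figure 1 caption can be illustrated with a small toy simulation. This is only a sketch under simplifying assumptions, not the paper's protocol: the class `ToyGrid` and its methods are hypothetical names, parties are placed in cells by hand rather than via the random oracle $\Hash$, indices are 0-based, all parties are honest, and handles, the predicate $\rdspred$, and the subnet discovery protocol $\subnetprot$ are left out entirely. As in the caption, the number of symbols is assumed to equal $k_2$, so symbol $j$ belongs to column $j$.

```python
from collections import defaultdict

K1, K2 = 4, 5  # rows and columns as in Figure 1 (0-indexed in this sketch)


class ToyGrid:
    """Hypothetical toy model of the grid; all parties are assumed honest."""

    def __init__(self):
        self.cell_members = defaultdict(set)  # (row, col) -> party names
        self.cell_of = {}                     # party name -> (row, col)
        self.stored = defaultdict(dict)       # party name -> {position: symbol}

    def join(self, party, row, col):
        # Assumption: the cell is given explicitly instead of derived from a hash.
        self.cell_members[(row, col)].add(party)
        self.cell_of[party] = (row, col)

    def column_members(self, col):
        # All parties in the column subnet `col`, across every row.
        return {p for row in range(K1) for p in self.cell_members[(row, col)]}

    def store(self, sender, position, symbol):
        row, _ = self.cell_of[sender]
        target_col = position  # simplification: symbol j lives in column j
        # Step 1: the sender reaches every node in cell (own row, target column).
        relays = set(self.cell_members[(row, target_col)])
        # Step 2: each such node forwards the symbol to the whole target column.
        for _relay in relays:
            for party in self.column_members(target_col):
                self.stored[party][position] = symbol

    def get(self, sender, position):
        row, _ = self.cell_of[sender]
        target_col = position
        # The query goes to cell (own row, target column); holders respond.
        cell = sorted(self.cell_members[(row, target_col)])
        return [self.stored[p][position] for p in cell if position in self.stored[p]]


# Usage: one party per cell, named p<row><col>.
grid = ToyGrid()
for r in range(K1):
    for c in range(K2):
        grid.join(f"p{r}{c}", r, c)

grid.store("p21", position=3, symbol="s3")  # left panel of Figure 1, 0-indexed
result = grid.get("p04", position=3)        # right panel of Figure 1, 0-indexed
```

In this toy model, both operations touch only the caller's row and the target column, which matches the two-hop pattern in the caption. If the cell at the intersection of the caller's row and the target column were empty, the store would reach nobody and the query would return no responses; the sketch does not model how the paper's analysis rules this out.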

Theorems & Definitions (64)

  • Definition 1: Security Experiment
  • Remark 1: Setup
  • Definition 2: Activity Events
  • Proposition 1: Activity is an Interval
  • proof
  • Definition 3: Distributed Array
  • Definition 4: Robustness
  • Remark 2: Position-Binding
  • Remark 3: No Secrecy
  • Remark 4: Agreement on the Handle
  • ...and 54 more