Scalability limitations of Kademlia DHTs when enabling Data Availability Sampling in Ethereum
Mikel Cortes-Goicoechea, Csaba Kiraly, Dmitriy Ryajov, Jose Luis Muñoz-Tapia, Leonardo Bautista-Gomez
TL;DR
This work investigates whether a Kademlia-based DHT can support Data Availability Sampling (DAS) for Ethereum’s DankSharding, using a configurable DAS-DHT simulator validated against IPFS experiments. The study finds that, while DHT lookups can achieve sub-second latency at the 90th percentile under favorable conditions, seeding hundreds of thousands of small samples across thousands of nodes is a major scalability bottleneck. Single-seeder and centralized seeding approaches quickly become impractical within Ethereum’s DAS time constraints, and even distributed seeding via GossipSub or alternative routing strategies carry substantial trade-offs in load, latency, and privacy. The results suggest that a pure DHT-based seeding path may be insufficient for scalable DAS, motivating the exploration of hybrid or alternative data availability mechanisms that maintain decentralization while reducing seeding overhead.
Abstract
Scalability in blockchains remains a significant challenge, especially when prioritizing decentralization and security. The Ethereum community has proposed comprehensive data-sharding techniques to overcome storage, computational, and network processing limitations. In this context, the propagation and availability of large blocks have become a subject of research on the path to scalable data-sharding. This paper reports insights from exploring the use of a Kademlia-based DHT to enable Data Availability Sampling (DAS) in Ethereum. It presents a DAS-DHT simulator to study this problem and validates the simulator's results with experiments in a real DHT network, IPFS. Our results help clarify which parts of DAS can be achieved with existing Kademlia DHT solutions and which cannot. We discuss the limitations of DHT-based solutions and consider alternatives.
