Parsing Millions of DNS Records per Second

Jeroen Koekkoek, Daniel Lemire

TL;DR

This work presents simdzone, a SIMD-accelerated two-stage parser for DNS zone files, designed to overcome the long-standing bottleneck of in-memory zone-file parsing. By processing 64-byte blocks with vectorized classification, employing a 67-entry perfect hash for RTYPE identification, and optimizing domain-name and binary-data handling, simdzone achieves approximately 1 GB/s throughput, about $3\times$ faster than Knot DNS and an order of magnitude faster than NSD in the reported benchmarks. The approach closely mirrors simdjson: it separates indexing from parsing and reduces branches and instructions, while tailoring the techniques to zone-file grammar (including directives, escapes, and RDATA formats). The authors provide an open-source implementation and benchmarking framework, enabling reproducibility and integration into DNS servers and other tooling to speed up zone loading and updates; according to the abstract, the library has already replaced the parser in NSD.

Abstract

The Domain Name System (DNS) plays a critical role in the functioning of the Internet. It provides a hierarchical name space for locating resources. Data is typically stored in plain text files, possibly spanning gigabytes. Frequent parsing of these files to refresh the data is computationally expensive: processing a zone file can take minutes. We propose a novel approach called simdzone to enhance DNS parsing throughput. We use data parallelism, specifically the Single Instruction Multiple Data (SIMD) instructions available on commodity processors. We show that we can multiply the parsing speed compared to state-of-the-art parsers found in Knot DNS and the NLnet Labs Name Server Daemon (NSD). The resulting software library replaced the parser in NSD.

Paper Structure

This paper contains 15 sections, 9 figures, 3 tables.

Figures (9)

  • Figure 1: Example of a simple zone file
  • Figure 2: Augmented Backus–Naur form (ABNF) grammar for items in resource records
  • Figure 3: Bit representation of a 64-byte zone file input. After computing the 'fields' value, the indexer would write 0, 14, 20, 24, 29, 63 to the first index array (the locations of the 1s in 'fields') and 12, 18, 22, 27, 55 to the second index array (the locations of the 1s in 'delimiters').
  • Figure 4: Structure in C containing a 64-byte block
  • Figure 5: Tables used for vectorized classification of the blank and special characters
  • ...and 4 more figures
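
The indexing step described in the Figure 3 caption, turning a 64-bit mask into a list of byte positions, can be sketched in portable C as follows. This is a minimal illustration, not simdzone's code: the function name `extract_positions`, its parameters, and the convention that bit i of the mask corresponds to byte i of the block are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of simdzone's API): write the position of
 * every set bit in `mask` to `out`, offset by `base` (the block's starting
 * offset in the input). Assumes bit i of `mask` describes byte i of the
 * block. `out` must have room for 64 entries. Returns the number written. */
static size_t extract_positions(uint64_t mask, uint32_t base, uint32_t *out)
{
    assert(out != NULL);
    size_t count = 0;
    while (mask != 0) {
        /* Portable count-trailing-zeros: find the lowest set bit. */
        uint32_t bit = 0;
        uint64_t m = mask;
        while ((m & 1u) == 0) {
            m >>= 1;
            bit++;
        }
        out[count++] = base + bit;
        mask &= mask - 1; /* clear the lowest set bit */
    }
    return count;
}
```

With a mask whose set bits are 0, 14, 20, 24, 29 and 63 (the 'fields' example from the caption), the function fills `out` with those six positions in ascending order. The inner loop is kept portable for clarity; with GCC or Clang it could likely be replaced by the `__builtin_ctzll` builtin, which typically compiles to a single instruction.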