The Ubiquitous Skiplist: A Survey of What Cannot be Skipped About the Skiplist and its Applications in Big Data Systems

Lu Xing, Venkata Sai Pavan Kumar Vadrevu, Walid G. Aref

TL;DR

This survey systematically catalogs skiplist variants, their probabilistic foundations, and their deployment in modern big-data systems. It covers the fundamental operations and their expected $O(\log n)$ time behavior, then surveys concurrent, deterministic, and skew-aware designs, including MVCC and GPU-enabled skiplists. The paper also maps skiplists to related data structures (notably the B/B$^{+}$-tree) and details their integration with LSM-trees, KV stores, and complex data indexing (multi-dimensional and interval variants). It further explains hardware-conscious optimizations (cache locality, external memory, NUMA, NMP, GPU) and diverse applications from KV stores to networks and blockchain, highlighting the skiplist's versatility and scalability. The work provides a forward-looking view on adapting skiplists to emerging hardware and workloads, and outlines open challenges such as ML-guided skiplist design and deeper deterministic-probabilistic trade-offs.

Abstract

Skiplists have become prevalent in systems. The main advantages of skiplists are their simplicity and ease of implementation, and their ability to support operations with the same asymptotic complexities as their tree-based counterparts. In this survey, we explore skiplists and their many variants. We highlight many scenarios in which skiplists are useful, and explain why they fit these usage scenarios well. We also compare skiplists with other data structures, especially tree-based structures. Extensions to skiplists include structural modifications, as well as algorithmic enhancements and operations. We categorize the existing extensions, and summarize the skiplist variants that belong to each category. We present how data systems incorporate skiplist variants into many different application scenarios to serve various purposes. These data systems cover a wide range of applications, from data indexing to blockchain, from network algorithms to deterministic skiplists, etc. This breadth illustrates the impactful and diverse applications of skiplists in various domains of data systems.

Paper Structure (49 sections, 17 figures, 1 table, 1 algorithm)

Figures (17)

  • Figure 1: The linked list and the skiplist that skips every other data item in the linked list below.
  • Figure 2: The skiplist and its basic operations. (a) A skiplist with 7 keys, where the header and Nodes 21 and 25 have three forward pointers. The search path for Key 33 is illustrated by the blue dotted line; the search path for Key 15 is illustrated by the red double dashed line. (b) To insert a new key 23, Node 21 modifies its forward pointers in Level-1. (c) To delete 15, Node 12 modifies its forward pointers in Level-1.
  • Figure 3: The 1-2-3 skiplist and its counterpart tree structures.
  • Figure 4: The lock-free linked list with two-step deletion. (a) A single CAS operation insertion. (b) A single CAS operation deletion. (c) A lost update. (d) Step 1 of the two-step deletion. (e) Step 2 of the two-step deletion.
  • Figure 5: The three-step deletion in a lock-free linked list ((a) to (c)) and (d) the lock-free skiplist with three-step deletion [fomitchev2004lock].
  • ...and 12 more figures
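
The basic operations described in the Figure 2 caption (search, insert, delete) can be sketched as follows. This is a minimal, single-threaded illustration of a textbook probabilistic skiplist, not the survey's own algorithm: the class and method names, the promotion probability `p = 0.5`, the cap of 16 levels, and the keys 3 and 7 are assumptions made for the example (the caption names only keys 12, 15, 21, 23, 25, and 33). Node heights are random here, so they will not match the figure.

```python
import random


class Node:
    def __init__(self, key, level):
        self.key = key
        self.forward = [None] * level  # forward[i]: next node at level i


class SkipList:
    """Hypothetical minimal skiplist storing unique, comparable keys."""

    def __init__(self, max_level=16, p=0.5, seed=None):
        self.max_level = max_level      # assumed cap on node height
        self.p = p                      # assumed promotion probability
        self.level = 1                  # number of levels currently in use
        self.header = Node(None, max_level)
        self.rng = random.Random(seed)

    def _random_level(self):
        # Coin flips: each extra level is kept with probability p.
        level = 1
        while level < self.max_level and self.rng.random() < self.p:
            level += 1
        return level

    def _predecessors(self, key):
        # update[i] is the last node at level i whose key is < key.
        update = [self.header] * self.max_level
        node = self.header
        for i in range(self.level - 1, -1, -1):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node
        return update

    def search(self, key):
        node = self._predecessors(key)[0].forward[0]
        return node is not None and node.key == key

    def insert(self, key):
        update = self._predecessors(key)
        nxt = update[0].forward[0]
        if nxt is not None and nxt.key == key:
            return False                # key already present
        level = self._random_level()
        self.level = max(self.level, level)
        node = Node(key, level)
        for i in range(level):          # splice in at every level of the node
            node.forward[i] = update[i].forward[i]
            update[i].forward[i] = node
        return True

    def delete(self, key):
        update = self._predecessors(key)
        target = update[0].forward[0]
        if target is None or target.key != key:
            return False                # key not present
        for i in range(len(target.forward)):
            if update[i].forward[i] is target:
                update[i].forward[i] = target.forward[i]
        # Drop top levels that became empty.
        while self.level > 1 and self.header.forward[self.level - 1] is None:
            self.level -= 1
        return True

    def keys(self):
        out, node = [], self.header.forward[0]
        while node is not None:
            out.append(node.key)
            node = node.forward[0]
        return out


sl = SkipList(seed=1)
for k in [3, 7, 12, 15, 21, 25, 33]:    # 3 and 7 are made-up filler keys
    sl.insert(k)
sl.insert(23)                           # as in Figure 2(b)
sl.delete(15)                           # as in Figure 2(c)
print(sl.keys())
```

Both insert and delete reuse the same predecessor search, so each update only rewires the forward pointers of the nodes found on the search path, which is the behavior the caption describes for Nodes 21 and 12.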