Robust Learning-Augmented Dictionaries

Ali Zeynali; Shahin Kamali; Mohammad Hajiesmaili

TL;DR

The paper addresses the design of dictionary data structures that are both statically optimal and robust to adversarial predictions. It introduces RobustSL, a skip list that fuses an optimistic (learning-augmented) treap with a pessimistic balanced BST to achieve $\mathcal{O}(m H(p))$ consistency when predictions are accurate and $\mathcal{O}(m \log n)$ robustness in general. The static and dynamic variants, RobustSL_s and RobustSL_d, are proven to provide optimal consistency and robustness, with each dynamic operation supported in $\mathcal{O}(\log n)$ time and the total cost remaining bounded under varying prediction quality. Experiments on synthetic and real data corroborate the theory: RobustSL outperforms prior learned structures under noisy predictions and remains competitive when predictions are reliable, which makes it practical for real-world workloads with imperfect predictions.

Abstract

We present the first learning-augmented data structure for implementing dictionaries with optimal consistency and robustness. Our data structure, named RobustSL, is a skip list augmented by predictions of access frequencies of elements in a data sequence. With proper predictions, RobustSL has optimal consistency (achieves static optimality). At the same time, it maintains a logarithmic running time for each operation, ensuring optimal robustness, even if predictions are generated adversarially. Therefore, RobustSL has all the advantages of the recent learning-augmented data structures of Lin, Luo, and Woodruff (ICML 2022) and Cao et al. (arXiv 2023), while providing robustness guarantees that are absent in the previous work. Numerical experiments show that RobustSL outperforms alternative data structures using both synthetic and real datasets.
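As a rough illustration of the gap between the two guarantees, the following sketch compares a static-optimality cost of the form $m H(p)$ with a worst-case cost of the form $m \log_2 n$, reading $m$ as the number of accesses, $n$ as the dictionary size, and $H(p)$ as the entropy of the access-frequency distribution. The Zipf-like distribution, the parameter values, and the helper names `entropy_bits` and `zipf` are assumptions made for illustration only and do not come from the paper.

```python
import math

def entropy_bits(p):
    """Shannon entropy (in bits) of a probability vector p."""
    return -sum(q * math.log2(q) for q in p if q > 0)

def zipf(n, s=1.0):
    """Hypothetical Zipf-like access frequencies over n keys (illustrative only)."""
    weights = [1.0 / (i ** s) for i in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]

n, m = 1024, 10_000          # dictionary size and number of accesses (assumed values)
p = zipf(n)

consistency_bound = m * entropy_bits(p)   # scale of the cost with accurate predictions
robustness_bound = m * math.log2(n)       # scale of the cost with arbitrary predictions

print(f"m * H(p)    = {consistency_bound:.0f}")
print(f"m * log2(n) = {robustness_bound:.0f}")
```

For a skewed distribution the entropy term is smaller than $\log_2 n$, which is where accurate predictions can help; for a uniform distribution the two quantities coincide.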

Paper Structure (11 sections, 12 theorems, 8 equations, 4 figures)

Introduction
Preliminaries
Dictionaries and Optimality
Treaps and Skip lists
On the Robustness of Learning-augmented Treaps
Combining BSTs into a Skiplist
Consistent and Robust Dictionaries
Robust and Consistent Dynamic Dictionary
Experiments
Experimental Setup and Overview
Experimental Results

Key Result

Proposition 1

The consistency of the learning-augmented treaps of Lin, Luo, and Woodruff (ICML 2022) for dictionaries of size $n$ is at least $\Omega(n /\log n)$, and their robustness is $n$.
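The intuition behind the linear robustness can be seen in a small sketch. It assumes a simplified treap whose priorities are exactly the predicted frequencies; the actual priority rule of the learning-augmented treap may differ, and the function name `treap_depths` and the adversarial prediction pattern are hypothetical. When predicted frequencies decrease monotonically with the key, the heap order forces the tree into a single path.

```python
import math

def treap_depths(priority):
    """Return {key: depth} for the treap defined by distinct priorities.

    A treap with a max-heap order on priorities has the same shape as the
    plain BST obtained by inserting keys in decreasing order of priority.
    The root has depth 1.
    """
    order = sorted(priority, key=lambda k: -priority[k])
    left, right, depth = {}, {}, {}
    root = order[0]
    depth[root] = 1
    for k in order[1:]:
        node, d = root, 1
        while True:
            d += 1
            child = left if k < node else right
            if node in child:
                node = child[node]      # keep descending
            else:
                child[node] = k         # attach k as a new leaf
                depth[k] = d
                break
    return depth

n = 64
# Assumed adversarial predictions: smaller keys are predicted to be more frequent.
adversarial = {k: n - k for k in range(n)}
depths = treap_depths(adversarial)

worst_depth = max(depths.values())
balanced_height = math.ceil(math.log2(n + 1))
print(worst_depth, balanced_height)
```

If the true workload then accesses only the deepest key, every search follows the whole path, whereas a balanced BST would need a number of comparisons logarithmic in $n$.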

Figures (4)

Figure 1: Combining two BSTs $T_o$ and $T_p$ into a skip list $L(T_o,T_p)$. The search path and the nodes at which a comparison is made for $\texttt{Search}(6)$ are highlighted.
Figure 2: Average number of comparisons per query for RobustSL and baseline data structures for static evaluations under (left) random frequency ordering with perfect predictions, and (right) adversary frequency ordering with noisy predictions. Results show that learning-augmented treaps are very sensitive to the accuracy of predictions and the ordering of items, while RobustSL shows its robustness and consistency.
Figure 3: Average number of comparisons per query for RobustSL and baseline data structures for dynamic evaluations under (left) random frequency ordering with perfect predictions, and (right) adversary frequency ordering with noisy predictions. As in the static experiments, the frequency ordering of items impacts the performance of learning-augmented treaps, while RobustSL remains robust across the different conditions.
Figure 4: Average number of comparisons per query for RobustSL and the baseline data structures in two scenarios: (left) across different dataset categories and (right) for varying sizes of the adversary dataset. In the comparison across categories, the size of the adversary dataset is fixed at $25\%$ of the training dataset. The performance of learning-augmented treaps degrades linearly as the size of the adversary dataset increases.

Theorems & Definitions (21)

Definition 1
Proposition 1
proof
Proposition 2
proof
Proposition 3
Proposition 4
proof
Theorem 1
proof
...and 11 more
