PCF Learned Sort: a Learning Augmented Sort Algorithm with $O(n \log\log n)$ Expected Complexity

Atsuki Sato, Yusuke Matsui

TL;DR

This work addresses the lack of theoretical guarantees for Learned Sort by introducing PCF Learned Sort, a learning-augmented non-division-based sorter that uses a Piecewise Constant Function CDF model to partition data. It provides rigorous worst-case and expected-time analyses: a worst-case bound of $O(nU(n) + n\log\log n)$ when paired with a general internal sort, and an expected bound of $O(n\log\log n)$ under mild distributional assumptions, with $\delta=\lfloor n^d\rfloor$ for $0<d<1$. The framework is empirically validated on synthetic and real datasets, demonstrating $O(n\log\log n)$ behavior and robustness against variance in data distributions, while highlighting the practical trade-offs with fully optimized, non-guaranteed learned sorts. The results advance the understanding of why Learned Sorts can outperform traditional sorts while ensuring stability and predictability in runtime across diverse inputs.
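The $\log\log n$ factor can be motivated by a short recursion-depth calculation. This is a simplified sketch, not the paper's proof. It assumes that every recursion level costs $O(n)$ in total, that each bucket passed to the next level has size at most $m^d$ when its parent has size $m$, and that recursion stops at a constant threshold $\tau > 1$ (treating $\tau$ as a constant is an assumption made here).

```latex
% Bucket size after k levels of recursion, starting from n elements:
%   n \to n^{d} \to n^{d^2} \to \dots \to n^{d^k}
\[
  n^{d^{k}} \le \tau
  \;\Longleftrightarrow\;
  d^{k}\,\log n \le \log \tau
  \;\Longleftrightarrow\;
  k \ge \log_{1/d}\!\left(\frac{\log n}{\log \tau}\right),
\]
\[
  \text{so the depth is } k = O(\log\log n)
  \text{ and the total work is } O(n)\cdot O(\log\log n) = O(n\log\log n).
\]
```

The floor in $\delta=\lfloor n^d\rfloor$ only makes buckets smaller, so it does not affect this bound.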

Abstract

Sorting is one of the most fundamental algorithms in computer science. Recently, Learned Sorts, which use machine learning to improve sorting speed, have attracted attention. While existing studies show that Learned Sort is empirically faster than classical sorting algorithms, they do not provide theoretical guarantees about its computational complexity. We propose Piecewise Constant Function (PCF) Learned Sort, a theoretically guaranteed Learned Sort algorithm. We prove that the expected complexity of PCF Learned Sort is $\mathcal{O}(n \log \log n)$ under mild assumptions on the data distribution. We also confirm empirically that PCF Learned Sort has a computational complexity of $\mathcal{O}(n \log \log n)$ on both synthetic and real datasets. This is the first study to theoretically support the empirical success of Learned Sort, and provides evidence for why Learned Sort is fast. The code is available at https://github.com/atsukisato/PCF_Learned_Sort .

Paper Structure

This paper contains 36 sections, 13 theorems, 51 equations, 7 figures, 1 table, 1 algorithm.

Key Result

Lemma 3.1

Assume that there exists a model-based bucketing algorithm $\mathcal{M}$ that can bucket an array of length $n$ into $\gamma + 1 = \mathcal{O}(n)$ buckets, including model training and inference, with a worst-case complexity of $\mathcal{O}(n)$. Also, assume that the s

Figures (7)

  • Figure 1: PCF Learned Sort: First, the input array is partitioned into $\gamma+1$ buckets using a CDF model-based method. Buckets larger than $\delta$ or smaller than $\tau$ are sorted with a standard sort (e.g., IntroSort or QuickSort). Otherwise, the recursive model-based bucketing is repeated. Finally, the sorted arrays are concatenated. The CDF model used for bucketing is a Piecewise Constant Function (PCF). The function is constant within each interval, and the interval widths are constant.
  • Figure 2: Number of operations to sort the array. Below each graph is a histogram visualizing the distribution of each dataset. The standard deviation of the $10$ measurements is represented by the shaded area. Our PCF Learned Sort consistently achieves a complexity lower than $\mathcal{O}(n \log n)$, while Learned Sort 2.0 [kristo2021defeating], which has $\mathcal{O}(n^2)$ worst-case complexity, occasionally requires a very large number of operations.
  • Figure 3: Heatmap showing the empirical frequency of bucketing failure, i.e., $\exists j, |\bm{c}_j| > \delta$. Each of the variables $a,b,c,d$ not shown on the x- or y-axis was set to $0.75$. The white dotted line represents the parameters that make the right side of Eq. \ref{equ: pcf divide success prob} equal to 0.5. The close alignment between this white dotted line and the actual success/failure boundary suggests that the theoretical bound in Eq. \ref{equ: pcf divide success prob} is reasonably tight.
  • Figure 4: Time to sort the array. The standard deviation of the $10$ measurements is represented by the shaded area. Our PCF Learned Sort is significantly faster than std::sort, and more importantly, it maintains robust performance across all datasets, whereas other learned sorts without worst-case guarantees can suffer catastrophic slowdowns, as seen with Learned Sort 2.0 on the SOF [Temperature] dataset.
  • Figure 5: Time to sort the array in adversarial environments. Below each graph is a histogram that visualizes the distribution of each dataset. The standard deviation of the $10$ measurements is represented by the shaded area.
  • ...and 2 more figures
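The pipeline described in the Figure 1 caption can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation (see their repository for that). The function name `pcf_learned_sort`, the sample size, the number of intervals, the number of buckets, and the default `tau` are all hypothetical choices; the exponent $0.75$ simply mirrors the value mentioned in the Figure 3 caption, and Python's built-in `sorted` stands in for the standard sort.

```python
import random


def pcf_learned_sort(xs, d=0.75, tau=64, rng=None):
    """Return a sorted copy of xs (a list of numbers).

    Illustrative sketch only: all parameter choices are assumptions.
    """
    rng = rng or random.Random(0)
    n = len(xs)
    if n <= tau:                      # small input: use the standard sort
        return sorted(xs)
    lo, hi = min(xs), max(xs)
    if lo == hi:                      # all keys equal: already sorted
        return list(xs)

    # Hypothetical sizes; the paper parameterizes these separately.
    n_intervals = max(2, int(n ** 0.75))  # equal-width PCF intervals
    n_samples = max(2, int(n ** 0.75))    # training sample size
    n_buckets = max(2, int(n ** 0.75))    # plays the role of gamma + 1
    delta = int(n ** d)                   # maximum bucket size to recurse on

    def interval(x):
        # Index of the equal-width interval containing x (monotone in x).
        i = int((x - lo) / (hi - lo) * n_intervals)
        return min(i, n_intervals - 1)

    # "Train" the PCF: count sampled keys per interval, then take a
    # prefix sum so cdf[i] is the sampled fraction strictly below interval i.
    counts = [0] * n_intervals
    for x in rng.choices(xs, k=n_samples):
        counts[interval(x)] += 1
    cdf, acc = [], 0
    for c in counts:
        cdf.append(acc / n_samples)
        acc += c

    # Bucket every key by its estimated CDF value. The mapping is
    # non-decreasing in x, so buckets are ordered relative to each other.
    buckets = [[] for _ in range(n_buckets)]
    for x in xs:
        b = min(int(cdf[interval(x)] * n_buckets), n_buckets - 1)
        buckets[b].append(x)

    out = []
    for b in buckets:
        if len(b) > delta or len(b) <= tau:
            out.extend(sorted(b))     # too large or too small: standard sort
        else:
            out.extend(pcf_learned_sort(b, d, tau, rng))  # recurse
    return out
```

Because the bucket index never decreases as the key grows, concatenating the individually sorted buckets yields a sorted array. Falling back to the standard sort for any bucket larger than `delta` is what keeps a poorly fitting model from degrading the worst case.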

Theorems & Definitions (27)

  • Lemma 3.1
  • Lemma 3.3
  • Lemma 3.4
  • Theorem 3.5
  • proof
  • Theorem 3.6
  • proof
  • proof
  • proof
  • Lemma A.1
  • ...and 17 more