SquareSort: a cache-oblivious sorting algorithm

Michal Koucký; Josef Matějka

TL;DR

SquareSort is a cache-oblivious sorting algorithm that arranges the input as a $\sqrt{n}\times\sqrt{n}$ matrix, recursively sorts the columns, applies a SkewTranspose step to distribute the elements into buckets, and then sorts the buckets. The core novelty is SkewTranspose, which partitions the data into buckets using random pivots and reorganizes the elements so that the algorithm achieves the asymptotically optimal expected IO complexity $O\left(\frac{n}{B}\log_{M/B} n\right)$ under the tall-cache assumption $M \ge B^2$. The authors establish the main IO bound through a recurrence-based analysis combined with probabilistic bounds on bucket sizes, and they report an experimental comparison against std::sort and FunnelSort in which SquareSort is competitive. The result is a conceptually simple cache-oblivious sorting algorithm backed by both theoretical and empirical evaluation.
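The four steps named above (columns, pivots, bucket distribution, bucket sorting) can be illustrated with a minimal in-memory sketch. This is not the authors' procedure: the function name `square_sort_sketch`, the `cutoff` parameter, the pivot count, and the plain `bisect`-based distribution that stands in for SkewTranspose are all assumptions made for illustration, and the sketch makes no attempt to reproduce the IO behaviour analysed in the paper.

```python
import bisect
import math
import random


def square_sort_sketch(items, cutoff=16):
    """Illustrative sketch of the column/bucket structure (hypothetical, not the paper's code)."""
    n = len(items)
    if n <= cutoff:
        return sorted(items)  # small inputs: fall back to the built-in sort

    side = math.isqrt(n - 1) + 1  # ceil(sqrt(n)): assumed column length

    # Step 1: view the input as roughly sqrt(n) columns and sort each recursively.
    columns = [square_sort_sketch(items[i:i + side], cutoff)
               for i in range(0, n, side)]

    # Step 2: choose random pivots that delimit the buckets (pivot count is an assumption).
    pivots = sorted(random.sample(items, side - 1))

    # Step 3: distribute elements into buckets. This simple loop is a stand-in
    # for SkewTranspose; it does not model the paper's memory layout.
    buckets = [[] for _ in range(len(pivots) + 1)]
    for column in columns:
        for x in column:
            buckets[bisect.bisect_left(pivots, x)].append(x)

    # Step 4: sort each bucket and concatenate. A bucket that did not shrink
    # (for example, all keys equal) is sorted directly to guarantee termination.
    result = []
    for bucket in buckets:
        if len(bucket) < n:
            result.extend(square_sort_sketch(bucket, cutoff))
        else:
            result.extend(sorted(bucket))
    return result
```

Because the bucket index `bisect_left(pivots, x)` never decreases as `x` grows, concatenating the sorted buckets in order yields a sorted output. For instance, `square_sort_sketch(data) == sorted(data)` holds for any list `data` of comparable items.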

Abstract

In this paper we consider sorting in the cache-oblivious model of Frigo, Leiserson, Prokop, and Ramachandran (1999). We introduce a new, simple sorting algorithm in that model which has asymptotically optimal IO complexity $O(\frac{n}{B} \log_{M/B} n)$, where $n$ is the instance size, $M$ is the size of the cache, and $B$ is the size of a memory block. This matches the complexity of FunnelSort, the best known cache-oblivious sorting algorithm.

Paper Structure (16 sections, 10 theorems, 18 equations, 6 figures, 3 algorithms)

Introduction
Cache-oblivious analysis
Memory management within the external memory.
Our algorithm
Detailed description of SquareSort
Skew Transposition
Analysis of SkewTranspose
Analysis of SquareSort
Establishing recurrence (rec-IO)
Analysis of expected bucket sizes
Proof of the main theorem
Experiments
Results
Cutoff
External sorting
...and 1 more section

Key Result

Theorem 1.1

SquareSort of $n$ items uses $O(\frac{n}{B} \log_{M/B} n)$ IOs in expectation over its randomness.
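To get a feel for the magnitude of this bound, the expression $\frac{n}{B}\log_{M/B} n$ can be evaluated for concrete parameters. The helper below is a hypothetical illustration: the name `sort_io_bound` and the sample values are assumptions, and the result ignores the constant factor hidden by the $O(\cdot)$ notation, so it indicates scaling rather than an actual IO count.

```python
import math


def sort_io_bound(n, M, B):
    """Evaluate (n / B) * log_{M/B}(n), ignoring the hidden constant (illustrative only)."""
    if M < B * B:
        raise ValueError("tall-cache assumption M >= B^2 is violated")
    return (n / B) * math.log2(n) / math.log2(M / B)


# Example with assumed values: n = 2^20 items, cache M = 2^16, block B = 2^8.
# n / B = 4096 and log_{256}(2^20) = 20 / 8 = 2.5, so the expression is 10240.
print(sort_io_bound(2**20, 2**16, 2**8))
```

With these values the sorting term is only 2.5 times the 4096 block transfers needed just to read the input once, which shows how slowly the logarithmic factor grows when $M/B$ is large.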

Figures (6)

Figure 1: An illustration of the SquareSort algorithm.
Figure 2: Illustration of a call to SkewTranspose. Pointers $col[i]$ and $buc[j]$ will advance during the procedure.
Figure 3: Time per item to sort a random permutation (left) and a random binary sequence (right).
Figure 4: Time per item to sort a random sequence of elements from the universe of size $n$ (left) and of size $\sqrt{n}$ (right).
Figure 5: Time per item to sort a random permutation with different cutoffs.
...and 1 more figure

Theorems & Definitions (19)

Theorem 1.1: Informal
Lemma 2.1
proof
Theorem 3.1
Proposition 3.2
proof
Proposition 3.3
proof
Proposition 3.4
proof
...and 9 more
