A Predictive Framework for Base-n Radix Sort Optimization
Atharv Pandey, Lakshmanan Kuppusamy
TL;DR
This work addresses the inefficiency of Base-$n$ Radix Sort (BNRS) caused by zero-padding in skewed data by introducing SP-LSD, a Stable Partitioning - Least Significant Digit Radix Sort that iteratively prunes the active region. The authors formalize the Radix Crossover Framework (RCF), a three-point decision model comprising the Asymptotic Crossover, Round-feasibility Crossover, and Pruning Crossover, to determine when SP-LSD outperforms both BNRS and comparison-based sorts. Through theoretical cost analyses and empirical benchmarks against Introsort, AFS, BNRS, and SP-LSD, they show that SP-LSD yields substantial gains on skewed and uniform-log distributions, with clear crossover boundaries guiding adaptive algorithm selection. The results suggest SP-LSD can achieve significant performance benefits on large-scale datasets, offering a practical pathway to adaptive, data-aware sorting in high-throughput systems.
Abstract
Sorting is a foundational primitive of computer science and optimizations in sorting subroutines can cascade into significant performance gains for high-throughput systems. In this paper, we analyze the inefficiencies of a non-comparison sorting algorithm, namely, Base-n Radix Sort (BNRS), specifically the `zero padding' problem in skewed datasets. We develop an execution model, called, Stable Partitioning - Least Significant Digit Radix Sort (shortly, SP-LSD), an iterative least significant digit based pruning model designed to address this inefficiency. Based on this development, we derive the Radix Crossover Framework(RCF), an analytic three-point decision framework. The framework is established on the precondition of non-negative integers, which enables the derivation of three critical boundaries. First, the Asymptotic Crossover ($k<n^{\log_2 n}$) defines when BNRS and SP-LSD can theoretically outperform the comparison sorting algorithms where k is the maximum value and n is the input size. Second, the Round-feasibility Crossover ($k>n^2$) defines when overhead cost of implemented model SP-LSD is amortized. Third, we derive Pruning Crossover parameterized by the ratio of random-access sorting cost to sequential partitioning cost. This model demonstrates that SP-LSD yields a net gain on skewed and uniform distributions over standard BNRS. The experimental results are consistent with the crossover boundaries, providing a deterministic roadmap for adaptive algorithm selection.
