Table of Contents
Fetching ...

A Predictive Framework for Base-n Radix Sort Optimization

Atharv Pandey, Lakshmanan Kuppusamy

TL;DR

This work addresses the inefficiency of Base-$n$ Radix Sort (BNRS) caused by zero-padding in skewed data by introducing SP-LSD, a Stable Partitioning - Least Significant Digit Radix Sort that iteratively prunes the active region. The authors formalize the Radix Crossover Framework (RCF), a three-point decision model comprising the Asymptotic Crossover, Round-feasibility Crossover, and Pruning Crossover, to determine when SP-LSD outperforms both BNRS and comparison-based sorts. Through theoretical cost analyses and empirical benchmarks against Introsort, AFS, BNRS, and SP-LSD, they show that SP-LSD yields substantial gains on skewed and uniform-log distributions, with clear crossover boundaries guiding adaptive algorithm selection. The results suggest SP-LSD can achieve significant performance benefits on large-scale datasets, offering a practical pathway to adaptive, data-aware sorting in high-throughput systems.

Abstract

Sorting is a foundational primitive of computer science and optimizations in sorting subroutines can cascade into significant performance gains for high-throughput systems. In this paper, we analyze the inefficiencies of a non-comparison sorting algorithm, namely, Base-n Radix Sort (BNRS), specifically the `zero padding' problem in skewed datasets. We develop an execution model, called, Stable Partitioning - Least Significant Digit Radix Sort (shortly, SP-LSD), an iterative least significant digit based pruning model designed to address this inefficiency. Based on this development, we derive the Radix Crossover Framework(RCF), an analytic three-point decision framework. The framework is established on the precondition of non-negative integers, which enables the derivation of three critical boundaries. First, the Asymptotic Crossover ($k<n^{\log_2 n}$) defines when BNRS and SP-LSD can theoretically outperform the comparison sorting algorithms where k is the maximum value and n is the input size. Second, the Round-feasibility Crossover ($k>n^2$) defines when overhead cost of implemented model SP-LSD is amortized. Third, we derive Pruning Crossover parameterized by the ratio of random-access sorting cost to sequential partitioning cost. This model demonstrates that SP-LSD yields a net gain on skewed and uniform distributions over standard BNRS. The experimental results are consistent with the crossover boundaries, providing a deterministic roadmap for adaptive algorithm selection.

A Predictive Framework for Base-n Radix Sort Optimization

TL;DR

This work addresses the inefficiency of Base- Radix Sort (BNRS) caused by zero-padding in skewed data by introducing SP-LSD, a Stable Partitioning - Least Significant Digit Radix Sort that iteratively prunes the active region. The authors formalize the Radix Crossover Framework (RCF), a three-point decision model comprising the Asymptotic Crossover, Round-feasibility Crossover, and Pruning Crossover, to determine when SP-LSD outperforms both BNRS and comparison-based sorts. Through theoretical cost analyses and empirical benchmarks against Introsort, AFS, BNRS, and SP-LSD, they show that SP-LSD yields substantial gains on skewed and uniform-log distributions, with clear crossover boundaries guiding adaptive algorithm selection. The results suggest SP-LSD can achieve significant performance benefits on large-scale datasets, offering a practical pathway to adaptive, data-aware sorting in high-throughput systems.

Abstract

Sorting is a foundational primitive of computer science and optimizations in sorting subroutines can cascade into significant performance gains for high-throughput systems. In this paper, we analyze the inefficiencies of a non-comparison sorting algorithm, namely, Base-n Radix Sort (BNRS), specifically the `zero padding' problem in skewed datasets. We develop an execution model, called, Stable Partitioning - Least Significant Digit Radix Sort (shortly, SP-LSD), an iterative least significant digit based pruning model designed to address this inefficiency. Based on this development, we derive the Radix Crossover Framework(RCF), an analytic three-point decision framework. The framework is established on the precondition of non-negative integers, which enables the derivation of three critical boundaries. First, the Asymptotic Crossover () defines when BNRS and SP-LSD can theoretically outperform the comparison sorting algorithms where k is the maximum value and n is the input size. Second, the Round-feasibility Crossover () defines when overhead cost of implemented model SP-LSD is amortized. Third, we derive Pruning Crossover parameterized by the ratio of random-access sorting cost to sequential partitioning cost. This model demonstrates that SP-LSD yields a net gain on skewed and uniform distributions over standard BNRS. The experimental results are consistent with the crossover boundaries, providing a deterministic roadmap for adaptive algorithm selection.

Paper Structure

This paper contains 44 sections, 8 equations, 2 figures, 8 tables, 2 algorithms.

Figures (2)

  • Figure 1:
  • Figure 2: