TwinArray Sort: An Ultrarapid Conditional Non-Comparison Based Sorting Algorithm
Amin Amini
TL;DR
This paper addresses the need for very fast sorting of large datasets, where traditional comparison-based algorithms become a bottleneck. It introduces TwinArray Sort, a conditional non-comparison-based algorithm that uses two auxiliary arrays, one for value storage and one for frequency counting, together with a conditional distinct array verifier to handle duplicates, achieving worst-case time and space complexity of $O(n+k)$. Experimental results indicate that TwinArray Sort outperforms several non-comparison-based methods and many traditional sorts across random, reverse-sorted, and nearly sorted distributions, especially on datasets of unique elements. The analysis discusses the impact of the value range $k$ on performance, noting that time and memory grow linearly with $k$ when $k$ is large relative to $n$, and suggests future work to mitigate this limitation. The authors conclude that the algorithm has potential for large-scale data processing.
Abstract
In computer science, sorting algorithms are crucial for data processing and machine learning. Large datasets and high efficiency requirements pose challenges for comparison-based algorithms such as Quicksort and Merge sort, which achieve $O(n \log n)$ time complexity. Non-comparison-based algorithms such as Spreadsort and Counting Sort can attain linear time complexity under certain circumstances, but they suffer from high memory consumption and relatively high computational demand. We present TwinArray Sort, a novel conditional non-comparison-based sorting algorithm that makes effective use of array indices. TwinArray Sort achieves worst-case time and space complexity of $O(n+k)$, where $n$ is the number of elements and $k$ is the range of values. The approach remains efficient under all settings and works well on randomly ordered, reverse-sorted, and nearly sorted data. TwinArray Sort handles duplicates and uses memory efficiently thanks to its two auxiliary arrays, one for value storage and one for frequency counting, together with a conditional distinct array verifier. According to experimental assessments, TwinArray Sort consistently performs better than conventional algorithms, particularly when sorting arrays of unique elements, under all data distribution scenarios. These properties make the approach suitable for massive data processing and machine learning dataset management. TwinArray Sort overcomes constraints of conventional sorting algorithms by combining these techniques with the advantages of non-comparison-based sorting. Its reliable performance across a range of data distributions makes it an adaptable and effective answer to contemporary computing requirements.
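The idea described in the abstract can be illustrated with a minimal sketch. This is not the author's implementation: the abstract gives no pseudocode, so the function name `twin_array_style_sort`, the offset by the minimum value, and the exact form of the distinctness check are assumptions made for illustration. The sketch only shows how two auxiliary arrays indexed by value (one for storage, one for frequencies) and a flag recording whether all elements are distinct can produce a sorted output in $O(n+k)$ time and space.

```python
def twin_array_style_sort(data):
    """Illustrative index-based sort for integers (hypothetical sketch,
    not the paper's reference implementation)."""
    if not data:
        return []

    lo, hi = min(data), max(data)
    k = hi - lo + 1                 # size of the value range

    values = [None] * k             # auxiliary array 1: value storage by index
    counts = [0] * k                # auxiliary array 2: frequency counting
    all_distinct = True             # assumed form of the "distinct verifier"

    # Single pass over the input: O(n).
    for x in data:
        i = x - lo                  # the array index encodes the order
        if values[i] is not None:
            all_distinct = False    # a duplicate has been seen
        values[i] = x
        counts[i] += 1

    # Single pass over the value range: O(k).
    out = []
    if all_distinct:
        # No duplicates: the frequency array can be ignored entirely.
        for v in values:
            if v is not None:
                out.append(v)
    else:
        # Duplicates present: emit each value as often as it occurred.
        for v, c in zip(values, counts):
            if c:
                out.extend([v] * c)
    return out
```

For example, `twin_array_style_sort([3, 1, 3, 2, 1])` returns `[1, 1, 2, 3, 3]`. Both auxiliary arrays have length $k$, which is why memory and time grow with the value range when $k$ is large relative to $n$.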
