Anytime Sorting Algorithms (Extended Version)

Emma Caizergues, François Durand, Fabien Mathieu

TL;DR

This work formalizes the problem of anytime sorting, where an intermediate estimate of the sorted list must be produced after $k$ comparisons, for every $k$, and is evaluated via Spearman's footrule. It introduces estimators that convert any comparison-based sort into an anytime algorithm, and proposes two new methods, Multizip sort and Corsort, to improve intermediate estimates and termination efficiency. Through extensive simulations, Corsort achieves a quasi-optimal performance profile with strong early estimates, while Multizip sort offers low termination overhead and robust intermediate results; ASort remains competitive for median-focused scenarios but with higher variance. The work provides open-source tooling and discusses practical considerations, highlighting Corsort as a strong empirical benchmark for future anytime sorting research, while noting that Corsort's average-case complexity has no formal proof and that the estimators can be extended to other sorting paradigms.

Abstract

This paper addresses the anytime sorting problem, aiming to develop algorithms providing tentative estimates of the sorted list at each execution step. Comparisons are treated as steps, and the Spearman's footrule metric evaluates estimation accuracy. We propose a general approach for making any sorting algorithm anytime and introduce two new algorithms: multizip sort and Corsort. Simulations showcase the superior performance of both algorithms compared to existing methods. Multizip sort keeps a low global complexity, while Corsort produces intermediate estimates surpassing previous algorithms.
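To make the evaluation setting concrete, here is a minimal sketch of Spearman's footrule applied to the intermediate states of a sort. It assumes distinct elements, and it uses a deliberately naive estimator (the current state of an insertion sort after each comparison) rather than the estimators proposed in the paper. The names `spearman_footrule` and `insertion_sort_anytime` are hypothetical and chosen for illustration only.

```python
def spearman_footrule(estimate, reference):
    """Sum over elements of |position in estimate - position in reference|.

    Assumes both lists contain the same distinct elements.
    """
    pos = {x: i for i, x in enumerate(reference)}
    return sum(abs(i - pos[x]) for i, x in enumerate(estimate))


def insertion_sort_anytime(xs):
    """Yield the current list state after every comparison.

    This is a naive anytime estimator used for illustration,
    not one of the paper's estimators.
    """
    xs = list(xs)
    for i in range(1, len(xs)):
        j = i
        while j > 0:
            smaller = xs[j] < xs[j - 1]  # one comparison = one step
            if smaller:
                xs[j], xs[j - 1] = xs[j - 1], xs[j]
            yield list(xs)
            if not smaller:
                break
            j -= 1


X = [4, 2, 3, 1, 5]
target = sorted(X)
# Error of the estimate after k comparisons, for k = 1, 2, ...
errors = [spearman_footrule(est, target) for est in insertion_sort_anytime(X)]
```

The list `errors` is the kind of performance profile the paper studies: the footrule distance to the sorted list as a function of the number of comparisons performed, reaching 0 once the list is sorted.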

Paper Structure (25 sections, 2 equations, 7 figures, 2 tables, 5 algorithms)

Figures (7)

  • Figure 1: Sorting the list $X=(5,1,8,7,2,6,4,3)$ with top-down merge, bottom-up merge and multizip sort. Each edge represents a comparison. The bracket notation $[\;\mid\;]$ delimits two already-sorted sublists that are being merged. For more details, cf. Appendix \ref{sec:detailed_execution_multizip}.
  • Figure 2: Sorting the list $X=(3,2,4,6,7,1,5)$ with quicksort or ASort. A node appears in bold if it has already been used as a pivot: it partitions the list into smaller elements on the left and larger ones on the right. Intermediate steps that do not perform any comparison are omitted. For ASort (\ref{fig:asort_exec}), we use quickselect \cite{hoare1961algorithm} as the median-identification subroutine. Each step represents the application of a pivot, and each edge represents a comparison. A node is circled twice if it has been identified as a median. In the third step, since the median 4 has been found, we must compute the median of the left sub-list $(2,1,3)$; but note that it is useless to make any comparison with element 3, previously used as a pivot, because it is already in its final position.
  • Figure 3: Example of score-based selection of a linear extension. The input partial order, represented by the edges of the graph, is the transitive closure of: $a\prec b$, $c\prec d \prec \ldots \prec n \prec o$, $n\prec p \prec q$. Heights are proportional to scores. Returned estimates and expectation $\overline{S}$ of the error $S$ are also provided for completeness.
  • Figure 4: Execution of Corsort-$\Delta$ on the list $X=(4,2,3,1,5)$. Each step $k$ depicts the partial order after $k$ comparisons. The next comparison to perform is highlighted by the two vertices whose values of $\Delta_k$ (displayed above each element) are in red. Each subfigure represents $\rho_k$ as each element's height and gives the corresponding estimate $X_k$ based on $\rho_k$ and the associated error $S_k$.
  • Figure 5: Number of comparisons required for algorithm termination, as a relative overhead compared to the information-theoretic lower bound. Each point is obtained by sorting $10{,}000$ random lists.
  • ...and 2 more figures
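
The quantity plotted in Figure 5 can be sketched as follows. This is an illustration under stated assumptions, not the paper's code: it takes the information-theoretic lower bound to be the real-valued $\log_2(n!)$ (the paper may round it up), counts comparisons with a textbook top-down merge sort, and uses the list from Figure 1 as input. The function names `lower_bound`, `merge_sort_count` and `relative_overhead` are hypothetical.

```python
import math


def lower_bound(n):
    """Real-valued information-theoretic bound log2(n!) (assumed form)."""
    return math.log2(math.factorial(n))


def merge_sort_count(xs):
    """Textbook top-down merge sort returning (sorted list, comparisons)."""
    if len(xs) <= 1:
        return list(xs), 0
    mid = len(xs) // 2
    left, count_left = merge_sort_count(xs[:mid])
    right, count_right = merge_sort_count(xs[mid:])
    merged, count, i, j = [], count_left + count_right, 0, 0
    while i < len(left) and j < len(right):
        count += 1  # one comparison between the two sublist heads
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])   # at most one of these two tails is non-empty
    merged.extend(right[j:])
    return merged, count


def relative_overhead(comparisons, n):
    """Comparisons used, relative to log2(n!), minus 1."""
    return comparisons / lower_bound(n) - 1


X = [5, 1, 8, 7, 2, 6, 4, 3]  # the list from Figure 1
sorted_X, comparisons = merge_sort_count(X)
overhead = relative_overhead(comparisons, len(X))
```

On this particular list the merge sort uses 15 comparisons, slightly below $\log_2(8!) \approx 15.3$. That is possible because the bound constrains the worst case and the average over all permutations, not each individual input, which is why Figure 5 averages over many random lists.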