Parallelizing the Computation of Robustness for Measuring the Strength of Tuples
Davide Martinenghi
TL;DR
This work addresses the computation of grid resistance, a robustness indicator for skyline tuples, by adapting established partitioning-based parallel skyline algorithms. It formalizes skyline dominance, grid projection, and grid resistance, and analyzes three partitioning strategies (Grid, Angular, Sliced) together with a practical algorithm that computes $\mathsf{gres}$ by repeatedly recomputing skylines on grid-projected data. Extensive experiments on synthetic and real datasets show that parallelization yields meaningful speedups, that Sliced often provides the most stable performance, and that overly aggressive partitioning and representative filtering offer only limited benefits. The results suggest practical guidelines for implementing parallel robustness computations and point to possible extensions to other dominance-based indicators.
Abstract
Several indicators have recently been proposed for measuring various characteristics of the tuples of a dataset, in particular the so-called skyline tuples, i.e., those that are not dominated by other tuples. Numeric indicators are important because they can, e.g., provide an additional criterion for ranking skyline tuples and focusing on a subset thereof. We concentrate on an indicator of robustness that can be measured for any skyline tuple $t$: grid resistance, i.e., how large a perturbation of its values $t$ can tolerate while remaining non-dominated (and thus in the skyline). Computing this indicator typically involves one or more rounds of computation of the skyline itself or, at least, of dominance relationships. Building on recent advances in partitioning strategies that allow skylines to be computed in parallel, we discuss how these strategies can be adapted to the computation of the indicator.
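
To make the notions of dominance, skyline, and grid resistance concrete, the following is a minimal sequential sketch in Python; it is an illustration under stated assumptions, not the algorithm developed in this work. It assumes that attribute values are normalized to $[0,1]$, that lower values are preferred, that grid projection snaps each value to the index of its grid cell, and that the grid resistance of $t$ is the smallest tested grid step at which the projected $t$ becomes dominated (1 if that never happens). The function names and the `max_cells` parameter are hypothetical.

```python
from math import floor


def dominates(s, t):
    """s dominates t: no worse on every attribute, strictly better on one.

    Assumption: lower values are preferred.
    """
    return all(a <= b for a, b in zip(s, t)) and any(a < b for a, b in zip(s, t))


def skyline(rows):
    """Naive quadratic skyline: keep the tuples that no other tuple dominates."""
    return [t for t in rows if not any(dominates(s, t) for s in rows)]


def grid_project(t, cells):
    """Map each value in [0, 1] to the integer index of its grid cell.

    Integer indices are used instead of snapped floats so that the
    dominance test on projected tuples is not affected by rounding.
    """
    return tuple(floor(v * cells) for v in t)


def grid_resistance(t, rows, max_cells=10):
    """Smallest tested grid step 1/cells at which t leaves the skyline.

    Assumption: only steps 1/max_cells, ..., 1/2, 1/1 are tested, from the
    finest to the coarsest; 1.0 is returned if t is never dominated.
    """
    for cells in range(max_cells, 0, -1):
        projected_t = grid_project(t, cells)
        if any(dominates(grid_project(s, cells), projected_t) for s in rows):
            return 1.0 / cells
    return 1.0


if __name__ == "__main__":
    data = [(0.2, 0.6), (0.3, 0.5), (0.9, 0.1)]
    print(skyline(data))
    print(grid_resistance((0.3, 0.5), data, max_cells=5))
```

Each tested grid step requires a fresh round of dominance checks over the projected data, which is the part of the computation that a partitioning-based parallel strategy would distribute.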
