Computing All Restricted Skyline Probabilities on Uncertain Datasets

Xiangyu Gao, Jianzhong Li, Dongjing Miao

TL;DR

This work studies computing all restricted skyline probabilities (ARSP) on uncertain datasets under a personalized, monotone scoring function family $\mathcal{F}$. It first establishes a conditional lower bound via the Orthogonal Vectors conjecture, then develops efficient ARSP algorithms for linear weight constraints, including a near-optimal mapping-based approach and a practical branch-and-bound variant. It also presents sublinear-time techniques under weight-ratio constraints by reducing dominance tests to half-space reporting, with multi-level and shift strategies offering different trade-offs. Extensive experiments on real and synthetic data demonstrate ARSP’s effectiveness and the scalability of the proposed algorithms relative to traditional aggregated skyline approaches. The results highlight ARSP’s potential to provide richer, user-tailored decision support in uncertain multi-criteria settings.

Abstract

The restricted skyline (rskyline) query is widely used in multi-criteria decision making. It generalizes the skyline query by additionally considering a set of personalized scoring functions $\mathcal{F}$. Since uncertainty is inherent in datasets for multi-criteria decision making, we study rskyline queries on uncertain datasets from both the complexity and the algorithmic perspectives. We formalize the problem of computing the rskyline probabilities of all data items and show that no algorithm can solve this problem in truly subquadratic time unless the Orthogonal Vectors conjecture fails. Considering that linear scoring functions are widely used in practical applications, we propose two efficient algorithms for the case where $\mathcal{F}$ is a set of linear scoring functions whose weights are described by linear constraints, one with near-optimal time complexity and the other with better expected time complexity. For special linear constraints involving a series of weight ratios, we further devise an algorithm with sublinear query time and polynomial preprocessing time. Extensive experiments demonstrate the effectiveness, efficiency, scalability, and usefulness of our proposed algorithms.

Paper Structure (19 sections, 7 theorems, 8 equations, 10 figures, 2 tables, 2 algorithms)

Key Result

Theorem 1

Given an uncertain dataset $\mathcal{D}$ and a set of monotone scoring functions $\mathcal{F}$, no algorithm can compute rskyline probabilities of all instances within $O(n^{2-\delta})$ time for any $\delta > 0$, unless the Orthogonal Vectors conjecture fails.
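
To make the quantity in Theorem 1 concrete, here is a minimal brute-force sketch of the quadratic-time baseline that the lower bound says cannot be fundamentally beaten. It rests on several assumptions that this summary does not state: lower attribute values are better; $\mathcal{F}$ is a set of linear scoring functions whose weight region is a polytope given by its vertices, so that $\mathcal{F}$-dominance can be checked at the vertices alone; and the rskyline probability of an instance $t$ of object $T_i$ is $\Pr(t) \cdot \prod_{j \ne i} \bigl(1 - \sum_{s \in T_j,\ s \prec_{\mathcal{F}} t} \Pr(s)\bigr)$. The names `f_dominates` and `all_rskyline_probabilities` are hypothetical and are not the paper's algorithms.

```python
def score(omega, point):
    """Linear score of a point under weight vector omega."""
    return sum(w * x for w, x in zip(omega, point))


def f_dominates(t, s, vertices):
    """Assumed vertex test: t F-dominates s if t scores no worse than s
    at every vertex of the weight polytope and strictly better at one.
    Lower scores are treated as better (an assumption of this sketch)."""
    diffs = [score(omega, t) - score(omega, s) for omega in vertices]
    return all(d <= 0 for d in diffs) and any(d < 0 for d in diffs)


def all_rskyline_probabilities(objects, vertices):
    """Brute-force computation over all pairs of instances.

    objects: list of uncertain objects, each a list of (point, probability)
             instances whose probabilities sum to at most 1.
    Returns one list of probabilities per object, in instance order.
    """
    result = []
    for i, obj in enumerate(objects):
        row = []
        for t, p_t in obj:
            prob = p_t
            for j, other in enumerate(objects):
                if j == i:
                    continue  # instances of the same object are mutually exclusive
                # Total probability that object j appears as an instance dominating t.
                dominating_mass = sum(p_s for s, p_s in other
                                      if f_dominates(s, t, vertices))
                prob *= 1.0 - dominating_mass
            row.append(prob)
        result.append(row)
    return result


# Toy input: weights on the simplex with omega_1 >= omega_2,
# whose polytope has vertices (1, 0) and (1/2, 1/2).
vertices = [(1.0, 0.0), (0.5, 0.5)]
objects = [
    [((1, 4), 0.5), ((3, 3), 0.5)],  # object A with two instances
    [((2, 2), 1.0)],                 # object B with one certain instance
]
probabilities = all_rskyline_probabilities(objects, vertices)
```

In this toy input, B's only instance dominates A's second instance at both vertices, so that instance has probability 0, while A's first instance and B's instance are dominated by nothing and keep their existence probabilities. The nested loops compare every instance against every other instance, which is the quadratic behaviour the theorem refers to.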

Figures (10)

  • Figure 1: An uncertain dataset $\mathcal{D}$ of 4 objects and 10 instances.
  • Figure 2: Running example for $kd$-ASP$^*$.
  • Figure 3: An illustration of the reduction to half-space reporting problem and performing point location queries in dual space.
  • Figure 4: Boxplots of players' scores under $\omega_1 = (1, 0, 0)$, $\omega_2 = (1/2, 1/2, 0)$, and $\omega_3 = (1/3, 1/3, 1/3)$, where average is marked with red dotted lines.
  • Figure 5: Running time of different algorithms and the size of ARSP on synthetic datasets.
  • ...and 5 more figures

Theorems & Definitions (12)

  • Example 1
  • Theorem 1
  • Proof
  • Theorem 2: $\mathcal{F}$-dominance test (Ciaccia and Martinenghi, PVLDB 2017)
  • Example 2
  • Theorem 3
  • Theorem 4
  • Lemma 1
  • Theorem 5: Efficient $\mathcal{F}$-dominance test
  • Example 3
  • ...and 2 more