Preference Queries over Taxonomic Domains

Paolo Ciaccia, Davide Martinenghi, Riccardo Torlone

TL;DR

The paper addresses retrieving the best data items when user preferences are expressed over taxonomic domains and may conflict or be stated at different levels of granularity. It models data as t-relations with taxonomies and expresses preferences through first-order formulas, then rewrites them with two operators: Transitive Closure $\textsf{T}$, which enforces transitivity and thereby guarantees soundness of the result, and Specificity-based Refinement $\textsf{S}$, which resolves conflicts among preferences. It proves that $\textsf{T}$ and $\textsf{S}$ do not commute and that no sequence of these operators guarantees both transitivity and fully conflict-free results, but identifies two minimal-transitive sequences, $\textsf{T}\textsf{S}\textsf{T}$ and $\textsf{S}\textsf{T}\textsf{S}\textsf{T}$, along with a heuristic method for selecting the best results. Experiments on synthetic and real datasets show low rewriting overhead, substantial pruning of the candidate set, and significant speedups when using the proposed heuristics, supporting the practical feasibility of the approach. The work advances query processing over taxonomic domains by providing a principled treatment of specificity and transitivity in preferences and a scalable strategy for best-result computation.
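To make the role of transitivity concrete, the following is a minimal sketch, not the paper's formula-rewriting procedure. It assumes preferences are given extensionally as a finite set of `(better, worse)` pairs rather than as first-order formulas, and it assumes the best results are the items to which no other item is strictly preferred. The function names and the wine labels are hypothetical and chosen only for illustration.

```python
from itertools import product


def transitive_closure(pairs):
    """Return the transitive closure of a set of (better, worse) pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so the set can grow during the pass.
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure


def best(items, prefers):
    """Keep the items that no other item is strictly preferred to.

    Assumption: u is strictly preferred to t when (u, t) is in `prefers`
    and (t, u) is not.
    """
    def strictly_better(u, t):
        return (u, t) in prefers and (t, u) not in prefers

    return [t for t in items if not any(strictly_better(u, t) for u in items)]


# Hypothetical stated preferences: barolo over chianti, chianti over soave.
stated = {("barolo", "chianti"), ("chianti", "soave")}
# Hypothetical data: chianti is not in the database.
wines = ["barolo", "soave"]

without_closure = best(wines, stated)
with_closure = best(wines, transitive_closure(stated))
```

In this sketch, the stated preferences alone never compare `barolo` with `soave` directly, so both survive; after closure the derived pair `("barolo", "soave")` prunes `soave`. This only illustrates why a non-transitive preference relation can return unsound results; conflict resolution by specificity is not modelled here.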

Abstract

When composing multiple preferences characterizing the most suitable results for a user, several issues may arise. Indeed, preferences can be partially contradictory, suffer from a mismatch with the level of detail of the actual data, and even lack natural properties such as transitivity. In this paper we formally investigate the problem of retrieving the best results complying with multiple preferences expressed in a logic-based language. Data are stored in relational tables with taxonomic domains, which allow the specification of preferences also over values that are more generic than those in the database. In this framework, we introduce two operators that rewrite preferences for enforcing the important properties of transitivity, which guarantees soundness of the result, and specificity, which solves all conflicts among preferences. Although, as we show, these two properties cannot be fully achieved together, we use our operators to identify the only two alternatives that ensure transitivity and minimize the residual conflicts. Building on this finding, we devise a technique, based on an original heuristic, for selecting the best results according to the two possible alternatives. We finally show, with a number of experiments over both synthetic and real-world datasets, the effectiveness and practical feasibility of the overall approach.

Paper Structure (21 sections, 24 theorems, 27 equations, 11 figures, 1 table, 2 algorithms)

Key Result

Lemma 1

A preference statement $P_i(x,y)$ is more specific than $P_j(y,x)$ iff $P_i(x,y)$ implies $P_j(y,x)$ (written $P_i(x,y) \rightarrow P_j(y,x)$) and the opposite does not hold. The hypothesis that $P_j(y,x)$ does not imply $P_i(x,y)$ excludes the case of opposite preference statements (e.g., white is b

Figures (11)

  • Figure 1: A list of wines
  • Figure 2: Taxonomies for the running example.
  • Figure 3: A set of wines for Example \ref{ex:non-transitive-algorithm}.
  • Figure 4: A transitively reduced graph showing containment between sequences. Dashed border for incomplete sequences; grey background for non-transitive sequences; blue background for minimal-transitive sequences. All containment relationships are strict.
  • Figure 5: Time for computing the formula: various settings.
  • ...and 6 more figures

Theorems & Definitions (57)

  • Example 1
  • Definition 1: Taxonomy
  • Example 2
  • Definition 2: t-relation, t-schema, t-tuple
  • Example 3
  • Definition 3: Preference relation
  • Definition 4: Incomparability and Indifference
  • Example 4
  • Definition 5: Best operator
  • Example 5
  • ...and 47 more