Preference Queries over Taxonomic Domains

Paolo Ciaccia; Davide Martinenghi; Riccardo Torlone

TL;DR

The paper addresses the retrieval of the best data items when user preferences are expressed over taxonomic domains and may conflict with each other or be stated at a different level of detail than the data. It models data as t-relations with taxonomies and expresses preferences through first-order formulas, then rewrites them with two operators: Transitive Closure $\textsf{T}$, which enforces transitivity and thereby guarantees sound results, and Specificity-based Refinement $\textsf{S}$, which resolves conflicts. It proves that $\textsf{T}$ and $\textsf{S}$ do not commute and that no sequence of the two guarantees both transitivity and conflict-free results, but identifies two minimal-transitive sequences, $\textsf{T}\textsf{S}\textsf{T}$ and $\textsf{S}\textsf{T}\textsf{S}\textsf{T}$, along with a heuristic method for computing the best results under either sequence. Experiments on synthetic and real datasets show low rewriting overhead, substantial pruning of the candidate set, and significant speedups when using the proposed heuristic, supporting the practical feasibility of the approach. The work advances query processing over taxonomic domains by providing a principled treatment of specificity and transitivity in preferences and a scalable strategy for best-result computation.

Abstract

When composing multiple preferences characterizing the most suitable results for a user, several issues may arise. Indeed, preferences can be partially contradictory, suffer from a mismatch with the level of detail of the actual data, and even lack natural properties such as transitivity. In this paper we formally investigate the problem of retrieving the best results complying with multiple preferences expressed in a logic-based language. Data are stored in relational tables with taxonomic domains, which allow the specification of preferences also over values that are more generic than those in the database. In this framework, we introduce two operators that rewrite preferences for enforcing the important properties of transitivity, which guarantees soundness of the result, and specificity, which solves all conflicts among preferences. Although, as we show, these two properties cannot be fully achieved together, we use our operators to identify the only two alternatives that ensure transitivity and minimize the residual conflicts. Building on this finding, we devise a technique, based on an original heuristic, for selecting the best results according to the two possible alternatives. We finally show, with a number of experiments over both synthetic and real-world datasets, the effectiveness and practical feasibility of the overall approach.
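
As a rough illustration of why transitivity matters for the soundness of the result, the sketch below treats a preference relation as a finite set of (better, worse) pairs and computes the undominated items before and after a transitive closure. It is a minimal sketch under stated assumptions, not the paper's operators: the item names `w1`, `w2`, `w3` and the functions `transitive_closure` and `best` are hypothetical, and the paper works with first-order formulas over t-relations rather than explicit sets of pairs.

```python
from itertools import product

def transitive_closure(pref):
    """Return the transitive closure of a set of (better, worse) pairs."""
    closure = set(pref)
    items = {x for pair in pref for x in pair}
    # Warshall-style pass: k (the slowest-varying index) is the intermediate item.
    for k, i, j in product(items, repeat=3):
        if (i, k) in closure and (k, j) in closure:
            closure.add((i, j))
    return closure

def best(items, pref):
    """Items of `items` that no other item of `items` is preferred to."""
    return {t for t in items if not any((s, t) in pref for s in items)}

# Hypothetical preferences: w1 over w2, and w2 over w3.
pref = {("w1", "w2"), ("w2", "w3")}

# Assumed data: only w1 and w3 are stored; w2 appears in preferences only.
stored = {"w1", "w3"}

without_closure = best(stored, pref)                   # w3 is not dominated
with_closure = best(stored, transitive_closure(pref))  # w1 dominates w3
```

Without the closure, `w3` survives because the only item dominating it directly (`w2`) is absent from the data; after the closure, the derived pair (`w1`, `w3`) removes it.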

Paper Structure (21 sections, 24 theorems, 27 equations, 11 figures, 1 table, 2 algorithms)

Introduction
Preliminaries
Data Model
Preference Model
Operations on Preferences
Transitive Closure
Specificity-based Refinement
Minimal-Transitive Sequences
Basic properties
The space of possible sequences
Minimality and transitivity
Computing the Best Results
Worst-case difference between $\textsf{T}\textsf{S}\textsf{T}$ and $\textsf{S}\textsf{T}\textsf{S}\textsf{T}$
A heuristics for computing the best results
Experiments
...and 6 more sections

Key Result

Lemma 1

A preference statement $P_i(x,y)$ is more specific than $P_j(y,x)$ iff $P_i(x,y)$ implies $P_j(y,x)$ (written $P_i(x,y) \rightarrow P_j(y,x)$) and the opposite does not hold. The hypothesis that $P_j(y,x)$ does not imply $P_i(x,y)$ excludes the case of opposite preference statements (e.g., white is b
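
The implication test in Lemma 1 can be checked by brute force on a small finite domain. The following is a minimal sketch under simplifying assumptions: a preference statement is reduced to a pair of taxonomy classes `(better, worse)` rather than a general first-order formula, implication is tested by enumerating all value pairs, and the taxonomy, the wine names, and the helper names `is_a`, `holds`, and `more_specific` are all hypothetical (they are not taken from the paper's figures).

```python
from itertools import product

# Hypothetical taxonomy (child -> parent); not the paper's Figure 2.
PARENT = {"red": "wine", "white": "wine",
          "Amarone": "red", "Barolo": "red", "Arneis": "white"}
VALUES = set(PARENT) | set(PARENT.values())

def is_a(value, ancestor):
    """True if value equals ancestor or lies below it in the taxonomy."""
    while value is not None:
        if value == ancestor:
            return True
        value = PARENT.get(value)
    return False

def holds(stmt, x, y):
    """stmt = (better, worse): x is preferred to y under this statement."""
    better, worse = stmt
    return is_a(x, better) and is_a(y, worse)

def more_specific(p_i, p_j):
    """P_i(x, y) implies P_j(y, x) on every pair, but not the converse."""
    pairs = list(product(VALUES, repeat=2))
    forward = all(holds(p_j, y, x) for x, y in pairs if holds(p_i, x, y))
    backward = all(holds(p_i, x, y) for x, y in pairs if holds(p_j, y, x))
    return forward and not backward

amarone_over_white = ("Amarone", "white")
white_over_red = ("white", "red")
red_over_white = ("red", "white")
```

Under these assumptions, `more_specific(amarone_over_white, white_over_red)` is true, because every Amarone is a red wine while not every red wine is an Amarone. For the exactly opposite statements `white_over_red` and `red_over_white`, the implication holds in both directions, so neither counts as more specific than the other, which matches the exclusion stated in the lemma.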

Figures (11)

Figure 1: A list of wines.
Figure 2: Taxonomies for the running example.
Figure 3: A set of wines for Example \ref{ex:non-transitive-algorithm}.
Figure 4: A transitively reduced graph showing containment between sequences. Dashed border for incomplete sequences; grey background for non-transitive sequences; blue background for minimal-transitive sequences. All containment relationships are strict.
Figure 5: Time for computing the formula: various settings.
...and 6 more figures

Theorems & Definitions (57)

Example 1
Definition 1: Taxonomy
Example 2
Definition 2: t-relation, t-schema, t-tuple
Example 3
Definition 3: Preference relation
Definition 4: Incomparability and Indifference
Example 4
Definition 5: Best operator
Example 5
...and 47 more
