Guiding Catalogue Enrichment with User Queries

Yupei Du; Jacek Golebiowski; Philipp Schmidt; Ziawasch Abedjan

TL;DR

The paper addresses the low precision and low relevance of knowledge graph completion (KGC) predictions when enriching dynamic product catalogues. It proposes a query-guided (QG) triplet prediction approach that extracts entity-predicate pairs from user SELECT queries and uses them to guide KGE-based predictions, reducing the candidate search space from $|\mathcal{E}| \times |\mathcal{P}| \times |\mathcal{E}|$ to $|\mathcal{E}|$ for each query-derived entity-predicate pair. The method is evaluated on public KGs (DBPedia and YAGO 4) with query logs, showing substantial improvements in both automatic and human evaluations over a rejection-sampling baseline and alternative forms of guidance. The authors also provide an open dataset of 1600 annotated entity-predicate pairs and compare query guidance with KG metadata and embedding-score guidance, demonstrating the practical benefits of using user queries for open and commercial KGs. The work has clear implications for improving catalogue enrichment and search experiences in real-world systems.

Abstract

Techniques for knowledge graph (KG) enrichment have become increasingly crucial for commercial applications that rely on evolving product catalogues. However, because of the huge search space of potential enrichments, predictions from KG completion (KGC) methods suffer from low precision, making them unreliable for real-world catalogues. Moreover, candidate facts for enrichment have varied relevance to users. While making correct predictions for incomplete triplets in KGs has been the main focus of KGC methods, the question of when such predictions are relevant has been neglected. Motivated by the product search use case, we address the problem of generating relevant completions for a catalogue using user search behaviour and the properties users associate with a product. In this paper, we present our intuition for identifying enrichable data points and use general-purpose KGs to showcase the performance benefits. In particular, we extract entity-predicate pairs from user queries, which are more likely to be correct and relevant, and use these pairs to guide the predictions of KGC methods. We assess our method on two popular encyclopedic KGs, DBPedia and YAGO 4. Our results from both automatic and human evaluations show that query guidance can significantly improve the correctness and relevance of predictions.
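The guidance step described in the abstract can be illustrated with a minimal sketch. It assumes queries are SPARQL SELECT strings containing triple patterns of the form `<entity> <predicate> ?var`. The regular expression, the function names `extract_pairs` and `guided_candidates`, and the toy identifiers (`Marie_Curie`, `birthPlace`) are hypothetical choices made for illustration, not the authors' implementation. A real query log would need a proper SPARQL parser.

```python
import re
from collections import Counter

# Hypothetical, simplified pattern: one triple pattern "<entity> <predicate> ?var".
TRIPLE_PATTERN = re.compile(r"<([^<>\s]+)>\s+<([^<>\s]+)>\s+\?\w+")
SELECT_KEYWORD = re.compile(r"\bSELECT\b", re.IGNORECASE)


def extract_pairs(queries):
    """Count (entity, predicate) pairs whose object is a variable in SELECT queries."""
    pairs = Counter()
    for query in queries:
        if not SELECT_KEYWORD.search(query):
            continue  # only SELECT queries are used for guidance
        for entity, predicate in TRIPLE_PATTERN.findall(query):
            pairs[(entity, predicate)] += 1
    return pairs


def guided_candidates(pairs, entities, known_triples):
    """Yield candidate triples (head, predicate, tail) for query-derived pairs only.

    With head and predicate fixed, only the tail varies, so each pair
    contributes at most len(entities) candidates.
    """
    for head, predicate in pairs:
        for tail in entities:
            if (head, predicate, tail) not in known_triples:
                yield head, predicate, tail


# Toy inputs (made up for illustration).
queries = [
    "SELECT ?place WHERE { <Marie_Curie> <birthPlace> ?place }",
    "SELECT ?place WHERE { <Marie_Curie> <birthPlace> ?place }",
    "ASK { <Marie_Curie> <birthPlace> <Warsaw> }",
]
entities = ["Marie_Curie", "Warsaw", "Paris"]
known_triples = {("Marie_Curie", "deathPlace", "Paris")}

pairs = extract_pairs(queries)
candidates = list(guided_candidates(pairs, entities, known_triples))
```

In this sketch the candidates would then be ranked by a KGE scoring function, which is left out here. The point is only that fixing the entity and predicate from a query leaves the tail entity as the single free slot.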

Paper Structure (27 sections, 2 figures, 3 tables)

This paper contains 27 sections, 2 figures, 3 tables.

Introduction
Limitations in KGC:
Contributions:
Background and Related Work
Knowledge Graphs
Knowledge Graph Embeddings and RotatE
Rule-Based Knowledge Graph Completion
Query-Guided Triplet Prediction
Prediction from KGE using Rejection Sampling (RS)
Guided Prediction with Queries (QG)
Comparison with Selecting Top-k Queries
Evaluation and Results
Experimental Setup
KGs and Query Logs
Pre-processing of KGs and Query logs
...and 12 more sections

Figures (2)

Figure 1: An example of using query logs to guide prediction. In this example, we can make predictions for the entity "Marie Curie" using any of the predicates "birthplace", "head quarter", and "associated band". Because the query selects the birthplace of Marie Curie, we make predictions for this entity-predicate pair.
Figure 2: Automatic evaluation of embedding score guidance. The y-axis shows the precision of each group, and the x-axis shows the indices of the groups sorted by embedding score, where a larger group index corresponds to a lower embedding score. Embedding score guidance helps missing-triplet prediction, but performs worse than query guidance.
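
The grouping behind Figure 2 can be sketched as follows. This is a minimal illustration that assumes predictions are split into equally sized groups after sorting by embedding score. The function name `precision_by_score_group`, the group count, and the toy scores are assumptions made here. The caption does not state how the groups were formed.

```python
def precision_by_score_group(predictions, num_groups):
    """Return the precision of each group, ordered from highest to lowest score.

    predictions: list of (embedding_score, is_correct) tuples.
    Groups are near-equal in size (an assumption of this sketch).
    """
    ranked = sorted(predictions, key=lambda p: p[0], reverse=True)
    size, remainder = divmod(len(ranked), num_groups)
    precisions = []
    start = 0
    for group_index in range(num_groups):
        # The first `remainder` groups take one extra item each.
        end = start + size + (1 if group_index < remainder else 0)
        group = ranked[start:end]
        correct = sum(1 for _, is_correct in group if is_correct)
        precisions.append(correct / len(group) if group else 0.0)
        start = end
    return precisions


# Toy data (made up): higher scores happen to be correct more often.
toy_predictions = [
    (0.9, True), (0.8, True), (0.7, False),
    (0.6, True), (0.3, False), (0.1, False),
]
group_precisions = precision_by_score_group(toy_predictions, num_groups=3)
```

With this toy input, `group_precisions` is `[1.0, 0.5, 0.0]`, showing the kind of downward trend across group indices that the caption describes.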
