
Guiding Catalogue Enrichment with User Queries

Yupei Du, Jacek Golebiowski, Philipp Schmidt, Ziawasch Abedjan

TL;DR

The paper tackles the enrichment of dynamic product catalogues, where knowledge graph completion (KGC) suffers from low precision and produces facts of uneven relevance. It proposes a query-guided (QG) triplet prediction approach: entity-predicate pairs are extracted from user SELECT queries and used to guide KGE-based predictions. Because the head entity and the predicate are fixed by the query, only the tail entity remains to be predicted, which reduces the candidate search space from $|\mathcal{E}| \times |\mathcal{P}| \times |\mathcal{E}|$ to $|\mathcal{E}|$. The method is evaluated on public KGs (DBPedia and YAGO 4) with query logs and shows substantial improvements in both automatic and human evaluations over a rejection-sampling baseline and alternative forms of guidance. The authors also provide an open dataset of 1600 annotated entity-predicate pairs and compare query guidance with KG metadata and embedding-score guidance, demonstrating the practical benefits of user queries for open and commercial KGs. The work has clear implications for catalogue enrichment and search experiences in real-world systems.

Abstract

Techniques for knowledge graph (KG) enrichment have become increasingly crucial for commercial applications that rely on evolving product catalogues. However, because of the huge search space of potential enrichments, predictions from KG completion (KGC) methods suffer from low precision, making them unreliable for real-world catalogues. Moreover, candidate facts for enrichment have varied relevance to users. While making correct predictions for incomplete triplets in KGs has been the main focus of KGC methods, the question of when such predictions are relevant has been neglected. Motivated by the product search use case, we address the problem of generating relevant completions for a catalogue using user search behaviour and the properties that users associate with a product. In this paper, we present our intuition for identifying enrichable data points and use general-purpose KGs to showcase the performance benefits. In particular, we extract entity-predicate pairs from user queries, which are more likely to be correct and relevant, and use these pairs to guide the predictions of KGC methods. We assess our method on two popular encyclopedia KGs, DBPedia and YAGO 4. Our results from both automatic and human evaluations show that query guidance can significantly improve the correctness and relevance of predictions.
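
The query-guidance idea can be illustrated with a minimal sketch. It is not the authors' implementation: the regex-based extraction only handles simple `subject predicate ?variable` patterns, the TransE-style scoring function stands in for whichever KGE model is actually used, and the names `extract_pairs`, `predict_tails`, `ENT_EMB`, and `REL_EMB` as well as the two-dimensional embeddings are hypothetical toy values chosen for illustration.

```python
import math
import re

# One RDF term: a full IRI (<...>) or a prefixed name such as dbr:Marie_Curie.
TERM = r"(<[^>\s]+>|[A-Za-z][\w-]*:[\w.()-]+)"
# A triple pattern with a concrete subject and predicate and a variable object.
PATTERN = re.compile(TERM + r"\s+" + TERM + r"\s+\?\w+")


def extract_pairs(query: str) -> list[tuple[str, str]]:
    """Return (entity, predicate) pairs from a simple SPARQL SELECT query."""
    if not re.search(r"\bSELECT\b", query, re.IGNORECASE) or "{" not in query:
        return []
    # Only look inside the outermost braces, so PREFIX lines are ignored.
    body = query[query.find("{") + 1 : query.rfind("}")]
    return PATTERN.findall(body)


def transe_score(h, r, t):
    """TransE-style plausibility (an assumed model): higher is better."""
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def predict_tails(pair, ent_emb, rel_emb, k=1):
    """Rank all other entities as tails for a fixed (entity, predicate) pair.

    The head and predicate come from the query, so only |E| candidates
    are scored instead of |E| * |P| * |E| full triplets.
    """
    head, pred = pair
    scored = [
        (transe_score(ent_emb[head], rel_emb[pred], vec), name)
        for name, vec in ent_emb.items()
        if name != head
    ]
    return [name for _, name in sorted(scored, reverse=True)[:k]]


# Toy data (hypothetical embeddings, for illustration only).
QUERY = "SELECT ?place WHERE { dbr:Marie_Curie dbo:birthPlace ?place . }"
ENT_EMB = {
    "dbr:Marie_Curie": [0.0, 0.0],
    "dbr:Warsaw": [1.0, 0.0],
    "dbr:Paris": [0.0, 1.0],
}
REL_EMB = {"dbo:birthPlace": [1.0, 0.0]}

pairs = extract_pairs(QUERY)
print(pairs)
print(predict_tails(pairs[0], ENT_EMB, REL_EMB))
```

A production system would use a proper SPARQL parser rather than a regular expression, and trained embeddings rather than hand-set vectors. The sketch only shows how fixing the entity-predicate pair turns triplet prediction into a ranking over tail entities.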

Paper Structure (27 sections, 2 figures, 3 tables)


Figures (2)

  • Figure 1: An example of using query logs to guide prediction. In this example, we could make predictions for the entity "Marie Curie" using any of the predicates "birthplace", "head quarter", and "associated band". Because the query selects the birthplace of Marie Curie, we make predictions from this entity-predicate pair.
  • Figure 2: Automatic evaluation of embedding score guidance. The Y-axis shows the precision of each group, and the X-axis shows the group indices sorted by embedding score: the larger the group index, the lower the embedding score. Embedding score guidance helps missing triplet prediction, but less than query guidance does.