Guiding Catalogue Enrichment with User Queries
Yupei Du, Jacek Golebiowski, Philipp Schmidt, Ziawasch Abedjan
TL;DR
The paper addresses the enrichment of dynamic product catalogues, where knowledge graph completion (KGC) suffers from low precision and from predictions of uneven relevance to users. It proposes a query-guided (QG) triplet prediction approach that extracts entity-predicate pairs from user SELECT queries and uses them to guide KGE-based predictions, reducing the candidate search space from $|\mathcal{E}| \times |\mathcal{P}| \times |\mathcal{E}|$ to $|\mathcal{E}|$ for each extracted pair. The method is evaluated on two public KGs with query logs (DBPedia and YAGO 4) and shows substantial improvements in both automatic and human evaluations over a rejection-sampling baseline and over alternative forms of guidance, namely KG metadata and embedding scores. The authors also release an open dataset of 1600 annotated entity-predicate pairs. The results suggest that user queries are a practical guidance signal for enriching both open and commercial KGs and, in turn, for improving catalogue search.
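The size of the search-space reduction mentioned above can be checked with a small calculation. The following is a minimal sketch; the function names and the toy sizes are hypothetical and are not taken from the paper or from DBPedia or YAGO 4.

```python
def full_candidate_space(num_entities: int, num_predicates: int) -> int:
    """Unguided KGC: every (head, predicate, tail) combination is a candidate."""
    return num_entities * num_predicates * num_entities


def guided_candidate_space(num_entities: int) -> int:
    """Query-guided: head and predicate are fixed by a query, so only the tail varies."""
    return num_entities


# Hypothetical toy sizes chosen only for illustration.
n_entities, n_predicates = 1_000, 50
print(full_candidate_space(n_entities, n_predicates))  # 50000000
print(guided_candidate_space(n_entities))              # 1000
```

Even at this toy scale, fixing the entity-predicate pair shrinks the candidate set by a factor of $|\mathcal{E}| \times |\mathcal{P}|$, here 50,000.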
Abstract
Techniques for knowledge graph (KG) enrichment have become increasingly crucial for commercial applications that rely on evolving product catalogues. However, because of the huge search space of potential enrichments, predictions from KG completion (KGC) methods suffer from low precision, making them unreliable for real-world catalogues. Moreover, candidate facts for enrichment vary in their relevance to users. While making correct predictions for incomplete triplets in KGs has been the main focus of KGC methods, the question of when such predictions are relevant has been neglected. Motivated by the product search use case, we address the problem of generating relevant completions for a catalogue using user search behaviour and the properties that users associate with a product. In this paper, we present our intuition for identifying enrichable data points and use general-purpose KGs to showcase the performance benefits. In particular, we extract entity-predicate pairs from user queries, which are more likely to be correct and relevant, and use these pairs to guide the predictions of KGC methods. We assess our method on two popular encyclopedic KGs, DBPedia and YAGO 4. Our results from both automatic and human evaluations show that query guidance can significantly improve the correctness and relevance of predictions.
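The two steps described in the abstract, extracting entity-predicate pairs from queries and using them to guide prediction, can be sketched roughly as follows. This is an illustration under stated assumptions and not the authors' implementation: the regular expression only handles triple patterns written as `<entity> <predicate> ?variable`, the `ex:` identifiers and embedding values are invented, and a TransE-style distance score stands in for whatever KGE model is actually used.

```python
import math
import re
from collections import Counter

# Assumption: only patterns of the form <entity IRI> <predicate IRI> ?variable are handled.
PATTERN = re.compile(r"<([^<>\s]+)>\s+<([^<>\s]+)>\s+\?\w+")


def extract_pairs(queries):
    """Count (entity, predicate) pairs whose object is an unbound variable."""
    counts = Counter()
    for query in queries:
        if not query.lstrip().upper().startswith("SELECT"):
            continue  # only SELECT queries are considered
        counts.update(PATTERN.findall(query))
    return counts


def transe_score(h, r, t):
    """TransE-style plausibility: higher (closer to 0) means h + r is nearer to t."""
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def rank_tails(head, pred, ent_emb, rel_emb, top_k=1):
    """Score every entity as a tail for the fixed (head, pred) pair."""
    h, r = ent_emb[head], rel_emb[pred]
    scored = [(transe_score(h, r, vec), ent)
              for ent, vec in ent_emb.items() if ent != head]
    scored.sort(reverse=True)
    return [ent for _, ent in scored[:top_k]]


# Hypothetical query log and toy embeddings, invented for this example.
queries = [
    "SELECT ?c WHERE { <ex:Berlin> <ex:country> ?c }",
    "select ?x where { <ex:Berlin> <ex:country> ?x }",
    "ASK { <ex:Berlin> <ex:country> ?c }",
]
ent_emb = {"ex:Berlin": [0.0, 0.0], "ex:Germany": [1.0, 0.0], "ex:France": [0.0, 1.0]}
rel_emb = {"ex:country": [1.0, 0.0]}

pairs = extract_pairs(queries)
predictions = {pair: rank_tails(pair[0], pair[1], ent_emb, rel_emb) for pair in pairs}
```

Only pairs that users actually asked about are passed to the ranking step, so the model scores $|\mathcal{E}|$ candidate tails per pair instead of enumerating every possible triplet. The pair counts could additionally serve as a rough popularity signal, although that use is a suggestion here rather than something stated in the abstract.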
