Accurate and Efficient Profile Matching in Knowledge Bases
Jorge Martinez-Gil, Alejandra Lorena Paoletti, Gábor Rácz, Attila Sali, Klaus-Dieter Schewe
TL;DR
This work presents a lattice-based theory of profile matching in knowledge bases: profiles are represented as filters in a lattice derived from a description-logic knowledge base, and matching values lie in the interval $[0,1]$. It establishes a method to learn ranking-preserving matching measures from expert input, turning human judgments into a weighting scheme via linear inequalities, and shows that such measures exist under mild plausibility constraints. The framework supports efficient top-$k$ and gap queries through precomputation and specialized data structures, decoupling knowledge-base maintenance from profile-instance updates. Finally, it extends the core approach to fuzzy and probabilistic settings, including a maximum-entropy probabilistic matching model and maximum-length matching via graph-based extensions, enabling uncertainty-aware profile matching with practical relevance to recruitment and similar domains.
Abstract
A profile describes a set of properties, e.g. a set of skills a person may have, a set of skills required for a particular job, or a set of abilities a football player may have with respect to a particular team strategy. Profile matching aims to determine how well a given profile fits a requested profile. The approach taken in this article is grounded in a matching theory that uses filters in lattices to represent profiles and matching values in the interval [0,1]: the higher the matching value, the better the fit. Such lattices can be derived from knowledge bases that use description logics to represent the knowledge about profiles. A first question is how human expertise concerning the matching can be exploited to obtain the most accurate matchings. It will be shown that, given a set of filters together with matching values assigned by a human expert, a matching measure can be determined under some mild plausibility assumptions such that the computed matching values preserve the rankings given by the expert. A second question concerns the efficient querying of databases of profile instances. For matching queries that return a ranked list of profile instances matching a given one, it will be shown how the corresponding top-k queries can be evaluated on the basis of pre-computed matching values, which in turn allows the maintenance of the knowledge base to be decoupled from the maintenance of profile instances. In addition, it will be shown how matching queries can be exploited for gap queries, which determine how profile instances need to be extended in order to improve their position in the rankings. Finally, the theory of matching will be extended beyond filters, which leads to a matching theory that exploits fuzzy sets or probabilistic logic with maximum entropy semantics. It will be shown that added fuzzy links can be captured by extending the underlying lattice.
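
To make the idea of filters and matching values in [0,1] concrete, the following is a minimal Python sketch, not the article's implementation. It assumes a tiny hypothetical concept hierarchy standing in for the lattice, uniform weights, and a weighted-ratio matching value (the weight of the requested filter that is covered by the given filter, divided by the weight of the requested filter). The abstract does not state this formula; it is an illustrative assumption, and all names (`PARENTS`, `up_closure`, `matching_value`) are invented for the example.

```python
# Hypothetical toy hierarchy: each concept maps to its direct, more general concepts.
PARENTS = {
    "Java": {"Programming"},
    "Python": {"Programming"},
    "Programming": {"Skill"},
    "SQL": {"Databases"},
    "Databases": {"Skill"},
    "Skill": set(),
}


def up_closure(concepts):
    """Return the filter (upward-closed set) generated by the given concepts."""
    closed, stack = set(), list(concepts)
    while stack:
        concept = stack.pop()
        if concept not in closed:
            closed.add(concept)
            stack.extend(PARENTS[concept])
    return closed


def matching_value(given, requested, weight):
    """Assumed measure: weighted share of the requested filter covered by the given one."""
    total = sum(weight[c] for c in requested)
    covered = sum(weight[c] for c in given & requested)
    # An empty request is trivially satisfied (a convention chosen for this sketch).
    return covered / total if total else 1.0


# Uniform weights; in the article's setting, weights would be derived from expert input.
weight = {concept: 1.0 for concept in PARENTS}

requested = up_closure({"Java", "SQL"})      # profile required for a job
candidate_a = up_closure({"Python", "SQL"})  # shares SQL and the general concepts
candidate_b = up_closure({"Java"})           # shares Java and its generalizations

score_a = matching_value(candidate_a, requested, weight)
score_b = matching_value(candidate_b, requested, weight)
```

With these assumptions, `candidate_a` covers four of the five concepts in the requested filter and `candidate_b` covers three, so `candidate_a` ranks higher. Replacing the uniform weights with learned ones changes the scores without changing the code.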
