Automatic Construction of Multi-faceted User Profiles using Text Clustering and its Application to Expert Recommendation and Filtering Problems
Luis M. de Campos, Juan M. Fernández-Luna, Juan F. Huete, Luis Redondo-Expósito
TL;DR
This work tackles expert finding and document filtering by automatically constructing multi-faceted user profiles through text clustering. It presents local and global clustering schemes that uncover latent topics in documents and generate subprofiles, whose scores are fused into an overall ranking of experts. Evaluated on a parliamentary corpus, the approach improves over baselines and shows when global or local clustering is preferable, as well as how the number of clusters and the choice of algorithm affect performance. The findings indicate that clustering-based subprofiles can represent user interests more accurately and improve information access tasks, with practical implications for large-scale, topic-rich domains. Overall, the paper provides a coherent framework for building and using multi-faceted profiles in expert recommendation and document filtering.
Abstract
In today's information age, we are interested not only in accessing multimedia objects such as documents and videos, but also in searching for professional experts, other people, or celebrities, whether for professional needs or just for fun. Information access systems need to be able to extract and exploit various sources of information (usually in text format) about such individuals, and to represent them in a suitable way, usually in the form of a profile. In this article, we tackle the problems of profile-based expert recommendation and document filtering from a machine learning perspective by clustering the experts' textual sources to build profiles and capture the different hidden topics in which the experts are interested. The experts are then represented by means of multi-faceted profiles. Our experiments show that this is a valid technique for improving the performance of expert finding and document filtering.
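The idea of a multi-faceted profile can be illustrated with a minimal sketch. It is not the authors' implementation: the toy documents, the function names (`build_subprofiles`, `score_expert`, `rank_experts`), the plain term-frequency weighting, the deterministic farthest-first seeding, and the use of the maximum subprofile similarity as the fusion rule are all assumptions made for illustration. Each expert's documents are clustered separately (a "local" scheme), each cluster centroid becomes a subprofile, and a query is matched against the best subprofile.

```python
import math
import re
from collections import defaultdict


def normalize(vec):
    """Scale a sparse vector to unit length (empty if it has no mass)."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def vectorize(text):
    """Bag-of-words term-frequency vector, L2-normalized (assumed weighting)."""
    counts = defaultdict(float)
    for term in re.findall(r"[a-z]+", text.lower()):
        counts[term] += 1.0
    return normalize(counts)


def cosine(a, b):
    """Dot product of sparse vectors; equals cosine for unit-length inputs."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def centroid(vectors):
    """Unit-length sum of a group of vectors."""
    total = defaultdict(float)
    for vec in vectors:
        for term, weight in vec.items():
            total[term] += weight
    return normalize(total)


def build_subprofiles(docs, k, iterations=10):
    """Cluster one expert's documents into k subprofiles (spherical k-means)."""
    vectors = [vectorize(d) for d in docs]
    # Deterministic farthest-first seeding: start with document 0, then
    # repeatedly add the document least similar to the seeds chosen so far.
    seeds = [0]
    while len(seeds) < min(k, len(vectors)):
        rest = [i for i in range(len(vectors)) if i not in seeds]
        seeds.append(min(rest, key=lambda i: max(
            cosine(vectors[i], vectors[s]) for s in seeds)))
    centroids = [vectors[s] for s in seeds]
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for vec in vectors:
            best = max(range(len(centroids)),
                       key=lambda c: cosine(vec, centroids[c]))
            groups[best].append(vec)
        # An empty cluster keeps its previous centroid.
        centroids = [centroid(g) if g else c
                     for g, c in zip(groups, centroids)]
    return centroids


def score_expert(query, subprofiles):
    """Fuse subprofile scores by taking the best match (assumed fusion rule)."""
    q = vectorize(query)
    return max((cosine(q, p) for p in subprofiles), default=0.0)


def rank_experts(query, profiles):
    """Return expert names sorted by decreasing fused score."""
    return sorted(profiles, key=lambda e: score_expert(query, profiles[e]),
                  reverse=True)


# Hypothetical toy corpus: expert "A" has two distinct interests.
docs = {
    "A": ["olive oil farming subsidies",
          "farming subsidies for olive growers",
          "hospital waiting lists and health budget",
          "health budget for rural hospital"],
    "B": ["school teachers and education funding",
          "university education grants"],
}
multi = {e: build_subprofiles(d, k=2) for e, d in docs.items()}
single = {e: build_subprofiles(d, k=1) for e, d in docs.items()}

query = "hospital health"
multi_score = score_expert(query, multi["A"])
single_score = score_expert(query, single["A"])
ranking = rank_experts(query, multi)
```

In this toy setting, the health-related query matches expert A's health subprofile more closely than it matches a single profile that mixes farming and health terms, which is the intuition behind splitting a profile into facets. A "global" scheme would instead cluster all documents from all experts together before assigning subprofiles.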
