A Framework for Ranking Content Providers Using Prompt Engineering and Self-Attention Network
Gosuddin Kamaruddin Siddiqi, Deven Santhosh Shah, Radhika Bansal, Askar Kamalov
TL;DR
The paper tackles the challenge of ranking Content Providers for topic-aware content recommendation by integrating explicit user feedback with content-based signals. It advances from a weakly supervised baseline to a ground-truth-driven framework built through Prompt Engineering and judgments guided by subject-matter experts (SMEs), supplemented by a Self-Attention neural network trained on a Listwise Ranking objective to address cold-start providers. LightGBM Pairwise ranking serves as a comparative baseline; the Self-Attention approach delivers stronger listwise performance and scales across topics, languages, and regions. Online A/B experiments show improvements in alignment with brand missions, user engagement, and content quality. The paper notes limitations for Local Content and suggests geography-aware extensions for further gains.
Abstract
This paper addresses the problem of ranking Content Providers for a Content Recommendation System. Content Providers are the sources of news and other types of content, such as lifestyle, travel, and gardening. We propose a framework that leverages explicit user feedback, such as clicks and reactions, and content-based features, such as writing style and publishing frequency, to rank Content Providers for a given topic. We also use language models to engineer prompts that help us create a ground truth dataset for what was previously an unsupervised ranking problem. Using this ground truth, we then train a self-attention-based network on a listwise Learning to Rank task. We evaluate our framework using online experiments and show that it can improve the quality, credibility, and diversity of the content recommended to users.
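To make the listwise idea concrete, the following is a minimal NumPy sketch rather than the authors' implementation. It assumes a single-head self-attention layer that scores every Content Provider in a topic's candidate list jointly, and it assumes a ListNet-style top-one softmax cross-entropy as the listwise loss; the abstract does not specify the architecture or the loss. All names (`self_attention_scores`, `listnet_loss`, the weight matrices, the toy features and relevance labels) are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention_scores(features, w_q, w_k, w_v, w_out):
    """Score all providers in one list with a single-head self-attention layer.

    features: (n_providers, d) matrix, one row per provider (hypothetical features).
    Returns one scalar score per provider, shape (n_providers,).
    """
    q, k, v = features @ w_q, features @ w_k, features @ w_v
    attn = softmax(q @ k.T / np.sqrt(k.shape[-1]), axis=-1)  # (n, n) attention weights
    context = attn @ v                                        # each provider attends to the whole list
    return context @ w_out

def listnet_loss(scores, relevance):
    """ListNet top-one loss (an assumed choice): cross-entropy between the
    softmax of the ground-truth relevance and the softmax of the scores."""
    target = softmax(np.asarray(relevance, dtype=float))
    pred = softmax(np.asarray(scores, dtype=float))
    return float(-(target * np.log(pred + 1e-12)).sum())

# Toy list: 5 providers for one topic, 4 features each (random, illustrative only).
rng = np.random.default_rng(0)
n, d = 5, 4
features = rng.normal(size=(n, d))
relevance = np.array([3, 0, 1, 2, 0])  # hypothetical ground-truth grades

w_q, w_k, w_v = (rng.normal(size=(d, d)) for _ in range(3))
w_out = rng.normal(size=d)

scores = self_attention_scores(features, w_q, w_k, w_v, w_out)
loss = listnet_loss(scores, relevance)
ranking = np.argsort(-scores)  # provider indices, best score first
```

In training, the weights would be updated to reduce `loss` over many such lists. Because the score of each provider depends on the other providers in the list through the attention weights, the model is fitted to whole lists rather than to isolated pairs, which is the distinction between a listwise objective and a pairwise baseline.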
