Solving the Content Gap in Roblox Game Recommendations: LLM-Based Profile Generation and Reranking
Chen Wang, Xiaokai Wei, Yexi Jiang, Frank Ong, Kevin Gao, Xiao Yu, Zheng Hui, Se-eun Yoon, Philip Yu, Michelle Gong
TL;DR
The paper tackles the content gap in Roblox game recommendations caused by noisy and sparse in-game text. It introduces a two-stage approach. First, Game Profile Generation extracts raw in-game text and uses LLMs to generate structured, JSON-formatted game profiles covering genre, objectives, mechanics, audience, language, and scale. Second, an LLM-based reranker builds a user profile from recent play history and crafts a personalized reranking strategy to refine the top-30 recommendations. The method is evaluated on Roblox data using GPT-4o against a baseline ID-based model and several content-based baselines, with NDCG Engagement as the primary metric. Results show that the personalized LLM-based reranker consistently improves ranking quality, especially at the top of the list, and increases user engagement. Ablation studies highlight the importance of personalization and show that GPT-4o outperforms smaller LLMs. The work demonstrates a scalable framework for content-driven recommendations in a dynamic, user-generated ecosystem. It also suggests practical extensions, such as fine-tuning on Roblox data and incorporating multimodal signals, to further enhance personalization and integrity detection in production settings.
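To make the evaluation metric concrete, here is a minimal sketch of NDCG@k over a ranked list. This section does not define how "NDCG Engagement" computes its gains, so the sketch assumes each recommended game carries a non-negative engagement score; the function names and the gain definition are illustrative assumptions, not the authors' implementation.

```python
import math
from typing import Sequence


def dcg_at_k(gains: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the first k positions.

    gains[i] is the (assumed) engagement score of the game shown at rank i.
    """
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))


def ndcg_at_k(gains: Sequence[float], k: int) -> float:
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(gains, reverse=True), k)
    # If no item has any gain, the ranking cannot be scored; return 0.0.
    return dcg_at_k(gains, k) / ideal if ideal > 0 else 0.0
```

Because the discount grows with rank, moving an engaging game from a low position toward the top raises the score more than an equivalent swap further down, which is why gains at the top of the list matter most under this metric.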
Abstract
With the vast and dynamic user-generated content on Roblox, creating effective game recommendations requires a deep understanding of game content. Traditional recommendation models struggle with the inconsistent and sparse nature of game text features such as titles and descriptions. Recent advancements in large language models (LLMs) offer opportunities to enhance recommendation systems by analyzing in-game text data. This paper addresses two challenges: generating high-quality, structured text features for games without extensive human annotation, and validating these features to ensure they improve recommendation relevance. We propose an approach that extracts in-game text and uses LLMs to infer attributes such as genre and gameplay objectives from raw player interactions. Additionally, we introduce an LLM-based re-ranking mechanism to assess the effectiveness of the generated text features, enhancing personalization and user satisfaction. Beyond recommendations, our approach supports applications such as user engagement-based integrity detection, already deployed in production. This scalable framework demonstrates the potential of in-game text understanding to improve recommendation quality on Roblox and adapt recommendations to its unique, user-generated ecosystem.
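The two stages can be sketched in code under stated assumptions. The profile fields below follow the attributes named in the TL;DR (genre, objectives, mechanics, audience, language, scale), but the exact JSON schema, the helper names `parse_profile` and `apply_rerank`, and the fallback behavior are hypothetical. The LLM call itself is left out; the sketch covers only validating its JSON output and applying a returned ordering to the top-30 candidates.

```python
import json
from dataclasses import dataclass, fields
from typing import List


@dataclass
class GameProfile:
    # Field names mirror the attributes listed in the TL;DR; the real schema is not shown.
    genre: str
    objectives: str
    mechanics: str
    audience: str
    language: str
    scale: str


def parse_profile(raw_json: str) -> GameProfile:
    """Validate an LLM's JSON output and convert it to a GameProfile."""
    data = json.loads(raw_json)
    names = [f.name for f in fields(GameProfile)]
    missing = [n for n in names if n not in data]
    if missing:
        raise ValueError(f"profile is missing fields: {missing}")
    return GameProfile(**{n: str(data[n]) for n in names})


def apply_rerank(candidates: List[str], llm_order: List[str], top_n: int = 30) -> List[str]:
    """Reorder the first top_n candidates by the LLM's ordering.

    Unknown or duplicate IDs from the LLM are ignored, and any top_n
    candidate the LLM omitted keeps its original relative position
    after the reranked ones (an assumed fallback).
    """
    head, tail = candidates[:top_n], candidates[top_n:]
    allowed, seen = set(head), set()
    reranked: List[str] = []
    for game_id in llm_order:
        if game_id in allowed and game_id not in seen:
            reranked.append(game_id)
            seen.add(game_id)
    reranked.extend(g for g in head if g not in seen)
    return reranked + tail
```

Guarding against missing fields and hallucinated game IDs keeps the sketch safe against malformed LLM output, since the returned list is always a permutation of the input candidates.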
