Solving the Content Gap in Roblox Game Recommendations: LLM-Based Profile Generation and Reranking

Chen Wang, Xiaokai Wei, Yexi Jiang, Frank Ong, Kevin Gao, Xiao Yu, Zheng Hui, Se-eun Yoon, Philip Yu, Michelle Gong

TL;DR

The paper tackles the content gap in Roblox game recommendations caused by noisy and sparse in-game text. It introduces a two-stage approach: (1) Game Profile Generation, which extracts raw in-game text and uses LLMs to generate structured, JSON-formatted game profiles detailing genre, objectives, mechanics, audience, language, and scale; and (2) an LLM-based reranker, which builds a user profile from recent play history and crafts a personalized reranking strategy to refine the top-30 recommendations. The method is evaluated on Roblox data using GPT-4o and compared against a baseline ID-based model and several content-based baselines, with NDCG Engagement as the primary metric. Results show that the personalized LLM-based reranker consistently improves ranking quality, especially at the top of the list, and increases user engagement. Ablation studies highlight the importance of personalization and the superiority of GPT-4o over smaller LLMs. The work demonstrates a scalable framework for content-driven recommendations in a dynamic, user-generated ecosystem and suggests practical extensions, such as fine-tuning on Roblox data and incorporating multimodal signals, to further enhance personalization and integrity detection in production settings.
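This summary names NDCG Engagement as the primary metric but does not define it. The sketch below shows one plausible reading: standard NDCG@k in which each game's gain is an engagement signal such as minutes played. The choice of play time as the gain, the function names `dcg` and `ndcg_at_k`, and the sample numbers are all assumptions made for illustration, not the paper's definition or data.

```python
import math
from typing import Sequence


def dcg(gains: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the first k positions."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))


def ndcg_at_k(gains_in_ranked_order: Sequence[float], k: int) -> float:
    """NDCG@k, where gains are listed in the order the games were ranked.

    Assumption: the gain is an engagement value (e.g. minutes played).
    """
    ideal = sorted(gains_in_ranked_order, reverse=True)
    ideal_dcg = dcg(ideal, k)
    if ideal_dcg == 0:
        return 0.0  # no engagement at all, so the metric is defined as 0
    return dcg(gains_in_ranked_order, k) / ideal_dcg


# Hypothetical minutes played for five recommended games, in list order.
original = [0.0, 5.0, 0.0, 30.0, 10.0]
# The same games after a reranker moves the most-played ones to the top.
reranked = [30.0, 10.0, 5.0, 0.0, 0.0]

print(ndcg_at_k(original, 3), ndcg_at_k(reranked, 3))
```

Because the discount grows with rank position, moving highly engaged games toward the top raises the score, which is why a metric of this shape is sensitive to improvements at the top of the list.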

Abstract

With the vast and dynamic user-generated content on Roblox, creating effective game recommendations requires a deep understanding of game content. Traditional recommendation models struggle with the inconsistent and sparse nature of game text features such as titles and descriptions. Recent advancements in large language models (LLMs) offer opportunities to enhance recommendation systems by analyzing in-game text data. This paper addresses two challenges: generating high-quality, structured text features for games without extensive human annotation, and validating these features to ensure they improve recommendation relevance. We propose an approach that extracts in-game text and uses LLMs to infer attributes such as genre and gameplay objectives from raw player interactions. Additionally, we introduce an LLM-based re-ranking mechanism to assess the effectiveness of the generated text features, enhancing personalization and user satisfaction. Beyond recommendations, our approach supports applications such as user engagement-based integrity detection, already deployed in production. This scalable framework demonstrates the potential of in-game text understanding to improve recommendation quality on Roblox and adapt recommendations to its unique, user-generated ecosystem.
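To make the idea of a structured game profile concrete, here is a minimal sketch of how a JSON-formatted profile returned by an LLM might be validated and loaded. The six field names follow the attributes listed in the TL;DR (genre, objectives, mechanics, audience, language, scale); the field types, the `GameProfile` class, the `parse_game_profile` helper, and the sample values are hypothetical and are not taken from the paper.

```python
import json
from dataclasses import dataclass, fields
from typing import List


@dataclass
class GameProfile:
    # Field names mirror the attributes named in the summary; types are assumed.
    genre: str
    objectives: List[str]
    mechanics: List[str]
    audience: str
    language: str
    scale: str


def parse_game_profile(raw: str) -> GameProfile:
    """Parse an LLM response that is expected to be a JSON object."""
    data = json.loads(raw)
    names = [f.name for f in fields(GameProfile)]
    missing = [name for name in names if name not in data]
    if missing:
        raise ValueError(f"profile is missing fields: {missing}")
    # Ignore any extra keys the model may have produced.
    return GameProfile(**{name: data[name] for name in names})


# Hypothetical LLM output for an obstacle-course game.
sample = """
{
  "genre": "obby",
  "objectives": ["reach the final checkpoint"],
  "mechanics": ["jumping", "timed platforms"],
  "audience": "casual players",
  "language": "English",
  "scale": "single-player"
}
"""

profile = parse_game_profile(sample)
print(profile.genre, profile.mechanics)
```

Validating the response before use is a design choice in this sketch: LLM output can omit fields or add unexpected ones, so rejecting incomplete profiles early keeps downstream ranking features consistent.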

Paper Structure

This paper contains 34 sections, 1 equation, 4 figures, 2 tables, 1 algorithm.

Figures (4)

  • Figure 1: Illustration of the various types of in-game text in Roblox games, including: (1) Gameplay Instructions, guiding player actions and objectives; (2) Game Background Introduction, providing insights into genre and narrative; (3) Button Text for Player Actions, indicating mechanics like movement and interaction; (4) Promotional and Robux Spending Text; and (5) Noisy or Irrelevant Text.
  • Figure 2: Average Game Time Spent at Each Rank Position. The LLM-based reranker prioritizes games that users are more likely to engage with, as shown by the higher average time spent on games ranked at top positions.
  • Figure 3: The first row shows the original ranking list, and the second row displays the reranked results. The LLM-based reranker successfully prioritizes high-relevance games, such as adventure and obby genres, moving them to top positions based on user preferences.
  • Figure 4: Example of Limited Game Information in Old Polish Railway Classic. The developer-provided title and description lack sufficient detail about gameplay, objectives, or genre, making it difficult for players to understand the game.