Personalized Image Generation for Recommendations Beyond Catalogs

Gabriel Patron, Zhiwei Xu, Ishan Kapnadak, Felipe Maia Polo

TL;DR

REBECA tackles personalization in diffusion-based image generation by learning a lightweight user-conditioned diffusion prior from implicit feedback and decoupling personalization from the image generator. The method samples personalized CLIP-space embeddings from $p_{\hat{\theta}}(I^e \mid U,R)$ and decodes them with a frozen backbone, enabling scalable, fine-tuning-free customization across many users. A rigorous evaluation framework, including a personalization verifier and permutation tests, demonstrates strong alignment with individual preferences on synthetic and real datasets, while maintaining high image quality. The work enables practical, large-scale personalized generation for recommender-style applications without the computational burden of per-user fine-tuning or LLM mediation.

Abstract

Personalization is central to human-AI interaction, yet current diffusion-based image generation systems remain largely insensitive to user diversity. Existing attempts to address this often rely on costly paired preference data or introduce latency through Large Language Models. In this work, we introduce REBECA (REcommendations BEyond CAtalogs), a lightweight and scalable framework for personalized image generation that learns directly from implicit feedback signals such as likes, ratings, and clicks. Instead of fine-tuning the underlying diffusion model, REBECA employs a two-stage process: training a conditional diffusion model to sample user- and rating-specific image embeddings, which are subsequently decoded into images using a pretrained diffusion backbone. This approach enables efficient, fine-tuning-free personalization across large user bases. We rigorously evaluate REBECA on real-world datasets, proposing a novel statistical personalization verifier and a permutation-based hypothesis test to assess preference alignment. Our results demonstrate that REBECA consistently produces high-fidelity images tailored to individual tastes, outperforming baselines while maintaining computational efficiency.
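The abstract mentions a permutation-based hypothesis test for preference alignment but does not spell out its details here. The following is a minimal sketch of a generic one-sided permutation test, under the assumption that a verifier assigns a scalar score to each generated image. The function name `permutation_test`, the `matched` and `mismatched` score arrays, and the difference-in-means statistic are illustrative choices, not the paper's exact procedure.

```python
import numpy as np


def permutation_test(matched, mismatched, n_permutations=10_000, seed=0):
    """One-sided permutation test for mean(matched) > mean(mismatched).

    matched:    hypothetical verifier scores of images generated for a user,
                scored against that same user's preferences.
    mismatched: hypothetical verifier scores of images generated for other
                users, scored against this user's preferences.
    """
    rng = np.random.default_rng(seed)
    matched = np.asarray(matched, dtype=float)
    mismatched = np.asarray(mismatched, dtype=float)

    # Observed test statistic: difference in mean scores.
    observed = matched.mean() - mismatched.mean()

    # Under the null hypothesis the group labels are exchangeable,
    # so we shuffle the pooled scores and recompute the statistic.
    pooled = np.concatenate([matched, mismatched])
    n = matched.size
    exceed = 0
    for _ in range(n_permutations):
        perm = rng.permutation(pooled)
        stat = perm[:n].mean() - perm[n:].mean()
        if stat >= observed:
            exceed += 1

    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (exceed + 1) / (n_permutations + 1)
    return observed, p_value


if __name__ == "__main__":
    # Synthetic scores for illustration only; these are not results from the paper.
    rng = np.random.default_rng(42)
    matched = rng.normal(0.8, 0.05, size=50)
    mismatched = rng.normal(0.5, 0.05, size=50)
    diff, p = permutation_test(matched, mismatched, n_permutations=2000)
    print(f"difference in means = {diff:.3f}, p-value = {p:.4f}")
```

A small p-value would suggest that images generated for a user score higher under that user's preferences than images generated for others, which is the kind of alignment the evaluation is meant to detect.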

Paper Structure

This paper contains 44 sections, 9 equations, 14 figures, 5 tables, 2 algorithms.

Figures (14)

  • Figure 1: REBECA learns a user-conditioned diffusion prior whose geometry spans the preference manifolds of all users. Conditioning selects a region of this shared embedding space, from which the prior samples diverse yet personalized embeddings.
  • Figure 2: REBECA overview. Training: Conditional diffusion prior trained to generate personalized image embeddings from user IDs and ratings. Inference: Generated embeddings are decoded into images via a pretrained image decoder model.
  • Figure 3: UMAP (McInnes et al., 2018) projection of VAE embeddings. Color and shape clusters are cleanly separated.
  • Figure 4: Per-user visualization in the controlled setting. Top: liked samples for each user. Bottom: images generated by REBECA using the frozen VAE decoder. REBECA captures each user's preference manifold while maintaining diversity.
  • Figure 5: Comparison of personalization performance across generation approaches. REBECA achieves the highest user scores, surpassing VLM-based baselines and LoRA fine-tuning variants.
  • ...and 9 more figures