Sustainable transparency in Recommender Systems: Bayesian Ranking of Images for Explainability

Jorge Paz-Ruza, Amparo Alonso-Betanzos, Berta Guijarro-Berdiñas, Brais Cancela, Carlos Eiras-Franco

TL;DR

BRIE reframes image-based explanations for recommender systems as a Bayesian ranking task, improving over ELVis and MF-ELVis by using Bayesian Pairwise Ranking (BPR) and an extended negative-sampling strategy. The model learns from user-uploaded images with a simple dot-product backbone and dropout, achieving higher ranking performance (notably MAUC) while drastically reducing model size and environmental impact. Evaluated on six real-world restaurant datasets, BRIE consistently outperforms baselines and demonstrates substantially lower training and inference emissions. The work advances explainable AI in RS by delivering more trustworthy, scalable, and green visual explanations that leverage existing user-generated content rather than synthetic or text-based signals.

Abstract

Recommender Systems have become crucial in the modern world, commonly guiding users towards relevant content or products, and having a large influence over the decisions of users and citizens. However, ensuring transparency and user trust in these systems remains a challenge; personalized explanations have emerged as a solution, offering justifications for recommendations. Among the existing approaches for generating personalized explanations, using existing visual content created by users is a promising option to maximize transparency and user trust. State-of-the-art models that follow this approach, despite leveraging highly optimized architectures, employ surrogate learning tasks that do not efficiently model the objective of ranking images as explanations for a given recommendation; this leads to a suboptimal training process with high computational costs that may not be reduced without affecting model performance. This work presents BRIE, a novel model that leverages Bayesian Pairwise Ranking to enhance the training process, allowing it to consistently outperform state-of-the-art models on six real-world datasets while reducing model size by up to 64 times and CO2 emissions by up to 75% in training and inference.

Paper Structure (26 sections, 5 equations, 9 figures, 3 tables)

This paper contains 26 sections, 5 equations, 9 figures, 3 tables.

Figures (9)

  • Figure 1: Overview of how user-uploaded images support visual explainability and personalization of recommendations. Users upload images to justify their experiences with an item, so the features of those images are good explanations for that user; a model can learn these features for each user (left half). The appearance of any incoming recommendation can then be personalized (here, with an image of the recommended restaurant) using the item's image that best reflects the learnt explanatory preferences of the user; the chosen image explanation will vary between users depending on their preferences (right half).
  • Figure 2: Network topologies of ELVis and MF-ELVis, the two existing models for visual explainability through user-uploaded images in Recommender Systems. Both models take as inputs the IDs of user $u$ and photograph $p$, and output the predicted authorship $\mathbf{\hat{R}}_{up}$, i.e. how good $p$ is as an explanation of a recommendation of its associated item for user $u$. During training, both models minimize the BCE loss $\mathcal{L}_\text{BCE}$ between the predicted authorship $\mathbf{\hat{R}}_{up}$ and the real authorship $\mathbf{R}_{up}$.
  • Figure 3: Negative sampling strategies of ELVis and MF-ELVis (left) versus BRIE (right), used for each original sample $(u, i, p)$ in the set of historical interactions $\mathcal{D}$.
  • Figure 4: Network topology of BRIE. During training, BRIE receives the IDs of user $u$, a photograph $p$ taken by $u$, and a random photograph $p_{neg}$ assumed not to be adequate for $u$, and must maximize the difference between their predicted authorship probabilities ($\mathbf{\hat{R}}_{up}-\mathbf{\hat{R}}_{up_{neg}}$) to minimize the BPR loss $\mathcal{L}_\textrm{BPR}$. During inference, BRIE receives the IDs of user $u$ and photograph $p$, and outputs the predicted authorship $\mathbf{\hat{R}}_{up}$, i.e. how good $p$ is as an explanation of a recommendation of its associated item for user $u$.
  • Figure 5: Effect of the amount of information per user on the median percentile (lower is better) of all models. Each subfigure displays the median percentile (y-axis) as a function of the minimum activity threshold (x-axis), i.e. the minimum required number of photographs by the user present in the training set. The count of available test cases for each minimum activity threshold is also provided as an aid in assessing the statistical significance of the results.
  • ...and 4 more figures
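
The contrast between the pointwise BCE objective (Figure 2) and the pairwise BPR objective (Figure 4) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `score`, `bce_loss`, and `bpr_loss` are hypothetical, the scorer is assumed to be a plain dot product between user and photograph embeddings, and the BPR loss is assumed to take the commonly used form $-\log \sigma(\hat{R}_{up} - \hat{R}_{up_{neg}})$, which this excerpt does not spell out.

```python
import numpy as np


def score(user_vecs, photo_vecs):
    """Dot-product compatibility of each (user, photo) embedding pair.

    Assumption: both arrays have shape (batch, dim).
    """
    return np.sum(user_vecs * photo_vecs, axis=-1)


def bce_loss(scores, labels):
    """Pointwise loss: mean binary cross-entropy on sigmoid(scores).

    Uses the identity BCE = softplus(s) - y * s for numerical stability.
    """
    return float(np.mean(np.logaddexp(0.0, scores) - labels * scores))


def bpr_loss(pos_scores, neg_scores):
    """Pairwise loss: mean of -log(sigmoid(pos - neg)).

    The loss shrinks as the user's own photo outscores the sampled negative.
    """
    diff = pos_scores - neg_scores
    return float(np.mean(np.logaddexp(0.0, -diff)))


# Toy batch: 4 users, 8-dimensional embeddings (all values are synthetic).
rng = np.random.default_rng(0)
users = rng.normal(size=(4, 8))
own_photos = users + 0.1 * rng.normal(size=(4, 8))  # photos close to user taste
random_photos = rng.normal(size=(4, 8))             # randomly sampled negatives

pairwise = bpr_loss(score(users, own_photos), score(users, random_photos))
```

Under these assumptions the pairwise loss depends only on the score difference between the two photographs, so it optimizes their relative ordering directly, whereas the pointwise loss asks each score to match an absolute 0/1 label.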