Interactive Garment Recommendation with User in the Loop

Federico Becattini; Xiaolin Chen; Andrea Puccia; Haokun Wen; Xuemeng Song; Liqiang Nie; Alberto Del Bimbo

TL;DR

The paper tackles interactive garment recommendation when no prior user data is available. It introduces a reinforcement-learning agent that maintains a state and uses a $Q(s,a)$ policy to select bottom garments for a given top, updating the state with feedback-weighted item features; training relies on a GP-BPR proxy to simulate user responses. Key contributions include the first multi-turn interactive fashion recommendation framework, integration of a GP-BPR proxy for training, and empirical validation on the IQON3000 dataset showing improved personalization and the critical role of exploration. The work demonstrates the feasibility of on-the-fly user profiling and iterative refinement using implicit feedback, with potential deployment in both store-based and online settings.
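The interaction loop summarized above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_q` is a hypothetical dot-product stand-in for the learned $Q(s,a)$ network, epsilon-greedy selection is an assumed exploration scheme, and the additive state update is only one plausible reading of "feedback-weighted item features". All function names are illustrative.

```python
import random

def select_action(q_values, epsilon, rng):
    """Epsilon-greedy choice over candidate bottoms (assumed exploration scheme)."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore: pick a random bottom
    # exploit: pick the bottom with the highest estimated value
    return max(range(len(q_values)), key=lambda a: q_values[a])

def update_state(state, item_feature, feedback):
    """Add the recommended item's feature, scaled by the feedback score (assumption)."""
    return [s + feedback * f for s, f in zip(state, item_feature)]

def toy_q(state, item_features):
    """Hypothetical stand-in for the learned Q(s, a): one dot product per item."""
    return [sum(s * f for s, f in zip(state, feat)) for feat in item_features]

def interact(top_feature, item_features, user_feedback, turns=3, epsilon=0.0, seed=0):
    """Run a multi-turn session; user_feedback maps an action index to a score in [0, 1]."""
    rng = random.Random(seed)
    state = list(top_feature)  # the state starts from the top garment's feature
    history = []
    for _ in range(turns):
        q_values = toy_q(state, item_features)
        action = select_action(q_values, epsilon, rng)
        feedback = user_feedback(action)  # a proxy model would supply this in training
        state = update_state(state, item_features[action], feedback)
        history.append((action, feedback))
    return state, history
```

In this toy setup, a purely greedy agent (`epsilon=0.0`) whose first pick receives zero feedback leaves the state unchanged and repeats the same recommendation at every turn, which gives some intuition for why exploration matters.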

Abstract

Recommending fashion items often leverages rich user profiles and makes targeted suggestions based on past history and previous purchases. In this paper, we work under the assumption that no prior knowledge is given about a user. We propose to build a user profile on the fly by integrating user reactions as we recommend complementary items to compose an outfit. We present a reinforcement learning agent capable of suggesting appropriate garments and ingesting user feedback so as to improve its recommendations and maximize user satisfaction. To train such a model, we resort to a proxy model that simulates user feedback in the training loop. We experiment on the IQON3000 fashion dataset and find that a reinforcement learning-based agent becomes capable of improving its recommendations by taking into account personal preferences. Furthermore, the task proves hard for non-reinforcement models, which cannot exploit exploration during training.

Paper Structure (16 sections, 5 equations, 10 figures, 3 tables, 1 algorithm)

Introduction
Related work
Fashion Recommendation
Feedback-guided Fashion Search
Reinforcement Learning
Problem Statement and Application Context
Interactive Garment Recommendation Model
Training the system
Preprocessing
User Proxy Model
Training the Interaction
Experimental setting
Dataset and metrics
Results
Qualitative Results
...and 1 more section

Figures (10)

Figure 1: Compatible outfit recommendation through iterative feedback. The user presents a top garment to the system, which generates a recommendation, namely a bottom garment to be paired with the top. The user then provides a feedback rating that reflects personal preferences about the generated outfit. The recommendation system integrates this feedback and proposes a new bottom to the user, attempting to maximize the feedback. The system does not have any prior knowledge about the user.
Figure 2: Overview of the proposed approach. The system is initialized with the visual feature of the top garment presented by the user. The reinforcement learning agent, based on its policy, proposes an initial recommendation. The user then provides a feedback score, which is used to update the internal state of the agent. Based on this updated state, the agent proposes a new bottom, attempting to maximize the feedback score at every iteration. The Q-function $Q$ selects the best action from the action space $A$. The function is approximated with a three-layer feedforward neural network, which projects the internal state of the agent into a distribution over all possible actions. The best action (i.e. the most compatible bottom) is the most likely one according to this estimated distribution.
Figure 3: We used K-means to group similar bottom garments before performing recommendations. These are two examples of clusters obtained by clustering images in the IQON3000 dataset. To perform a recommendation, the model recommends a whole cluster by suggesting the garment that is closest to the centroid of the cluster.
Figure 4: Unnormalized score distribution of GP-BPR on the validation set of IQON3000. We perform a min-max normalization to bound scores in [0, 1].
Figure 5: Average GP-BPR scores obtained by the recommendation system at different iterations. At first, the system suggests a garment without knowing any prior information about the user. Therefore, such a recommendation cannot take into account personal preferences. As the user starts providing feedback, the model is able to suggest on average a better item to complement the given top for the given user.
...and 5 more figures
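
The operations named in the captions of Figures 2 to 4 can be illustrated with a short, dependency-free Python sketch. It is a hedged illustration rather than the paper's code: the function names, the ReLU activations between layers, and the use of squared Euclidean distance are assumptions, and the network weights in any real system would be learned rather than supplied by hand.

```python
def min_max_normalize(scores):
    """Rescale raw proxy scores to [0, 1] (Figure 4); constant input maps to 0.0."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

def cluster_representative(features, centroid):
    """Index of the garment closest to the centroid (Figure 3).

    Squared Euclidean distance is an assumed metric.
    """
    def sq_dist(v):
        return sum((a - b) ** 2 for a, b in zip(v, centroid))
    return min(range(len(features)), key=lambda i: sq_dist(features[i]))

def linear(x, weights, bias):
    """Dense layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]

def q_scores(state, layers):
    """Three-layer feedforward pass (Figure 2).

    layers is a list of (weights, bias) pairs; ReLU between layers is an assumption.
    """
    h = state
    for i, (weights, bias) in enumerate(layers):
        h = linear(h, weights, bias)
        if i < len(layers) - 1:
            h = [max(0.0, v) for v in h]  # ReLU on hidden layers only
    return h

def best_action(scores):
    """Greedy selection: index of the highest-scoring action (cluster)."""
    return max(range(len(scores)), key=lambda a: scores[a])
```

A recommendation step would then chain these pieces: `best_action(q_scores(state, layers))` picks a cluster, and `cluster_representative` maps that cluster to the concrete garment shown to the user.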
