Privacy-Preserving Dynamic Assortment Selection

Young Hyun Cho, Will Wei Sun

TL;DR

This paper presents a framework for privacy-preserving dynamic assortment selection under the multinomial logit (MNL) bandits model. It uses a perturbed upper confidence bound method that adds calibrated noise to user utility estimates, balancing exploration and exploitation while protecting user privacy.

Abstract

With the growing demand for personalized assortment recommendations, concerns over data privacy have intensified, highlighting the urgent need for effective privacy-preserving strategies. This paper presents a novel framework for privacy-preserving dynamic assortment selection using the multinomial logit (MNL) bandits model. Our approach employs a perturbed upper confidence bound method that integrates calibrated noise into user utility estimates to balance exploration and exploitation while ensuring robust privacy protection. We rigorously prove that our policy satisfies Joint Differential Privacy (JDP), which suits dynamic environments better than traditional differential privacy and effectively mitigates inference attack risks. This analysis builds on a novel objective perturbation technique tailored for MNL bandits, which is also of independent interest. Theoretically, we derive a near-optimal regret bound of $\tilde{O}(\sqrt{T})$ for our policy and explicitly quantify how privacy protection impacts regret. Through extensive simulations and an application to the Expedia hotel dataset, we demonstrate substantial performance improvements over the benchmark method.
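To make the ingredients named in the abstract concrete, the following is a minimal illustrative sketch, not the paper's algorithm. It assumes the standard MNL choice model with a no-purchase option of utility 0, and it forms "perturbed optimistic" utilities by adding an exploration bonus plus Gaussian noise to each estimate. All names (`mnl_choice_probs`, `perturbed_optimistic_utilities`, `best_assortment`), the brute-force search, and the numeric values are hypothetical, and `noise_scale` is an arbitrary placeholder that is not calibrated to any privacy guarantee.

```python
import itertools
import math
import random


def mnl_choice_probs(utilities):
    """MNL purchase probabilities for the offered items.

    Assumes an outside (no-purchase) option with utility 0, so the
    returned probabilities sum to less than 1.
    """
    weights = [math.exp(u) for u in utilities]
    denom = 1.0 + sum(weights)
    return [w / denom for w in weights]


def expected_revenue(assortment, utilities, revenues):
    """Expected revenue of an assortment (a tuple of item indices)."""
    probs = mnl_choice_probs([utilities[i] for i in assortment])
    return sum(p * revenues[i] for p, i in zip(probs, assortment))


def perturbed_optimistic_utilities(estimates, bonuses, noise_scale, rng):
    """Estimate + exploration bonus + Gaussian noise, per item.

    Illustrative only: the noise scale here is a free parameter and is
    NOT derived from a privacy budget.
    """
    return [u + b + rng.gauss(0.0, noise_scale)
            for u, b in zip(estimates, bonuses)]


def best_assortment(utilities, revenues, max_size):
    """Brute-force search over all assortments of size 1..max_size."""
    items = range(len(utilities))
    candidates = (c for k in range(1, max_size + 1)
                  for c in itertools.combinations(items, k))
    return max(candidates,
               key=lambda s: expected_revenue(s, utilities, revenues))


# Hypothetical inputs for four items.
rng = random.Random(0)
estimates = [0.2, -0.1, 0.5, 0.0]
bonuses = [0.1, 0.3, 0.05, 0.2]
revenues = [1.0, 0.8, 0.6, 0.9]

optimistic = perturbed_optimistic_utilities(estimates, bonuses, 0.1, rng)
chosen = best_assortment(optimistic, revenues, max_size=2)
print("assortment offered this round:", chosen)
```

The brute-force search is only practical for a handful of items; it is used here to keep the sketch short and easy to check.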

Paper Structure

This paper contains 39 sections, 105 equations, 8 figures, 1 table, 5 algorithms.

Figures (8)

  • Figure 1: Flow of Privacy-Preserving Dynamic Assortment Selection
  • Figure 2: Comparison of DP and JDP: DP ensures that the entire sequence of assortments remains similar across neighboring datasets, while JDP requires similarity for assortments excluding the target user's, allowing for more flexible personalized recommendation.
  • Figure 3: Overview of the mechanism design.
  • Figure 4: Comparison of cumulative regret across different levels of the privacy parameter $\rho \in \{0.1, 0.5, 1\}$, with each line representing a different allocation of $\rho$ between PrivateMLE and PrivateCov. For instance, "MLE Allocation: 90%" indicates that 90% of $\rho$ is allocated to PrivateMLE, while the remaining 10% is allocated to PrivateCov.
  • Figure 5: Effect of privacy budget allocation on cumulative regret across different dimensions.
  • ...and 3 more figures
