Private Optimal Inventory Policy Learning for Feature-based Newsvendor with Unknown Demand

Tuoyi Zhao; Wen-xin Zhou; Lan Wang

TL;DR

The paper tackles privacy-preserving learning of an optimal feature-based newsvendor policy when demand is unknown. It introduces an $f$-differential privacy framework and a computationally efficient algorithm that combines convolution smoothing with clipped Gaussian gradient updates to estimate the policy $q(\boldsymbol{x})=\boldsymbol{x}^\top\boldsymbol{\beta}$ while protecting individual data. Finite-sample high-probability bounds on estimation error and regret are derived, showing that privacy-augmented learning can achieve near-optimal performance as data size grows, with excess risk scaling as $\log(n)\big((p+\log n)/(\mu n)\big)^2$ plus a term linear in $p/n$. Numerical experiments on synthetic and real data corroborate the privacy-utility trade-off, demonstrating desirable privacy with only marginal cost increases relative to non-private benchmarks. Overall, the work advances private data-driven inventory optimization by exploiting the newsvendor structure, smoothing techniques, and $f$-DP composition to yield provable privacy guarantees and tight statistical performance guarantees.

Abstract

The data-driven newsvendor problem with features has recently emerged as a significant area of research, driven by the proliferation of data across various sectors such as retail, supply chains, e-commerce, and healthcare. Given the sensitive nature of customer or organizational data often used in feature-based analysis, it is crucial to ensure individual privacy to uphold trust and confidence. Despite its importance, privacy preservation in the context of inventory planning remains unexplored. A key challenge is the nonsmoothness of the newsvendor loss function, which sets it apart from existing work on privacy-preserving algorithms in other settings. This paper introduces a novel approach to estimate a privacy-preserving optimal inventory policy within the $f$-differential privacy framework, an extension of the classical $(\varepsilon, \delta)$-differential privacy with several appealing properties. We develop a clipped noisy gradient descent algorithm based on convolution smoothing for optimal inventory estimation to simultaneously address three main challenges: (1) unknown demand distribution and nonsmooth loss function; (2) provable privacy guarantees for individual-level data; and (3) desirable statistical precision. We derive finite-sample high-probability bounds for optimal policy parameter estimation and regret analysis. By leveraging the structure of the newsvendor problem, we attain a faster excess population risk bound than the one obtained from an indiscriminate application of existing results for general nonsmooth convex losses. Our bound aligns with that for strongly convex and smooth loss functions. Our numerical experiments demonstrate that the proposed method can achieve desirable privacy protection with a marginal increase in cost.
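To make the phrase "clipped noisy gradient descent based on convolution smoothing" concrete, here is a minimal sketch under stated assumptions. It is not the paper's algorithm: the Gaussian kernel, the per-sample gradient clipping rule, the step size, and the names `private_newsvendor_gd`, `clip`, and `lr` are all illustrative choices. The noise scale relies on two standard facts about Gaussian differential privacy: a Gaussian mechanism with noise standard deviation (sensitivity / $\mu$) is $\mu$-GDP, and $T$ mechanisms that are each $(\mu/\sqrt{T})$-GDP compose to $\mu$-GDP. The quantile level `tau` plays the role of the newsvendor critical ratio, so the fitted $\boldsymbol{x}^\top\boldsymbol{\beta}$ estimates the `tau`-quantile of demand.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)


def normal_cdf(z):
    """Standard normal CDF applied elementwise."""
    return 0.5 * (1.0 + _erf(z / math.sqrt(2.0)))


def private_newsvendor_gd(X, y, tau, mu, bandwidth, clip, n_iter, lr, seed=0):
    """Noisy clipped gradient descent on the Gaussian-smoothed check loss.

    Illustrative sketch only; the clipping rule and noise calibration are
    assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    beta = np.zeros(p)
    # Replace-one L2 sensitivity of the averaged clipped gradient is 2*clip/n.
    # Each step is (mu / sqrt(n_iter))-GDP, so n_iter steps compose to mu-GDP.
    sigma = (2.0 * clip / n) * math.sqrt(n_iter) / mu
    for _ in range(n_iter):
        resid = y - X @ beta
        # Gradient of the smoothed loss at sample i: (Phi(-resid_i / h) - tau) * x_i
        weights = normal_cdf(-resid / bandwidth) - tau
        grads = weights[:, None] * X
        # Rescale any per-sample gradient whose norm exceeds `clip`.
        norms = np.linalg.norm(grads, axis=1)
        scale = np.minimum(1.0, clip / np.maximum(norms, 1e-12))
        avg_grad = (grads * scale[:, None]).mean(axis=0)
        beta = beta - lr * (avg_grad + sigma * rng.standard_normal(p))
    return beta


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n = 5000
    X = np.column_stack([np.ones(n), rng.uniform(-1.0, 1.0, n)])
    y = X @ np.array([2.0, 1.0]) + rng.standard_normal(n)
    print(private_newsvendor_gd(X, y, tau=0.8, mu=1.0, bandwidth=0.3,
                                clip=2.0, n_iter=200, lr=1.0))
```

A smaller `mu` means stronger privacy and a larger `sigma`, which is the privacy-utility trade-off the abstract refers to. Smoothing also shifts the target slightly away from the exact quantile, so the bandwidth should shrink as the sample size grows.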

Paper Structure (40 sections, 19 theorems, 190 equations, 9 figures, 2 tables, 1 algorithm)

Introduction
Contributions
Provable privacy-protection guarantee in the $f$-differential privacy framework.
A computationally efficient algorithm to estimate the feature-based optimal inventory policy with unknown demand function.
Finite-sample performance bounds and excess risk analysis.
Notation and Organization
Related Review
Problem Formulation
Feature-based Newsvendor Problem
Convolution Smoothing for Empirical Risk Minimization
Assumptions
A Privacy-Preserving Algorithm for Feature-based Newsvendor Problem
Preliminaries on $f$-differential Privacy
Proposed Differentially Private Algorithm
Privacy-protection Guarantee of the Proposed Algorithm
...and 25 more sections

Key Result

Lemma 1

Let $K$ be a symmetric, non-negative kernel function with $\kappa_1 := \int_{-\infty}^\infty |u| K(u) {\rm d}u <\infty$. For any $\varpi >0$, it holds uniformly over $u\in \mathbb{R}$ that $\rho_\tau(u) \leq (\rho_\tau *K_\varpi) (u) \leq \rho_\tau(u) + \kappa_1 \varpi /2$.
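As a sanity check on Lemma 1, the bound can be verified numerically for one specific kernel. The sketch below assumes the check loss $\rho_\tau(u) = u(\tau - \mathbb{1}\{u<0\})$, the scaled kernel $K_\varpi(u) = K(u/\varpi)/\varpi$, and a standard Gaussian $K$, for which $\kappa_1 = \sqrt{2/\pi}$. The closed form follows from writing $\rho_\tau(v) = (\tau - 1/2)v + |v|/2$ and using the folded-normal mean of $|u + \varpi Z|$. The function names are hypothetical.

```python
import math

KAPPA_1 = math.sqrt(2.0 / math.pi)  # integral of |u| K(u) du for the Gaussian kernel


def check_loss(u, tau):
    """Check (quantile) loss rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def std_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def smoothed_check_loss(u, tau, bandwidth):
    """Closed form of (rho_tau * K_bandwidth)(u) for the standard Gaussian kernel."""
    z = u / bandwidth
    # E|u + bandwidth * Z| for Z ~ N(0, 1) (mean of a folded normal).
    mean_abs = (bandwidth * KAPPA_1 * math.exp(-0.5 * z * z)
                + u * (2.0 * std_normal_cdf(z) - 1.0))
    return (tau - 0.5) * u + 0.5 * mean_abs


def smoothing_gap(u, tau, bandwidth):
    """Smoothed loss minus original loss; Lemma 1 bounds it by [0, kappa_1 * bandwidth / 2]."""
    return smoothed_check_loss(u, tau, bandwidth) - check_loss(u, tau)


if __name__ == "__main__":
    grid = [i / 100.0 for i in range(-500, 501)]
    for bandwidth in (0.1, 0.5, 2.0):
        worst = max(smoothing_gap(u, 0.8, bandwidth) for u in grid)
        print(bandwidth, worst, KAPPA_1 * bandwidth / 2.0)
```

For the Gaussian kernel the gap is largest at $u = 0$, where it equals $\kappa_1 \varpi / 2$, so the upper bound in the lemma is attained in this case.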

Figures (9)

Figure 1: Illustration of privacy protection
Figure 2: Illustration of two smoothed check/quantile loss functions
Figure 3: Trade-off functions for GDP with privacy parameter $\mu = 0.3$, $0.5$, $0.9$, $3$, and $6$.
Figure 4: Estimation errors and regrets of different estimators when $\varepsilon\sim \mathcal{N}(0,1)$
Figure 5: Estimation errors and regrets of different estimators when $\varepsilon\sim t_3$
...and 4 more figures
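For readers who want to reproduce curves like those described in Figure 3, the trade-off function of $\mu$-GDP has the standard closed form $G_\mu(\alpha) = \Phi(\Phi^{-1}(1-\alpha) - \mu)$, where $\alpha$ is the type I error of a test that tries to distinguish two neighboring datasets and $G_\mu(\alpha)$ is the smallest achievable type II error. The sketch below evaluates it with the Python standard library; the function name `gdp_tradeoff` is hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def gdp_tradeoff(alpha, mu):
    """Trade-off function G_mu(alpha) = Phi(Phi^{-1}(1 - alpha) - mu) of mu-GDP."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    # inv_cdf is only defined on the open interval (0, 1); handle endpoints directly.
    if alpha == 0.0:
        return 1.0
    if alpha == 1.0:
        return 0.0
    return _STD_NORMAL.cdf(_STD_NORMAL.inv_cdf(1.0 - alpha) - mu)


if __name__ == "__main__":
    for mu in (0.3, 0.5, 0.9, 3.0, 6.0):
        print(mu, [round(gdp_tradeoff(a / 10.0, mu), 4) for a in range(11)])
```

With $\mu = 0$ the curve is the line $1 - \alpha$ (perfect privacy), and larger $\mu$ pushes the curve toward zero, meaning the two datasets become easier to tell apart.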

Theorems & Definitions (29)

Lemma 1
Remark 1
Remark 2
Definition 1
Definition 2
Definition 3
Definition 4
Proposition 1
Definition 5
Proposition 2
...and 19 more
