Transfer Faster, Price Smarter: Minimax Dynamic Pricing under Cross-Market Preference Shift

Yi Zhang, Elynn Chen, Yujun Yan

TL;DR

This work tackles cross-market contextual dynamic pricing when the mean utilities of the source markets differ from those of the target market by a structured shift. It introduces CM-TDP, a unified framework that operates in Offline-to-Online and Online-to-Online modes, supports both linear and RKHS demand models, and achieves minimax regret guarantees. Its bias-corrected aggregation mechanism enables effective transfer from multiple sources, yielding substantial empirical gains (e.g., up to 50% reduction in cumulative regret) and faster learning in data-scarce targets. The approach bridges transfer learning, robust aggregation, and revenue optimization, offering a practical path to pricing systems that absorb information from other markets quickly while keeping price decisions focused on revenue.

Abstract

We study contextual dynamic pricing when a target market can leverage $K$ auxiliary markets -- offline logs or concurrent streams -- whose mean utilities differ by a structured preference shift. We propose Cross-Market Transfer Dynamic Pricing (CM-TDP), the first algorithm that provably handles such model-shift transfer and delivers minimax-optimal regret for both linear and non-parametric utility models. For linear utilities of dimension $d$, where the difference between source- and target-task coefficients is $s_{0}$-sparse, CM-TDP attains regret $\tilde{O}((dK^{-1}+s_{0})\log T)$. For nonlinear demand residing in a reproducing kernel Hilbert space with effective dimension $\alpha$, complexity $\beta$ and task-similarity parameter $H$, the regret becomes $\tilde{O}(K^{-2\alpha\beta/(2\alpha\beta+1)}T^{1/(2\alpha\beta+1)} + H^{2/(2\alpha+1)}T^{1/(2\alpha+1)})$, matching information-theoretic lower bounds up to logarithmic factors. The RKHS bound is the first of its kind for transfer pricing and is of independent interest. Extensive simulations show up to 50% lower cumulative regret and 5 times faster learning relative to single-market pricing baselines. By bridging transfer learning, robust aggregation, and revenue optimization, CM-TDP moves toward pricing systems that transfer faster, price smarter.
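To make the two rates in the abstract easier to compare, the following is a minimal sketch that evaluates their orders of magnitude. It is not part of CM-TDP: the function names `linear_rate` and `rkhs_rate` are hypothetical, all constants and the logarithmic factors hidden by $\tilde{O}$ are dropped, and the example values of $d$, $s_0$, $T$ are arbitrary assumptions chosen only for illustration.

```python
import math


def linear_rate(d, K, s0, T):
    """Order of the linear-utility bound (d/K + s0) * log T.

    Constants and the extra log factors hidden by O-tilde are dropped,
    so only ratios between settings are meaningful.
    """
    return (d / K + s0) * math.log(T)


def rkhs_rate(K, T, H, alpha, beta):
    """Order of the RKHS bound, again without constants or log factors.

    The first term shrinks as the number of source markets K grows;
    the second term depends only on the task-similarity parameter H.
    """
    ab = 2 * alpha * beta
    pooled = K ** (-ab / (ab + 1)) * T ** (1 / (ab + 1))
    shift = H ** (2 / (2 * alpha + 1)) * T ** (1 / (2 * alpha + 1))
    return pooled + shift


if __name__ == "__main__":
    # Illustrative values only (assumed, not taken from the paper).
    d, s0, T = 20, 2, 1000
    for K in (1, 5, 10):
        print(K, round(linear_rate(d, K, s0, T), 2))
```

In this reading, adding source markets reduces only the $d/K$ part of the linear rate, so the sparsity level $s_0$ of the preference shift sets the floor that transfer cannot remove.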

Paper Structure

This paper contains 43 sections, 29 theorems, 150 equations, 6 figures, 2 tables, 4 algorithms.

Key Result

Theorem 5

Consider the linear utility model (eqn:17-linear-utility) and suppose that Assumptions assump:regu-phi (revenue regularity), assump:Homo-Cov (covariate property), assump:param (parameter space), and assump:simi_sparsity-linear (market similarity) hold. Then the cumulative regret of Algorithm algo:general-o2o admits an upper bound (the abstract states the linear-utility rate as $\tilde{O}((dK^{-1}+s_{0})\log T)$), where $K$ and $T$ denote the number of source markets and the time horizon, respectively.

Figures (6)

  • Figure 1: Cumulative regret across experimental conditions in $\mathrm{O2O}_{\mathrm{on}}$ with linear utility model.
  • Figure 2: Cumulative regret across experimental conditions in $\mathrm{O2O}_{\mathrm{on}}$ with RKHS utility model.
  • Figure 3: Cumulative regret across experimental conditions in $\mathrm{O2O}_{\mathrm{off}}$ with linear utility model.
  • Figure 4: Cumulative regret across experimental conditions in $\mathrm{O2O}_{\mathrm{off}}$ transfer with RKHS utility model.
  • Figure 5: Cumulative regret across experimental conditions in $\mathrm{O2O}_{\mathrm{on}}$ with linear utility model.
  • ...and 1 more figure

Theorems & Definitions (38)

  • Remark 1
  • Theorem 5: Regret Upper Bound for $\mathrm{O2O}_{\mathrm{on}}$ under Linear Utility
  • Theorem 6: Regret Lower Bound under Linear Utility
  • Remark 2
  • Remark 3
  • Theorem 10: Regret Upper Bound for $\mathrm{O2O}_{\mathrm{on}}$ under RKHS Utility
  • Theorem 11: Regret Lower Bound under RKHS Utility
  • Theorem 12: Regret Upper Bound for $\mathrm{O2O}_{\mathrm{off}}$ under Linear Utility
  • Theorem 13: Regret Upper Bound for $\mathrm{O2O}_{\mathrm{off}}$ under RKHS Utility
  • Lemma 14: Upper bound for price
  • ...and 28 more