Spatial Supply Repositioning with Censored Demand Data
Hansheng Jiang, Chunlin Sun, Zuo-Jun Max Shen
TL;DR
This work addresses spatial supply repositioning in a closed, multi-location vehicle-sharing network under censored demand. It develops a continuous-state average-cost MDP, proves that stationary optimal policies exist, and shows that a base-stock repositioning policy is asymptotically optimal in two regimes: large fleets and dominant lost-sales costs. Offline, it provides exact MILP/LP reformulations to compute the best base-stock levels from historical data. Online, it introduces the Surrogate Optimization and Adaptive Repositioning (SOAR) algorithm with regret $O\left(n^{2.5}\sqrt{T}\right)$, which matches the regret lower bound in $T$ with polynomial dependence on $n$. The results highlight the practical value of simple, interpretable policies and establish learning guarantees under censoring and network complexity, with extensions to heterogeneous durations and multi-subperiod settings.
Abstract
We consider a network inventory system motivated by one-way, on-demand vehicle sharing services. Under uncertain and correlated network demand, the service operator periodically repositions vehicles to match a fixed supply with spatial customer demand while minimizing costs. Finding an optimal repositioning policy in such a general inventory network is analytically and computationally challenging. We introduce a base-stock repositioning policy as a multidimensional generalization of the classical inventory rule to $n$ locations, and we establish its asymptotic optimality under two practically relevant regimes. We present exact reformulations that enable efficient computation of the best base-stock policy in an offline setting with historical data. In the online setting, we illustrate the challenges of learning with censored data in networked systems through a regret lower bound analysis and by demonstrating the suboptimality of alternative algorithmic approaches. We propose a Surrogate Optimization and Adaptive Repositioning algorithm and prove that it attains an optimal regret of $O(n^{2.5} \sqrt{T})$, which matches the regret lower bound in $T$ with polynomial dependence on $n$. Our work highlights the critical role of inventory repositioning in the viability of shared mobility businesses and illuminates the inherent challenges posed by data and network complexity. Our results demonstrate that simple, interpretable policies, such as the state-independent base-stock policies we analyze, can provide significant practical value and achieve near-optimal performance.
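To make the two central ideas concrete, the following minimal sketch simulates one period of a closed network under a state-independent base-stock policy with censored demand. It is an illustration under stated assumptions, not the authors' implementation: the function names `reposition` and `serve_demand`, the routing matrix `P` (the fraction of rentals from location $i$ that end at location $j$), and the use of units moved as a stand-in for repositioning cost are all hypothetical choices made here.

```python
import numpy as np

def reposition(x, S):
    """Move the fleet from inventory x to base-stock levels S.

    Assumption: S is state-independent and sums to the fixed fleet size.
    Returns the post-repositioning inventory and the number of units moved
    (a simple proxy for repositioning cost, not the paper's cost model).
    """
    x = np.asarray(x, dtype=float)
    S = np.asarray(S, dtype=float)
    if not np.isclose(x.sum(), S.sum()):
        raise ValueError("base-stock levels must sum to the fleet size")
    units_moved = float(np.maximum(S - x, 0.0).sum())
    return S.copy(), units_moved

def serve_demand(y, demand, P):
    """Serve one period of demand from inventory y.

    The operator only observes sales = min(inventory, demand); the
    lost part of the demand is never seen, which is the censoring.
    Rented vehicles return according to the (hypothetical) routing
    matrix P, whose rows sum to 1, so the fleet size is conserved.
    """
    demand = np.asarray(demand, dtype=float)
    sales = np.minimum(y, demand)        # censored observation
    lost = demand - sales                # unobserved by the operator
    next_x = (y - sales) + sales @ P     # idle vehicles plus returns
    return sales, lost, next_x

# One period with n = 3 locations and a fleet of 10 vehicles.
x = [5, 1, 4]                            # inventory before repositioning
S = [4, 3, 3]                            # illustrative base-stock levels
P = np.full((3, 3), 1.0 / 3.0)           # uniform routing, for illustration
y, units_moved = reposition(x, S)
sales, lost, next_x = serve_demand(y, [6, 2, 1], P)
```

In this run the first location faces more demand than it holds, so the operator records only the vehicles actually rented there and cannot see how much demand was lost. That gap is what makes learning good base-stock levels from sales data difficult.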
