Efficient Data-Driven Optimization with Noisy Data
Bart P. G. Van Parys
TL;DR
This work addresses data-driven optimization when observations are corrupted by a known noise mechanism. It extends efficient robust prescriptions to the noisy regime by introducing a δ-smoothed large deviations rate, linking the resulting ambiguity sets to entropic optimal transport, and providing out-of-sample guarantees. A Strassen-based representation yields tractable finite formulations, complemented by a dual reformulation in which the number of variables grows only linearly in the noisy data size. Under identifiability assumptions and suitably decaying robustness and smoothing parameters, the approach achieves consistency with finite-sample guarantees, offering practically implementable, statistically sound prescriptions for decision-making with noisy data. The framework thus provides a principled, efficient alternative to naive sample-average methods in settings with measurement noise and nontrivial observation operators.
Abstract
Classical Kullback-Leibler or entropic distances are known to enjoy certain desirable statistical properties in the context of decision-making with noiseless data. However, in most practical situations the data available to a decision maker is subject to a certain amount of measurement noise. We therefore study data-driven prescription problems in which the data is corrupted by a known noise source. We derive efficient data-driven formulations in this noisy regime and indicate that they admit an entropic optimal transport interpretation. Finally, we show that these efficient robust formulations are tractable in several interesting settings by exploiting a classical representation result by Strassen.
