Data-Driven Solution Portfolios
Marina Drygala, Silvio Lattanzi, Andreas Maggiori, Miltiadis Stouras, Ola Svensson, Sergei Vassilvitskii
TL;DR
This work introduces a data-driven, offline framework for portfolio optimization under stochastic value functions, focusing on matroid constraints. The objective is to maximize the expected best value among $k$ solutions chosen offline, where value functions are additive and drawn from a distribution $\mathcal{D}$; independence and anti-concentration are emphasized as the means of ensuring diversification. The authors develop a polynomial-time algorithm that achieves a $\Theta(1)$-approximation to the optimum when $\mathcal{D}$ is a product distribution, first for uniform matroids and then for all matroids, using a column decomposition and a contention-resolution-scheme (CRS) framework together with a conditioning technique to handle exceptional events. The approach combines a prefix-based filtering strategy with two portfolio constructions (uniform and column) and a CRS-based analysis to address dependencies, yielding data-driven guarantees for a broad class of combinatorial optimization problems under uncertainty. Overall, the paper provides a principled method to precompute diversified portfolios that perform well across likely scenarios in stochastic combinatorial settings, enabling faster decision-making in applications like routing, scheduling, and competitive domains.
Abstract
In this paper, we consider a new problem of portfolio optimization using stochastic information. In a setting where there is some uncertainty, we ask how to best select $k$ potential solutions, with the goal of optimizing the value of the best solution. More formally, given a combinatorial problem $Π$, a set of value functions $V$ over the solutions of $Π$, and a distribution $D$ over $V$, our goal is to select $k$ solutions of $Π$ that maximize or minimize the expected value of the best of those solutions. For a simple example, consider the classic knapsack problem: given a universe of elements, each with unit weight and a positive value, the task is to select $r$ elements maximizing the total value. Now suppose that each element's value comes from a (known) distribution. How should we select $k$ different solutions so that one of them is likely to yield a high value? In this work, we tackle this basic problem and generalize it to the setting where the underlying set system forms a matroid. On the technical side, it is clear that the candidate solutions we select must be diverse and anti-correlated; however, it is not clear how to do so efficiently. Our main result is a polynomial-time algorithm that constructs a portfolio whose value is within a constant factor of the optimal.
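To make the objective concrete, the following is a minimal Monte Carlo sketch of the quantity being maximized, the expected value of the best solution in a portfolio, on a toy instance of the unit-weight knapsack example. It is an illustration only and not the paper's algorithm. The function names `portfolio_value` and `sample_values`, the four-element universe, and the Bernoulli(1/2) value distribution are all assumptions made for this example.

```python
import random

def portfolio_value(portfolio, sample_values, trials=10000, seed=0):
    """Estimate E[max over solutions S in the portfolio of sum_{e in S} v(e)].

    `portfolio` is a list of solutions (sets of element indices) and
    `sample_values(rng)` draws one additive value function as a list.
    Both names are hypothetical, chosen for this sketch.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        v = sample_values(rng)
        # Value of the best solution under this realization.
        total += max(sum(v[e] for e in sol) for sol in portfolio)
    return total / trials

def sample_values(rng):
    # Assumed toy product distribution: 4 elements, each worth 1 with
    # probability 1/2 and 0 otherwise, independently.
    return [rng.choice((0.0, 1.0)) for _ in range(4)]

# Two portfolios of k = 2 solutions, each solution picking r = 2 elements.
same = [{0, 1}, {0, 1}]       # the same solution twice
disjoint = [{0, 1}, {2, 3}]   # two solutions on disjoint elements

same_value = portfolio_value(same, sample_values)
disjoint_value = portfolio_value(disjoint, sample_values)
print(same_value, disjoint_value)
```

In this toy instance the exact expectations are 1 for the repeated solution and 22/16 = 1.375 for the disjoint pair, so the diversified portfolio does better even though every individual solution has the same expected value.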
