Efficient and Near-Optimal Online Portfolio Selection
Rémi Jézéquel, Dmitrii M. Ostrovskii, Pierre Gaillard
TL;DR
This work addresses online portfolio selection with a computationally feasible algorithm that matches the optimal regret of Cover's Universal Portfolios up to a constant factor and the replacement of $\log(T)$ with $\log(T+d)$. The proposed VB-FTRL approach adds a time-varying volumetric barrier to the regularizer, producing negative terms in the regret analysis that offset variance-like terms and yield a regret bound of $O(d\log(T+d))$. The method connects to Cover's Universal Portfolios via a variational and Gaussian-approximation viewpoint, linking online learning with cutting-plane and interior-point techniques, and admits an efficient Newton-based implementation with per-round cost $\tilde{O}(d^2(T+d))$. The result is an affine-invariant algorithm that is far cheaper than the $\tilde{O}(d^4(T+d)^{14})$ per-round cost of the fastest known implementation of Universal Portfolios, while retaining near-optimal worst-case regret against the best constant rebalanced portfolio (CRP). The work thus connects classical barrier methods from optimization with online learning guarantees in adversarial settings.
Abstract
In the problem of online portfolio selection as formulated by Cover (1991), the trader repeatedly distributes her capital over $ d $ assets in each of $ T > 1 $ rounds, with the goal of maximizing the total return. Cover proposed an algorithm, termed Universal Portfolios, that performs nearly as well as the best (in hindsight) static assignment of a portfolio, with an $ O(d\log(T)) $ regret in terms of the logarithmic return. Without imposing any restrictions on the market, this guarantee is known to be worst-case optimal, and no other algorithm attaining it has been discovered so far. Unfortunately, Cover's algorithm crucially relies on computing a certain $ d $-dimensional integral, which must be approximated in any implementation; this results in a prohibitive $ \tilde O(d^4(T+d)^{14}) $ per-round runtime for the fastest known implementation, due to Kalai and Vempala (2002). We propose an algorithm for online portfolio selection that admits essentially the same regret guarantee as Universal Portfolios -- up to a constant factor and replacement of $ \log(T) $ with $ \log(T+d) $ -- yet has a drastically reduced runtime of $ \tilde O(d^2(T+d)) $ per round. The selected portfolio minimizes the current logarithmic loss regularized by the log-determinant of its Hessian -- equivalently, the hybrid logarithmic-volumetric barrier of the polytope specified by the asset return vectors. As such, our work reveals surprising connections of online portfolio selection with two classical topics in optimization theory: cutting-plane and interior-point algorithms.
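To make the regularized objective described above concrete, the following is a minimal Python sketch that evaluates it for a given portfolio. It is an illustration under stated assumptions, not the authors' implementation: the function names, the weights `lam` and `alpha`, and the factor 1/2 on the log-determinant are hypothetical choices, and the logarithmic barrier of the simplex is added here as an assumption so that the Hessian stays invertible even when few rounds have been observed. The sketch only evaluates the objective; it does not perform the Newton-based minimization over the simplex.

```python
import numpy as np


def log_loss(x, returns):
    """Cumulative logarithmic loss: -sum_t log(<x, r_t>)."""
    return -np.sum(np.log(returns @ x))


def log_loss_hessian(x, returns):
    """Hessian of the log loss: sum_t r_t r_t^T / <x, r_t>^2."""
    scaled = returns / (returns @ x)[:, None]
    return scaled.T @ scaled


def regularized_objective(x, returns, lam=1.0, alpha=1.0):
    """Log loss + log barrier + log-det of the regularized Hessian.

    lam and alpha are hypothetical weights chosen for illustration;
    they are not the constants used in the paper.
    """
    # Assumption: the log barrier -sum_i log(x_i) contributes
    # lam * diag(1 / x_i^2) to the Hessian, keeping it positive definite.
    hess = log_loss_hessian(x, returns) + lam * np.diag(1.0 / x**2)
    sign, logdet = np.linalg.slogdet(hess)
    if sign <= 0:
        raise ValueError("Hessian is not positive definite")
    barrier = -np.sum(np.log(x))
    return log_loss(x, returns) + lam * barrier + 0.5 * alpha * logdet


if __name__ == "__main__":
    # Three rounds, two assets, made-up price relatives.
    returns = np.array([[1.1, 0.9], [0.8, 1.2], [1.0, 1.05]])
    x = np.array([0.5, 0.5])  # uniform portfolio on the simplex
    print(regularized_objective(x, returns))
```

Each row of `returns` holds the per-asset price relatives for one round, and `x` must lie in the interior of the simplex (strictly positive entries summing to one) for the logarithms to be defined.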
