Time-Uniform Self-Normalized Concentration for Vector-Valued Processes
Justin Whitehouse, Zhiwei Steven Wu, Aaditya Ramdas
TL;DR
This work develops time-uniform self-normalized concentration bounds for vector-valued martingales under a general sub-$\psi$ tail condition, extending scalar results to $\mathbb{R}^d$ via directional projections and a geometric sphere-covering argument. Writing $S_t$ for the process and $V_t$ for its accumulated variance proxy, the main contributions are a scalar bound $S_t \lesssim V_t (\psi^*)^{-1}\left(\frac{1}{V_t}\log\log V_t\right)$, a corresponding vector bound on $\|V_t^{-1/2} S_t\|$, and a multivariate upper law of the iterated logarithm that is tight up to small constants, with explicit dependence on $\gamma_{\max}(V_t)$ and $\kappa(V_t)$. The results yield non-asymptotic, time-uniform confidence ellipsoids for online linear regression under sub-$\psi$ noise and a multivariate empirical Bernstein inequality for bounded vectors, and they extend to vector autoregressive models, offering tools for sequential estimation under heavy-tailed or dependent noise. Because the bounds are closed-form with explicit constants, the framework applies beyond sub-Gaussian settings and supports the analysis of adaptive statistical procedures in online and time-series contexts.
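As a sanity check on the shape of the scalar bound, consider the sub-Gaussian special case. This is only an illustrative sketch: it assumes $\psi(\lambda) = \lambda^2/2$ and reads $\psi^*$ as the convex conjugate of $\psi$, and it ignores the constants hidden in $\lesssim$.

$$
\psi^*(u) = \sup_{\lambda \ge 0}\left(\lambda u - \frac{\lambda^2}{2}\right) = \frac{u^2}{2}
\quad\Longrightarrow\quad
(\psi^*)^{-1}(x) = \sqrt{2x},
$$

$$
V_t\,(\psi^*)^{-1}\!\left(\frac{\log\log V_t}{V_t}\right)
= V_t \sqrt{\frac{2\log\log V_t}{V_t}}
= \sqrt{2\,V_t \log\log V_t}.
$$

Under these assumptions the bound reduces to the familiar $\sqrt{2 V_t \log\log V_t}$ rate of the classical law of the iterated logarithm, which is the behavior a time-uniform bound of this form should recover.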
Abstract
Self-normalized processes arise naturally in many learning-related tasks. While self-normalized concentration has been extensively studied for scalar-valued processes, there are few results for multidimensional processes outside of the sub-Gaussian setting. In this work, we construct a general, self-normalized inequality for multivariate processes that satisfy a simple yet broad sub-$\psi$ tail condition, which generalizes assumptions based on cumulant generating functions. From this general inequality, we derive an upper law of the iterated logarithm for sub-$\psi$ vector-valued processes, which is tight up to small constants. We show how our inequality can be leveraged to derive a variety of novel, self-normalized concentration inequalities under both light and heavy-tailed observations. Further, we provide applications in prototypical statistical tasks, such as parameter estimation in online linear regression, autoregressive modeling, and bounded mean estimation via a new (multivariate) empirical Bernstein concentration inequality.
