High-Probability Polynomial-Time Complexity of Restarted PDHG for Linear Programming
Zikai Xiong
TL;DR
This work establishes high-probability polynomial-time iteration bounds for the restarted primal-dual hybrid gradient method (rPDHG) on linear programs, under a probabilistic model with sub-Gaussian (and Gaussian) data. It reveals a two-stage convergence: an initial basis identification stage whose iteration bound is polynomial in the problem dimensions and inversely proportional to the failure probability δ, followed by a fast local convergence stage with a log(1/ε) dependence on the target accuracy ε. The analysis leverages non-asymptotic random-matrix theory to bound key quantities such as the smallest singular value, condition numbers, and disparity measures of the optimal solution, and it is complemented by experiments that validate the tail behavior and the polynomial scaling of the iteration counts. The results help explain the practical efficiency of rPDHG and offer guidance for generating challenging LP instances by controlling the disparity among the components of the optimal solution. Overall, the paper bridges theory and practice for a scalable first-order LP method, with implications for solver design and benchmark construction.
Abstract
The restarted primal-dual hybrid gradient method (rPDHG) is a first-order method that has recently received significant attention for its computational effectiveness in solving linear programming (LP) problems. Despite its impressive practical performance, the theoretical iteration bounds for rPDHG can be exponentially poor. To narrow this gap between theory and practice, we show that rPDHG achieves polynomial-time complexity in a high-probability sense, under assumptions on the probability distribution from which the data instance is generated. We consider not only Gaussian distribution models but also sub-Gaussian distribution models. For standard-form LP instances with $m$ linear constraints and $n$ decision variables, we prove that rPDHG iterates settle on the optimal basis in $\widetilde{O}\left(\tfrac{n^{2.5}m^{0.5}}{\delta}\right)$ iterations, followed by $O\left(\tfrac{n^{0.5}m^{0.5}}{\delta}\ln\big(\tfrac{1}{\varepsilon}\big)\right)$ iterations to compute an $\varepsilon$-optimal solution. These bounds hold with probability at least $1-\delta$ for $\delta$ that is not exponentially small. The first-stage bound further improves to $\widetilde{O}\left(\tfrac{n^{2.5}}{\delta}\right)$ in the Gaussian distribution model. Experimental results confirm the tail behavior and the polynomial-time dependence on problem dimensions of the iteration counts. As an application of our probabilistic analysis, we explore how the disparity among the components of the optimal solution bears on the performance of rPDHG, and we provide guidelines for generating challenging LP test instances.
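
For readers unfamiliar with the method, the following is a minimal sketch of a restarted PDHG loop for a standard-form LP, $\min c^\top x$ subject to $Ax = b$, $x \ge 0$. It is an illustration under simplifying assumptions, not necessarily the exact scheme analyzed in the paper: it restarts from the average iterate after a fixed number of iterations, and it sets both step sizes from the Frobenius norm of $A$, a conservative choice that keeps the step-size product times $\|A\|_2^2$ below 1. The function name `restarted_pdhg` and the parameters `restart_period` and `num_restarts` are hypothetical names chosen for this example.

```python
import math


def matvec(A, x):
    """Return A @ x for a dense matrix stored as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def rmatvec(A, y):
    """Return A^T @ y for a dense matrix stored as a list of rows."""
    return [sum(A[i][j] * y[i] for i in range(len(A))) for j in range(len(A[0]))]


def restarted_pdhg(A, b, c, restart_period=50, num_restarts=40):
    """Sketch of PDHG with fixed-period restarts (an assumed, simplified scheme)."""
    m, n = len(A), len(A[0])
    # The Frobenius norm upper-bounds the spectral norm, so with
    # tau = sigma = step we get tau * sigma * ||A||_2^2 <= 0.81 < 1.
    fro = math.sqrt(sum(a * a for row in A for a in row))
    step = 0.9 / fro
    x, y = [0.0] * n, [0.0] * m
    for _ in range(num_restarts):
        x_sum, y_sum = [0.0] * n, [0.0] * m
        for _ in range(restart_period):
            # Primal step: projected gradient step on c - A^T y, keeping x >= 0.
            aty = rmatvec(A, y)
            x_new = [max(0.0, xj - step * (cj - gj)) for xj, cj, gj in zip(x, c, aty)]
            # Dual step: uses the extrapolated primal point 2 * x_new - x.
            x_bar = [2.0 * xn - xo for xn, xo in zip(x_new, x)]
            ax_bar = matvec(A, x_bar)
            y = [yi + step * (bi - ri) for yi, bi, ri in zip(y, b, ax_bar)]
            x = x_new
            x_sum = [s + v for s, v in zip(x_sum, x)]
            y_sum = [s + v for s, v in zip(y_sum, y)]
        # Restart: continue from the average iterate of the period just finished.
        x = [s / restart_period for s in x_sum]
        y = [s / restart_period for s in y_sum]
    return x, y


# Tiny example: min x1 + 2*x2 subject to x1 + x2 = 1, x >= 0.
# The optimal primal solution is x = (1, 0) and the optimal dual is y = 1.
x_opt, y_opt = restarted_pdhg([[1.0, 1.0]], [1.0], [1.0, 2.0])
print(x_opt, y_opt)
```

On this two-variable instance the second component of $x$ is held at zero by the projection while the first component and the dual variable converge linearly, a small-scale picture of the two stages described above: the iterates first settle on the optimal basis and then converge quickly.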
