Acceleration and restart for the randomized Bregman-Kaczmarz method
Lionel Tondji, Ion Necoara, Dirk A. Lorenz
TL;DR
This paper addresses convex linearly constrained problems of the form $\min f(x)$ subject to $\mathbf{A}x=b$, where $f$ is $1$-strongly convex and possibly nonsmooth, using a block randomized Bregman-Kaczmarz framework. The authors develop the Block Accelerated Randomized Bregman-Kaczmarz (ARBK) method and a restarted variant (RARBK) by mapping dual coordinate-descent updates to the primal space via $x^k = \nabla f^*(\mathbf{A}^T y^k)$, and they establish a PL-based error bound for the dual. On the theoretical side, they show $O(1/k^2)$ convergence in the primal for ARBK and linear convergence under PL-type conditions for both ARBK and the restarted scheme, with rates that depend on the number of blocks $M$. Numerical experiments on synthetic linear systems and CT reconstruction demonstrate substantial speedups over existing block Kaczmarz and coordinate-descent approaches, supported by an open-source Python implementation.
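As a brief sketch of the primal-dual link behind the mapping $x = \nabla f^*(\mathbf{A}^T y)$, under the standard Fenchel-duality setup (the symbols $\Psi$, $t$, and the block subscript $(i)$ are notation assumed here for illustration, and the sign convention may differ from the paper's), the dual problem and its gradient can be written as

$$\min_{y} \; \Psi(y) := f^*(\mathbf{A}^T y) - \langle b, y \rangle, \qquad \nabla \Psi(y) = \mathbf{A}\,\nabla f^*(\mathbf{A}^T y) - b = \mathbf{A}x - b \quad \text{with } x = \nabla f^*(\mathbf{A}^T y).$$

Because $f$ is $1$-strongly convex, $f^*$ is differentiable with a $1$-Lipschitz gradient, so a gradient step with step size $t > 0$ on the $i$-th block of dual coordinates only needs the residual of the corresponding block of constraints:

$$y_{(i)}^{k+1} = y_{(i)}^{k} - t\,\bigl(\mathbf{A}_{(i)} x^k - b_{(i)}\bigr), \qquad x^{k+1} = \nabla f^*(\mathbf{A}^T y^{k+1}).$$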
Abstract
Optimizing strongly convex functions subject to linear constraints is a fundamental problem with numerous applications. In this work, we propose a block (accelerated) randomized Bregman-Kaczmarz method that uses only a block of constraints in each iteration to tackle this problem. We consider a dual formulation of the problem in order to handle the linear constraints efficiently. Using tools from convex analysis, we show that the corresponding dual function satisfies the Polyak-Łojasiewicz (PL) property, provided that the primal objective function is strongly convex and satisfies some additional mild assumptions. However, adapting the existing theory of coordinate descent methods to our dual formulation yields only sublinear convergence results in the dual space. To obtain convergence results for a criterion associated with the primal (original) problem, we transfer our algorithm to the primal space, which, combined with the PL property, allows us to obtain linear convergence rates. More specifically, we provide a theoretical convergence analysis of the proposed method under different assumptions on the objective, and we demonstrate in numerical experiments its superior efficiency and speedup compared to existing methods for the same problem.
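To make the idea of using only a block of constraints per iteration concrete, here is a minimal sketch of a basic (non-accelerated, non-restarted) block randomized Bregman-Kaczmarz iteration. It is not the authors' implementation and it is not ARBK or RARBK. It assumes the particular objective $f(x) = \lambda\|x\|_1 + \tfrac{1}{2}\|x\|_2^2$, for which $\nabla f^*$ is the soft-thresholding operator. The function names, the equal-size block partition, uniform block sampling, and the step size $1/\|\mathbf{A}_{(i)}\|_2^2$ are all illustrative choices.

```python
import numpy as np


def soft_threshold(z, lam):
    """Gradient of f^* for the assumed f(x) = lam*||x||_1 + 0.5*||x||_2^2."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)


def block_bregman_kaczmarz(A, b, lam=0.0, num_blocks=4, iters=3000, seed=0):
    """Hypothetical helper: basic block randomized Bregman-Kaczmarz sketch.

    Tracks z = A^T y (the dual iterate mapped through A^T) and the primal
    iterate x = grad f^*(z). Each step touches one block of rows of A.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    # Illustrative partition of the constraints into contiguous row blocks.
    blocks = np.array_split(np.arange(m), num_blocks)
    # Step size 1 / ||A_i||_2^2 (squared spectral norm of each block).
    steps = [1.0 / np.linalg.norm(A[idx], 2) ** 2 for idx in blocks]

    z = np.zeros(n)
    x = soft_threshold(z, lam)
    for _ in range(iters):
        i = rng.integers(num_blocks)           # sample a block uniformly
        idx = blocks[i]
        residual = A[idx] @ x - b[idx]         # block of the dual gradient
        z -= steps[i] * (A[idx].T @ residual)  # dual block step, mapped by A^T
        x = soft_threshold(z, lam)             # primal iterate x = grad f^*(z)
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.standard_normal((40, 10))
    x_true = rng.standard_normal(10)
    b = A @ x_true                             # consistent system by construction
    x = block_bregman_kaczmarz(A, b, lam=0.0)
    print("residual norm:", np.linalg.norm(A @ x - b))
```

With `lam=0.0` the mapping is the identity and the sketch reduces to a block randomized Kaczmarz iteration; a positive `lam` targets sparse solutions of a consistent system.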
