Randomized batch-sampling Kaczmarz methods for general linear systems

Dong-Yue Xie, Xi Yang

TL;DR

The paper tackles solving general linear systems $A x=b$ with randomized Kaczmarz-type methods by introducing a unified randomized batch-sampling Kaczmarz (RBSK) framework. It derives concentration-inequality-based, scale-invariant mean-squared convergence bounds that apply to randomized non-extended block Kaczmarz methods with static samplings, and shows how the batch distribution ${\bf P}$ and a diagonal scaling $S$ influence the rate through quantities like $\xi$ and $\eta$. The work demonstrates that the new bounds are tighter and more aligned with empirical convergence than existing ND14 and GM21 bounds, across synthetic multi-scale and ill-conditioned matrices as well as real sparse data from SuiteSparse. It also highlights the potential of learning-guided batch-sampling distributions to tailor performance to specific applications while maintaining low per-iteration cost. Overall, the RBSK framework provides a flexible, theoretically sharp, and practically effective approach for fast randomized block Kaczmarz solvers in large-scale linear problems.

Abstract

To investigate randomized solvers for general linear systems in more depth, we adopt a unified randomized batch-sampling Kaczmarz framework whose per-iteration cost is as low as that of cyclic block methods, and we develop a general analysis technique to establish its convergence guarantee. Using concentration inequalities, we derive new expected linear convergence rate bounds. The analysis applies to any randomized non-extended block Kaczmarz method with static stochastic sampling. In addition, the new rate bounds are scale-invariant, which eliminates the dependence on the magnitude of the data matrix. In most experiments, the new bounds are significantly tighter than existing ones and better reflect the empirical convergence behavior of block methods. Within this framework, the batch-sampling distribution is a learnable parameter, which opens the possibility of tuning block methods for efficient performance in specific application scenarios; this deserves further investigation.
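As a rough illustration of the kind of iteration the abstract describes, the following is a minimal sketch of a randomized block Kaczmarz solver with a fixed (static) batch-sampling distribution. It is not taken from the paper: the function name `batch_kaczmarz`, the arguments `batches` and `probs`, the random row partition, and the uniform distribution are all assumptions made for illustration. The update used is the standard block projection $x \leftarrow x + A_\tau^{\dagger}(b_\tau - A_\tau x)$.

```python
import numpy as np

def batch_kaczmarz(A, b, batches, probs, iters=2000, seed=0):
    """Randomized block Kaczmarz with a static batch-sampling distribution.

    `batches` is a list of row-index arrays and `probs` gives the probability
    of drawing each batch (both hypothetical names, chosen for this sketch).
    """
    rng = np.random.default_rng(seed)
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        # Draw a batch index from the fixed distribution.
        t = rng.choice(len(batches), p=probs)
        rows = batches[t]
        residual = b[rows] - A[rows] @ x
        # Minimum-norm solution of A[rows] @ d = residual, i.e. d = pinv(A[rows]) @ residual.
        x += np.linalg.lstsq(A[rows], residual, rcond=None)[0]
    return x

# Small consistent test system (assumed sizes, for illustration only).
rng = np.random.default_rng(1)
m, n, q = 60, 20, 10
A = rng.standard_normal((m, n))
x_true = rng.standard_normal(n)
b = A @ x_true

# Assumption: a random partition of the rows into batches of size q,
# sampled uniformly. Any other static distribution could be passed instead.
batches = np.array_split(rng.permutation(m), m // q)
probs = np.full(len(batches), 1.0 / len(batches))

x = batch_kaczmarz(A, b, batches, probs)
rel_err = np.linalg.norm(x - x_true) / np.linalg.norm(x_true)
```

Replacing `probs` with a non-uniform vector is the simplest way to experiment with how the batch distribution affects convergence on a given matrix.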

Paper Structure

This paper contains 12 sections, 14 theorems, 123 equations, 10 figures, 3 tables, 2 algorithms.

Key Result

Theorem 2.1

Let $X_1, \ldots, X_n$ be independent random variables such that $X_i$ takes its value in $\left[a_i,b_i\right]$ almost surely for all $i \leq n$. Consider the sum of these random variables, then, for $\epsilon>0$, it holds that
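The statement above is cut off before its displayed inequality. For reference, the standard one-sided form of Hoeffding's inequality is given below, writing $S_n = \sum_{i=1}^{n} X_i$ for the sum. The symbol $S_n$ is notation assumed here, and the paper's own display may differ, for example by stating the two-sided version.

$$
\mathbb{P}\left(S_n - \mathbb{E}\left[S_n\right] \geq \epsilon\right) \leq \exp\left(-\frac{2\epsilon^2}{\sum_{i=1}^{n}\left(b_i-a_i\right)^2}\right).
$$

Applying the same bound to $-X_1, \ldots, -X_n$ and taking a union bound gives the two-sided version, $\mathbb{P}\left(\left|S_n - \mathbb{E}\left[S_n\right]\right| \geq \epsilon\right) \leq 2\exp\left(-2\epsilon^2 / \sum_{i=1}^{n}\left(b_i-a_i\right)^2\right)$.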

Figures (10)

  • Figure 1: Theoretical rate bound and empirical convergence rate for different block sizes when $m < n$, tested on a two-scale matrix constructed from a Gaussian random matrix. Subfigures (a), (b) and (c) correspond to block sizes $q=10$, $q=20$ and $q=50$.
  • Figure 2: Theoretical rate bound and empirical convergence rate for different block sizes when $m > n$, tested on a two-scale matrix constructed from a Gaussian random matrix. Subfigures (a), (b) and (c) correspond to block sizes $q=5$, $q=10$ and $q=20$.
  • Figure 3: Theoretical rate bound and empirical convergence rate for different block sizes, tested on a two-scale matrix constructed from $n$ columns of bibd_81_2. Subfigures (a), (b) and (c) correspond to block sizes $q=10$, $q=20$ and $q=50$.
  • Figure 4: Theoretical rate bound and empirical convergence rate for different block sizes, tested on a two-scale matrix constructed from $n$ columns of ch6-6-b5. Subfigures (a), (b) and (c) correspond to block sizes $q=10$, $q=20$ and $q=50$.
  • Figure 5: Theoretical rate bound and empirical convergence rate for different block sizes, tested on a two-scale matrix constructed from $n$ columns of n4c5-b7. Subfigures (a), (b) and (c) correspond to block sizes $q=10$, $q=20$ and $q=50$.
  • ...and 5 more figures

Theorems & Definitions (24)

  • Definition 2.1
  • Theorem 2.1: Hoeffding's inequality
  • Corollary 2.1
  • Definition 3.1
  • Example 3.1
  • Example 3.2
  • Example 3.3
  • Lemma 4.1
  • Lemma 4.2
  • Lemma 4.3
  • ...and 14 more