Fast Evaluation of Truncated Neumann Series by Low-Product Radix Kernels

Piyush Sao

TL;DR

This work speeds up the evaluation of truncated Neumann series in dense settings by introducing an exact radix-9 kernel and an approximate radix-15 kernel that reduce the matrix-product count. It develops a general residual-based radix-kernel framework that accommodates spillover, preserving convergence while achieving the best known asymptotic rate of about 1.54 products per doubling of the series length. The radix-9 kernel, which has exact rational coefficients, gives a 21% reduction relative to repeated squaring. The radix-15 kernel gives a reduction of about 23% ($6/\log_2 15\approx 1.54$ products per doubling versus 2), at the cost of a small residual floor due to spillover. Together, these results offer practical pathways to faster inverse approximation and polynomial preconditioning, with guidelines for selecting the radix and handling nonideal kernels across matrix-iteration tasks.

Abstract

Truncated Neumann series $S_k(A)=I+A+\cdots+A^{k-1}$ are used in approximate matrix inversion and polynomial preconditioning. In dense settings, matrix-matrix products dominate the cost of evaluating $S_k$. Naive evaluation needs $k-1$ products, while splitting methods reduce this to $O(\log k)$. Repeated squaring, for example, uses $2\log_2 k$ products, so further gains require higher-radix kernels that extend the series by a factor of $m$ per update. Beyond the known radix-5 kernel, explicit higher-radix constructions were not available, and the existence of exact rational kernels was unclear. We construct radix kernels for $T_m(B)=I+B+\cdots+B^{m-1}$ and use them to build faster series algorithms. For radix 9, we derive an exact 3-product kernel with rational coefficients, which is the first exact construction beyond radix 5. This kernel yields $5\log_9 k\approx 1.58\log_2 k$ products, a 21% reduction from repeated squaring. For radix 15, numerical optimization yields a 4-product kernel that matches the target through degree 14 but has nonzero spillover (extra terms) at degrees $\ge 15$. Because spillover breaks the standard telescoping update, we introduce a residual-based radix-kernel framework that accommodates approximate kernels and retains the coefficient $(\mu_m+2)/\log_2 m$, where $\mu_m$ is the number of products used by the radix-$m$ kernel. Within this framework, radix 15 attains $6/\log_2 15\approx 1.54$, the best known asymptotic rate. Numerical experiments support the predicted product-count savings and associated runtime trends.
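The radix update described in the abstract can be illustrated with a small sketch. It relies on the identity $S_{mk}(A)=S_k(A)\,T_m(A^k)$ and on the fact that the residual $I-(I-A)S_k(A)$ equals $A^k$, so each update costs the kernel's products plus two more. This is an illustration under stated assumptions, not the paper's implementation: the helper names (`eye`, `add`, `matmul`, `neumann_radix`) are hypothetical, the matrices are plain Python lists to keep the sketch dependency-free, and the radix-5 kernel shown is just one straightforward 2-product factorization, which may differ from the kernel the paper refers to. The exact radix-9 and approximate radix-15 kernels are not reproduced here.

```python
def eye(n):
    """n-by-n identity matrix as a list of rows."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def add(X, Y, alpha=1.0):
    """Return X + alpha * Y (a cheap operation, not counted as a product)."""
    return [[x + alpha * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def matmul(X, Y):
    """Dense matrix product; this is the operation being counted."""
    cols = list(zip(*Y))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in X]

def neumann_naive(A, k):
    """Reference: S_k(A) = I + A + ... + A^(k-1) with k-1 products."""
    S, P = eye(len(A)), eye(len(A))
    for _ in range(k - 1):
        P = matmul(P, A)
        S = add(S, P)
    return S

def kernel_radix2(B):
    """T_2(B) = I + B, using zero products."""
    return add(eye(len(B)), B), 0

def kernel_radix5(B):
    """T_5(B) = I + B + B^2 + B^2 (B^2 + B), using two products.
    Assumption: one simple factorization, chosen for illustration."""
    B2 = matmul(B, B)
    high = matmul(B2, add(B2, B))          # B^4 + B^3
    low = add(add(eye(len(B)), B), B2)     # I + B + B^2
    return add(low, high), 2

def neumann_radix(A, steps, kernel):
    """Return (S, products) with S = S_{m^steps}(A) for an exact radix-m kernel."""
    I = eye(len(A))
    M = add(I, A, -1.0)                    # I - A
    S, B, products = I, A, 0               # S_1 = I, residual B = A
    for _ in range(steps):
        T, mu = kernel(B)                  # mu products inside the kernel
        S = matmul(S, T)                   # 1 product: extend the series
        B = add(I, matmul(M, S), -1.0)     # 1 product: residual I - (I - A) S
        products += mu + 2
    return S, products
```

With `kernel_radix2` this reduces to the repeated-squaring count of 2 products per doubling. Computing the next `B` from the residual rather than as an explicit power is the design choice that lets the same loop accept a kernel with spillover, since the residual then reflects whatever the kernel actually produced.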

Paper Structure

This paper contains 43 sections, 7 theorems, 17 equations, 2 figures, and 5 tables.

Key Result

Theorem 3.1

The kernel $T_9(B) = I + B + \cdots + B^8$ is computed by:

Figures (2)

  • Figure 1: Convergence comparison ($d=64$, $\kappa(I-A) = 10^{4}$, log-spaced eigenvalues). Higher-radix methods reach machine precision with fewer matrix products.
  • Figure 2: Asymptotic coefficients (matrix products per $\log_2 k$) for each method. Lower is better. Radix-15 achieves the best rate at $1.54$.
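The asymptotic coefficients compared in Figure 2 follow from the formula $(\mu_m+2)/\log_2 m$ given in the abstract. The short sketch below evaluates it for the product counts the abstract states: $\mu_2=0$ for repeated squaring, $\mu_9=3$, and $\mu_{15}=4$. The function name `coefficient` is a hypothetical helper, not something from the paper.

```python
import math

def coefficient(m, mu):
    """Matrix products per log2(k) for a radix-m kernel that uses mu products."""
    return (mu + 2) / math.log2(m)

# Kernel product counts as stated in the abstract.
kernels = {2: 0, 9: 3, 15: 4}

baseline = coefficient(2, 0)  # repeated squaring: 2 products per doubling
for m, mu in kernels.items():
    c = coefficient(m, mu)
    saving = 1 - c / baseline
    print(f"radix {m:2d}: {c:.2f} products per log2(k), "
          f"{saving:.0%} fewer than repeated squaring")
```

Running it gives roughly 1.58 for radix 9 and 1.54 for radix 15, which correspond to savings of about 21% and 23% relative to repeated squaring.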

Theorems & Definitions (19)

  • Theorem 3.1: Radix-9 kernel in 3 products
  • proof
  • Remark 3.2: Rational coefficients
  • Remark 4.1: Non-uniqueness
  • Definition 5.1: Approximate radix-$m$ kernel
  • Definition 5.2: Error map
  • Lemma 5.3: Error order
  • proof
  • Definition 5.4: General radix-kernel summation
  • Lemma 5.5: Composition identity
  • ...and 9 more