Numerically stable variants of overrelaxation for operator Sinkhorn iteration

Henrik Eisenmann, Tasuku Soma, Xun Tang, André Uschmajew

Abstract

We consider accelerated versions of the operator Sinkhorn iteration (OSI) for solving scaling problems for completely positive maps. Based on the interpretation of OSI as an alternating fixed-point iteration, it has recently been proposed to achieve acceleration by means of nonlinear successive overrelaxation (SOR), e.g. with respect to geodesics in the Hilbert metric. The direct implementation of the proposed SOR algorithms, however, can be numerically unstable for ill-conditioned instances, which limits the achievable accuracy. Here we derive equivalent versions of OSI with SOR where, similar to the original OSI formulation, scalings are applied on the fly in order to take advantage of preconditioning effects. Numerical experiments confirm that this modification allows for numerically stable SOR acceleration of OSI even in ill-conditioned cases.
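To make the phrase "scalings are applied on the fly" concrete, here is a minimal sketch of plain OSI without SOR. It assumes the completely positive map is given by Kraus operators, $\Phi(X) = \sum_i A_i X A_i^*$, and that the goal is $\Phi(I) = I$ and $\Phi^*(I) = I$. The function names, the use of symmetric inverse square roots as scaling matrices, and the fixed iteration count are illustrative choices, not the paper's algorithms.

```python
import numpy as np

def inv_sqrt_psd(M):
    """Inverse square root of a Hermitian positive definite matrix via eigh."""
    w, V = np.linalg.eigh(M)
    return (V / np.sqrt(w)) @ V.conj().T

def operator_sinkhorn(As, iters=200):
    """Plain OSI sketch: alternately rescale the Kraus operators themselves.

    Assumption: Phi(X) = sum_i A_i X A_i^*, and the map is scalable.
    """
    As = [np.array(A, dtype=float if np.isrealobj(A) else complex) for A in As]
    for _ in range(iters):
        # Left step: enforce Phi(I) = sum_i A_i A_i^* = I.
        S = inv_sqrt_psd(sum(A @ A.conj().T for A in As))
        As = [S @ A for A in As]
        # Right step: enforce Phi^*(I) = sum_i A_i^* A_i = I.
        S = inv_sqrt_psd(sum(A.conj().T @ A for A in As))
        As = [A @ S for A in As]
    return As

def marginal_error(As):
    """Frobenius distance of Phi(I) from the identity."""
    n = As[0].shape[0]
    return np.linalg.norm(sum(A @ A.conj().T for A in As) - np.eye(n))
```

Because each step overwrites the Kraus operators, the matrices whose inverse square roots are computed move toward the identity as the iteration converges. This is one way to read the preconditioning effect mentioned in the abstract.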

Paper Structure (10 sections, 3 theorems, 33 equations, 6 figures, 6 algorithms)

Key Result

Lemma 2.1

Let $A$ be positive definite, let $L_1$ be lower triangular with $L_1 L_1^\top = A$, and let $L_2$ be lower triangular with positive diagonal entries. Then $(L_2 L_1)(L_2 L_1)^\top = L_2 A L_2^\top$ is the Cholesky decomposition of $L_2 A L_2^\top$.
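The lemma can be checked numerically. The sketch below assumes $L_1$ is the Cholesky factor of $A$ with positive diagonal, as returned by `numpy.linalg.cholesky`, and draws a random lower triangular $L_2$ with positive diagonal. The helper name `cholesky_product_gap` and the way the test matrices are generated are illustrative only.

```python
import numpy as np

def cholesky_product_gap(n=5, seed=0):
    """Return max |L2 @ L1 - chol(L2 A L2^T)| for a random test instance."""
    rng = np.random.default_rng(seed)

    # Random symmetric positive definite A and its Cholesky factor L1.
    B = rng.standard_normal((n, n))
    A = B @ B.T + n * np.eye(n)
    L1 = np.linalg.cholesky(A)

    # Random lower triangular L2 with strictly positive diagonal.
    L2 = np.tril(rng.standard_normal((n, n)))
    np.fill_diagonal(L2, np.abs(np.diag(L2)) + 0.1)

    # The product is lower triangular with positive diagonal, so by
    # uniqueness it must equal the Cholesky factor of L2 A L2^T.
    product = L2 @ L1
    direct = np.linalg.cholesky(L2 @ A @ L2.T)
    return np.max(np.abs(product - direct))
```

In exact arithmetic the gap is zero. In floating point it should sit near rounding error for well-conditioned inputs. The practical point is that the product $L_2 L_1$ gives the factor without forming $L_2 A L_2^\top$ explicitly.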

Figures (6)

  • Figure 1: The results of the experiment with Hilbert matrices. The FPI methods are shown in solid lines, whereas the OSI methods are shown in dashed lines. In the runtime plot, the thick line and shaded region represent the mean and the standard deviation of 200 runs on the same instance. In both plots, the FPI methods suffer from numerical instability and get stuck at around $10^{-6}$, while the OSI methods achieve much smaller error.
  • Figure 2: The results of the experiment with frame scaling. The FPI methods are shown in solid lines, whereas the OSI methods are shown in dashed lines. In the runtime plot, the thick line and shaded region represent the mean and the standard deviation of 10 runs on the same instance, although the shaded region is not visible in the plot because the variance is small. In both plots, the FPI methods suffer from numerical instability and get stuck at around $10^{-3}$ to $10^{-5}$, while the OSI methods achieve much smaller error. Moreover, SOR significantly accelerates the OSI methods, while it only slightly accelerates the FPI methods before they get stuck.
  • Figure 3: The results of the experiment with the extreme setting. The FPI methods are shown in solid lines, whereas the OSI methods are shown in dashed lines. In the runtime plot, the thick line and shaded region represent the mean and the standard deviation of 10 runs on the same instance, although the shaded region is not visible in the plot because the variance is small. FPI and OSI (without SOR) are very slow. The FPI methods with SOR are slightly better but get stuck due to numerical instability. The OSI-SOR methods, on the other hand, show significant acceleration and reach a smaller final error.
  • Figure : Alternating Fixed-point Iteration
  • Figure : Alternating Fixed-Point Iteration with Cholesky Factor SOR (FPI-Cholesky-SOR)
  • ...and 1 more figures

Theorems & Definitions (7)

  • Lemma 2.1
  • Proposition 2.2
  • proof
  • Remark 2.3
  • Proposition 2.4
  • proof
  • Remark 2.5