Accelerating operator Sinkhorn iteration with overrelaxation

Tasuku Soma, André Uschmajew

TL;DR

The paper analyzes, via linearization, the local convergence rates of accelerated versions of the operator Sinkhorn iteration for operator scaling that use successive overrelaxation, and it determines the asymptotically optimal relaxation parameter from Young's SOR theorem.

Abstract

We propose accelerated versions of the operator Sinkhorn iteration for operator scaling using successive overrelaxation. We analyze the local convergence rates of these accelerated methods via linearization, which allows us to determine the asymptotically optimal relaxation parameter based on Young's SOR theorem. Using the Hilbert metric on positive definite cones, we also obtain a global convergence result for a geodesic version of overrelaxation in a specific range of relaxation parameters. These techniques generalize corresponding results obtained for matrix scaling by Thibault et al. (Algorithms, 14(5):143, 2021) and Lehmann et al. (Optim. Lett., 16(8):2209--2220, 2022). Numerical experiments demonstrate that the proposed methods outperform the original operator Sinkhorn iteration in certain applications.
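To illustrate what overrelaxation does, the following minimal sketch treats the matrix scaling special case that the abstract cites as the setting being generalized. It is not the operator algorithm of the paper. It uses the relaxed update commonly stated for overrelaxed Sinkhorn in matrix scaling, in which each new scaling vector is a geometric extrapolation between the old iterate and the plain Sinkhorn update. The function name `overrelaxed_sinkhorn`, the value `omega=1.2`, the iteration count, and the random test matrix are all assumptions chosen for illustration.

```python
import numpy as np

def overrelaxed_sinkhorn(K, a, b, omega=1.2, iters=500):
    """Matrix scaling with successive overrelaxation (illustrative sketch).

    Seeks positive vectors u, v such that diag(u) @ K @ diag(v) has row sums a
    and column sums b. Setting omega = 1 gives the plain Sinkhorn iteration.
    """
    u = np.ones(K.shape[0])
    v = np.ones(K.shape[1])
    for _ in range(iters):
        # Relaxed update: geometric combination of the old iterate and the
        # plain Sinkhorn update, with weight omega on the Sinkhorn update.
        u = u ** (1.0 - omega) * (a / (K @ v)) ** omega
        v = v ** (1.0 - omega) * (b / (K.T @ u)) ** omega
    return u, v

# Hypothetical test instance: a well-conditioned positive 5 x 5 matrix
# with uniform target marginals (both sum to 1).
rng = np.random.default_rng(0)
K = rng.uniform(0.5, 1.5, size=(5, 5))
a = np.full(5, 1.0 / 5)
b = np.full(5, 1.0 / 5)

u, v = overrelaxed_sinkhorn(K, a, b)
P = u[:, None] * K * v[None, :]          # scaled matrix diag(u) K diag(v)
row_err = np.abs(P.sum(axis=1) - a).max()
col_err = np.abs(P.sum(axis=0) - b).max()
```

On a well-conditioned instance like this one, plain Sinkhorn already converges quickly, so a relaxation parameter above 1 is not expected to help much. The benefit described in the abstract concerns the asymptotic rate, which depends on choosing the parameter close to its optimal value for the problem at hand.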

Paper Structure

This paper contains 30 sections, 14 theorems, 116 equations, 2 figures, 6 algorithms.

Key Result

Lemma 2.2

Let $B_1,\dots,B_k$ be matrices such that $\sum_{i=1}^k B_i^\top B_i$ is positive definite. Then for any symmetric positive definite matrix $M$ (of appropriate size), the matrix $\sum_{i=1}^k B_i^\top M B_i$ is positive definite as well.
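One standard way to see this, given here as a sketch and not necessarily the argument used in the paper, is to bound the quadratic form from below using the smallest eigenvalue $\lambda_{\min}(M) > 0$ of $M$. For any nonzero vector $x$:

```latex
x^\top \Big( \sum_{i=1}^k B_i^\top M B_i \Big) x
  = \sum_{i=1}^k (B_i x)^\top M (B_i x)
  \ge \lambda_{\min}(M) \sum_{i=1}^k \| B_i x \|^2
  = \lambda_{\min}(M) \, x^\top \Big( \sum_{i=1}^k B_i^\top B_i \Big) x
  > 0 .
```

The last inequality uses the assumption that $\sum_{i=1}^k B_i^\top B_i$ is positive definite.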

Figures (2)

  • Figure 1: Experimental results for frame scaling. (a) Plot of the gradient norm against iterations. (b) Plot of the gradient norm against running time. The thick line and shaded region represent the mean and the standard deviation of 10 runs on the same instance.
  • Figure 2: Experimental results for ill-conditioned operators. (a) Plot of the gradient norm against iterations. (b) Plot of the gradient norm against running time. The thick line and shaded region represent the mean and the standard deviation of 10 runs on the same instance.

Theorems & Definitions (24)

  • Lemma 2.2
  • proof
  • Proposition 2.3
  • Definition 2.4
  • Theorem 2.5: cf. Georgiou2015 and Idel2016
  • proof
  • Lemma 2.6
  • proof
  • Lemma 3.1
  • Lemma 3.2
  • ...and 14 more