On the number of iterations of the DBA algorithm
Frederik Brüning, Anne Driemel, Alperen Ergür, Heiko Röglin
TL;DR
This paper analyzes the iteration complexity of the DTW Barycenter Averaging (DBA) algorithm, which seeks a mean time series by minimizing the sum of DTW distances. It proves that the worst-case number of iterations can be exponential in the output length $k$, even for $n=2$ input sequences, and establishes a polynomial upper bound in the smoothed-analysis model under Gaussian perturbations, highlighting a gap between theory and practice. The exponential lower bound is obtained via a construction based on Vattani's $k$-means lower-bound framework, while experiments on the M5 dataset indicate much slower, sublinear growth of the iteration count on real data. The work also adapts techniques from $k$-means analysis, revealing both the potential and the limitations of applying those methods to DBA due to non-monotonic DTW behavior, and it opens avenues for refined worst-case and smoothed analyses that better reflect practical performance.
Abstract
The DTW Barycenter Averaging (DBA) algorithm is widely used for estimating the mean of a given set of point sequences. In this context, the mean is defined as a point sequence that minimises the sum of dynamic time warping (DTW) distances to the input sequences. The algorithm is similar to the $k$-means algorithm in the sense that it alternately repeats two steps: (1) computing an optimal assignment of the input points to the points of the current mean, and (2) computing an optimal mean under the current assignment. The popularity of DBA can be attributed to the fact that it works well in practice, despite the lack of known theoretical guarantees. In our paper, we aim to initiate a theoretical study of the number of iterations that DBA performs until convergence. We assume the algorithm is given $n$ sequences of $m$ points in $\mathbb{R}^d$ and a parameter $k$ that specifies the length of the mean sequence to be computed. We show that, in contrast to its fast running time in practice, the number of iterations can be exponential in $k$ in the worst case, even if the number of input sequences is only $n=2$. We complement these findings with experiments on real-world data that suggest this worst-case behaviour is likely degenerate. To better understand the performance of the algorithm on non-degenerate input, we study DBA in the model of smoothed analysis, upper-bounding the expected number of iterations in the worst case under random perturbations of the input. Our smoothed upper bound is polynomial in $k$, $n$ and $d$, and for constant $n$, it is also polynomial in $m$. For our analysis, we adapt the set of techniques that were developed for analysing $k$-means and observe that this set of techniques is not sufficient to obtain tight bounds for general $n$.
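The two alternating steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes squared Euclidean distance as the local cost, an initial mean supplied by the caller, deterministic but otherwise arbitrary tie-breaking among optimal warping paths, and termination as soon as the assignment stops changing. The names `sq_dist`, `dtw_assignment` and `dba` are hypothetical and chosen for illustration only.

```python
from math import inf


def sq_dist(p, q):
    """Squared Euclidean distance between two points (assumed local cost)."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def dtw_assignment(mean, seq):
    """Return (cost, path) of one optimal warping between mean and seq.

    path is a list of index pairs (i, j): point j of seq is assigned
    to point i of the mean.
    """
    k, m = len(mean), len(seq)
    D = [[inf] * m for _ in range(k)]
    for i in range(k):
        for j in range(m):
            c = sq_dist(mean[i], seq[j])
            if i == 0 and j == 0:
                D[i][j] = c
            else:
                D[i][j] = c + min(
                    D[i - 1][j] if i > 0 else inf,
                    D[i][j - 1] if j > 0 else inf,
                    D[i - 1][j - 1] if i > 0 and j > 0 else inf,
                )
    # Backtrack from the last cell to recover one optimal warping path.
    i, j = k - 1, m - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        candidates = []
        if i > 0 and j > 0:
            candidates.append((D[i - 1][j - 1], i - 1, j - 1))
        if i > 0:
            candidates.append((D[i - 1][j], i - 1, j))
        if j > 0:
            candidates.append((D[i][j - 1], i, j - 1))
        _, i, j = min(candidates)  # ties broken by smallest indices
        path.append((i, j))
    path.reverse()
    return D[k - 1][m - 1], path


def dba(sequences, init_mean, max_iter=1000):
    """Run the alternating scheme; return (mean, number of update steps)."""
    mean = [tuple(p) for p in init_mean]
    d = len(mean[0])
    prev_assignment = None
    for updates in range(max_iter):
        # Step (1): optimal assignment of input points to mean points.
        assignment = [dtw_assignment(mean, s)[1] for s in sequences]
        if assignment == prev_assignment:
            return mean, updates
        # Step (2): optimal mean under this assignment, i.e. the centroid
        # of the points assigned to each mean position.
        sums = [[0.0] * d for _ in mean]
        counts = [0] * len(mean)
        for s, path in zip(sequences, assignment):
            for i, j in path:
                counts[i] += 1
                for t in range(d):
                    sums[i][t] += s[j][t]
        mean = [tuple(v / counts[i] for v in sums[i]) for i in range(len(mean))]
        prev_assignment = assignment
    return mean, max_iter
```

Here `sequences` plays the role of the $n$ input sequences of $m$ points in $\mathbb{R}^d$, and the length of `init_mean` plays the role of $k$. Every warping path visits each mean position at least once, so the centroid in step (2) is always well defined. The returned counter corresponds to the quantity the paper studies, under the stopping rule assumed here.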
