Computing $k$-means in mixed precision
Erin Carson, Xinye Chen, Xiaobo Liu
TL;DR
The paper studies how to accelerate $k$-means with mixed-precision arithmetic while preserving numerical stability. It establishes the stability of the standard distance formulation, proposes a two-level mixed-precision framework (initialization and distance computations in low precision, center updates in working precision), and provides theoretical bounds on the error of mixed-precision distance computation. Experiments on synthetic data, real-world datasets, and image segmentation show that normalization improves tolerance to reduced precision and that a substantial portion of the distance computations can be performed in low precision without significantly harming clustering quality. The findings indicate meaningful speedups and energy savings for the mixed-precision $k$-means method (mp-kmeans), especially with FP16, while highlighting two limitations: overflow for nonnormalized data and underflow in very low-precision settings. Overall, the work opens pathways for applying mixed-precision strategies to distance-based learning algorithms beyond $k$-means.
Abstract
The $k$-means algorithm is one of the most popular and critical techniques in data mining and machine learning, and it has achieved significant success in numerous science and engineering domains. Computing $k$-means to a global optimum is NP-hard in Euclidean space, yet there are a variety of efficient heuristic algorithms, such as Lloyd's algorithm, that converge to a local optimum with superpolynomial complexity in the worst case. Motivated by the emergence and prominence of mixed precision capabilities in hardware, a current trend is to develop low and mixed precision variants of algorithms in order to improve the runtime and energy consumption. In this paper we study the numerical stability of Lloyd's $k$-means algorithm, and, in particular, we confirm the stability of the widely used distance computation formula. We propose a mixed-precision framework for $k$-means computation and investigate the effects of low-precision distance computation within the framework. Through extensive simulations on various data clustering and image segmentation tasks, we verify the applicability and robustness of the mixed precision $k$-means method. We find that, in $k$-means computation, normalized data is more tolerant to the reduction of precision in the distance computation, while for nonnormalized data more care is needed in the use of reduced precision, mainly to avoid overflow. Our study demonstrates the potential for the use of mixed precision to accelerate the $k$-means computation and offers some insights into other distance-based machine learning methods.
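The following is a minimal sketch of the idea described above, not the authors' implementation. It assumes that the "widely used distance computation formula" is the usual expansion $\|x - c\|^2 = \|x\|^2 - 2x^Tc + \|c\|^2$, that FP16 is the low precision and FP64 the working precision, and that normalization means z-scoring each feature. The function names `sq_dists` and `lloyd_step` and the synthetic data are hypothetical and chosen only for illustration.

```python
import numpy as np

def sq_dists(X, C, dtype=np.float16):
    """Squared Euclidean distances via ||x||^2 - 2 x.c + ||c||^2.

    Inputs are rounded to `dtype` and the arithmetic is carried out on
    arrays of that type (an illustrative stand-in for low precision).
    """
    Xl, Cl = X.astype(dtype), C.astype(dtype)
    x_sq = np.sum(Xl * Xl, axis=1, dtype=dtype)[:, None]  # shape (n, 1)
    c_sq = np.sum(Cl * Cl, axis=1, dtype=dtype)[None, :]  # shape (1, k)
    d = x_sq - 2 * (Xl @ Cl.T) + c_sq                     # shape (n, k)
    # Cancellation can produce tiny negative values; clip them to zero.
    return np.maximum(d, 0)

def lloyd_step(X, C, low=np.float16, work=np.float64):
    """One Lloyd iteration: assign in `low` precision, update in `work`."""
    labels = np.argmin(sq_dists(X, C, low), axis=1)
    C_new = np.array(C, dtype=work)
    for j in range(len(C_new)):
        members = X[labels == j]
        if len(members) > 0:  # keep the old center if a cluster is empty
            C_new[j] = members.astype(work).mean(axis=0)
    return C_new, labels

# Hypothetical data: two blobs whose raw coordinates are in the hundreds.
rng = np.random.default_rng(0)
raw = np.vstack([rng.normal(300.0, 5.0, (50, 2)),
                 rng.normal(900.0, 5.0, (50, 2))])

# Nonnormalized: squaring ~300 already exceeds the FP16 maximum (65504),
# so the distance computation overflows and yields non-finite values.
with np.errstate(over="ignore", invalid="ignore"):
    d_raw = sq_dists(raw, raw[[0, 50]])
print("raw distances all finite:", bool(np.isfinite(d_raw).all()))

# Normalized (z-scores): values are O(1), so FP16 distances stay finite.
Z = (raw - raw.mean(axis=0)) / raw.std(axis=0)
C = Z[[0, 50]].copy()  # one starting center taken from each blob
for _ in range(5):
    C, labels = lloyd_step(Z, C)
print("normalized distances all finite:",
      bool(np.isfinite(sq_dists(Z, C)).all()))
```

In this sketch only the assignment step sees reduced precision. The center update reads the original working-precision data, so rounding errors from the distance computation do not accumulate in the centers across iterations.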
