Power Transform Revisited: Numerically Stable, and Federated
Xuefeng Xu, Graham Cormode
TL;DR
This work addresses numerical instability in power transforms (Box-Cox and Yeo-Johnson), which are used for variance stabilization and skew reduction. It identifies the sources of instability and proposes a numerically stable pipeline built on log-domain computation, reformulated variance terms, and parameter bounding. It also extends power transforms to federated learning with a numerically stable one-pass variance aggregation, which lets a derivative-free optimizer fit a global transform parameter $\lambda$ reliably across heterogeneous clients. Key contributions are a comprehensive instability analysis, practical remedies (including inverse transforms via the Lambert $W$ function and log-domain variance computation), and empirical validation on real and adversarial data that shows improved stability over prior methods. The practical impact is more reliable preprocessing for statistical and ML pipelines, especially in privacy-preserving federated settings where communication efficiency and numerical robustness are critical. The methods can be integrated into existing analytics and FL workflows.
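To make the idea of one-pass variance aggregation concrete, here is a minimal Python sketch. It is not the paper's algorithm: it assumes the classical Welford update on each client and the pairwise merge of (count, mean, sum of squared deviations) summaries, working in ordinary floating point rather than the log domain. The names `Moments`, `local_moments`, `merge`, and `federated_variance` are hypothetical.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Iterable, Sequence


@dataclass(frozen=True)
class Moments:
    """Per-client summary: count, mean, and sum of squared deviations."""
    n: int
    mean: float
    m2: float


def local_moments(xs: Iterable[float]) -> Moments:
    """Single pass over one client's data (Welford update)."""
    n, mean, m2 = 0, 0.0, 0.0
    for x in xs:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    return Moments(n, mean, m2)


def merge(a: Moments, b: Moments) -> Moments:
    """Combine two summaries without revisiting the raw data."""
    if a.n == 0:
        return b
    if b.n == 0:
        return a
    n = a.n + b.n
    delta = b.mean - a.mean
    mean = a.mean + delta * b.n / n
    m2 = a.m2 + b.m2 + delta * delta * a.n * b.n / n
    return Moments(n, mean, m2)


def federated_variance(clients: Sequence[Sequence[float]]) -> float:
    """Population variance over all clients, using only their summaries."""
    total = reduce(merge, (local_moments(c) for c in clients), Moments(0, 0.0, 0.0))
    return total.m2 / total.n


# Two clients whose values share a large offset, a case where the
# textbook E[x^2] - E[x]^2 formula tends to lose precision.
clients = [[1e9 + 1, 1e9 + 2], [1e9 + 3, 1e9 + 4, 1e9 + 5]]
print(federated_variance(clients))
```

Each client sends only three numbers, so the server can evaluate the variance with a single round of communication per query. In the setting described above, the summaries would be computed on the transformed data for each candidate $\lambda$.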
Abstract
Power transforms are popular parametric techniques for making data more Gaussian-like, and are widely used as preprocessing steps in statistical analysis and machine learning. However, we find that direct implementations of power transforms suffer from severe numerical instabilities, which can lead to incorrect results or even crashes. In this paper, we provide a comprehensive analysis of the sources of these instabilities and propose effective remedies. We further extend power transforms to the federated learning setting, addressing both numerical and distributional challenges that arise in this context. Experiments on real-world datasets demonstrate that our methods are both effective and robust, substantially improving stability compared to existing approaches.
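As a small illustration of how a direct implementation can fail, the Python sketch below evaluates the Box-Cox transform $(x^\lambda - 1)/\lambda$ literally and compares it with a variant that uses `math.expm1`. The variant is a standard floating-point technique and is shown here as an assumption, not as the remedy proposed in the paper. The function names `boxcox_naive` and `boxcox_expm1` are hypothetical.

```python
import math


def boxcox_naive(x: float, lam: float) -> float:
    """Box-Cox evaluated exactly as the textbook formula is written."""
    if lam == 0.0:
        return math.log(x)
    return (x ** lam - 1.0) / lam


def boxcox_expm1(x: float, lam: float) -> float:
    """Same transform, with x**lam - 1 rewritten as expm1(lam * log(x))."""
    if lam == 0.0:
        return math.log(x)
    return math.expm1(lam * math.log(x)) / lam


# Incorrect result: for tiny lam, x**lam rounds to a value at or next to 1.0,
# so the subtraction cancels and the quotient is far from log(x).
print(boxcox_naive(2.0, 1e-17), boxcox_expm1(2.0, 1e-17), math.log(2.0))

# Crash: for large lam, Python's float power raises OverflowError.
try:
    boxcox_naive(10.0, 400.0)
except OverflowError as exc:
    print("overflow:", exc)
```

The `expm1` form only repairs the cancellation near $\lambda = 0$. It does not help with overflow at large $|\lambda|$, where the transformed value itself is not representable in double precision.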
