Power Transform Revisited: Numerically Stable, and Federated

Xuefeng Xu, Graham Cormode

TL;DR

This work addresses numerical instability in power transforms (Box-Cox and Yeo-Johnson), which are used for variance stabilization and skew reduction. It identifies the sources of instability and proposes a numerically stable pipeline built on log-domain computation, reformulated variance terms, and parameter bounding. It also extends power transforms to federated learning with a numerically stable one-pass variance aggregation, so that the transform parameter $\lambda$ can be optimized globally across heterogeneous clients using derivative-free optimization. Key contributions are a comprehensive instability analysis, practical remedies (including inverse transforms via the Lambert $W$ function and log-domain variance), and empirical validation on real and adversarial data showing improved stability over prior methods. The result is more reliable preprocessing for statistical and ML pipelines, especially in privacy-preserving federated settings where communication efficiency and numerical robustness are critical, and the methods can be integrated into existing analytics and FL workflows.

Abstract

Power transforms are popular parametric techniques for making data more Gaussian-like, and are widely used as preprocessing steps in statistical analysis and machine learning. However, we find that direct implementations of power transforms suffer from severe numerical instabilities, which can lead to incorrect results or even crashes. In this paper, we provide a comprehensive analysis of the sources of these instabilities and propose effective remedies. We further extend power transforms to the federated learning setting, addressing both numerical and distributional challenges that arise in this context. Experiments on real-world datasets demonstrate that our methods are both effective and robust, substantially improving stability compared to existing approaches.
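To make the instability concrete, here is a minimal sketch of one way a log-domain variance can avoid overflow in the Box-Cox negative log-likelihood (NLL). It assumes the standard Box-Cox definition, $(x^\lambda - 1)/\lambda$ for $\lambda \neq 0$ and $\ln x$ otherwise, and the usual profile NLL up to an additive constant. The function names and the max-shift trick are illustrative assumptions, not necessarily the formulation used in the paper.

```python
import math

def _variance(values):
    """Population variance computed with a two-pass formula."""
    n = len(values)
    mean = math.fsum(values) / n
    return math.fsum((v - mean) ** 2 for v in values) / n

def boxcox_nll_naive(lmb, xs):
    """Profile NLL (up to a constant), forming x**lmb directly.

    For large |lmb| the power x**lmb can exceed the float64 range.
    """
    logs = [math.log(x) for x in xs]
    ys = logs if lmb == 0 else [(x ** lmb - 1.0) / lmb for x in xs]
    return 0.5 * len(xs) * math.log(_variance(ys)) - (lmb - 1.0) * math.fsum(logs)

def boxcox_nll_logdomain(lmb, xs):
    """Same quantity, but the variance is handled in the log domain.

    With a_i = lmb * ln(x_i) and m = max(a_i):
        Var(x**lmb) = exp(2m) * Var(exp(a_i - m)),
    and every exp(a_i - m) lies in (0, 1], so nothing overflows.
    Assumes the data are not all identical (variance must be positive).
    """
    logs = [math.log(x) for x in xs]
    if lmb == 0:
        log_var = math.log(_variance(logs))
    else:
        a = [lmb * lx for lx in logs]
        m = max(a)
        scaled = [math.exp(ai - m) for ai in a]
        # Dividing the transform by lmb divides the variance by lmb**2.
        log_var = 2.0 * m + math.log(_variance(scaled)) - 2.0 * math.log(abs(lmb))
    return 0.5 * len(xs) * log_var - (lmb - 1.0) * math.fsum(logs)
```

On the data $[10, 10, 10, 9.9]$ with $\lambda = 358$, the naive version raises `OverflowError` in pure Python because $10^{358}$ exceeds the float64 range, while the log-domain version returns a finite value. For moderate $\lambda$ the two functions agree up to rounding.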

Paper Structure

This paper contains 18 sections, 1 theorem, 16 equations, 16 figures, 3 tables, 3 algorithms.

Key Result

Theorem 3.1

The Box-Cox transformation $\psi(\lambda,x)$ defined in the paper's Box-Cox equation (label eq:boxcox) has the following properties:

Figures (16)

  • Figure 1: Transformation functions for different $\lambda$.
  • Figure 2: Tree-structured variance aggregation.
  • Figure 3: ExpSearch fails to find the true optimum $\lambda^*$. Left: data $[0.1, 0.1, 0.1, 0.101]$, $\lambda^*\approx -361$; Right: data $[10, 10, 10, 9.9]$, $\lambda^*\approx 358$.
  • Figure 4: Comparison of NLL curves using different equations. Data used: [10, 10, 10, 9.9].
  • Figure 5: Comparison of NLL curves using the naive and pairwise aggregation. Data are synthetic Gaussian samples from $\mathcal{N}(10^4, 10^{-3})$ with $100$ points.
  • ...and 11 more figures
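
The tree-structured (pairwise) variance aggregation named in Figures 2 and 5 can be illustrated with a short sketch. It assumes each client reports a triple (count, mean, sum of squared deviations) and that pairs of triples are merged with the well-known parallel variance update. The function names, the choice of statistics, and the reduction order are assumptions for illustration; the paper's exact protocol is not shown in this summary.

```python
import math

def local_stats(xs):
    """Per-client summary: (count, mean, sum of squared deviations)."""
    n = len(xs)
    mean = math.fsum(xs) / n
    m2 = math.fsum((x - mean) ** 2 for x in xs)
    return n, mean, m2

def merge(a, b):
    """Combine two summaries without revisiting the raw data."""
    na, mean_a, m2a = a
    nb, mean_b, m2b = b
    n = na + nb
    delta = mean_b - mean_a
    mean = mean_a + delta * nb / n
    # The cross term accounts for the gap between the two local means.
    m2 = m2a + m2b + delta * delta * na * nb / n
    return n, mean, m2

def tree_aggregate(stats):
    """Reduce a list of summaries pairwise, level by level."""
    level = list(stats)
    while len(level) > 1:
        nxt = [merge(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:  # carry an unpaired summary up to the next level
            nxt.append(level[-1])
        level = nxt
    return level[0]

def naive_variance(xs):
    """One-pass E[x^2] - E[x]^2, shown for contrast only.

    It subtracts two nearly equal large numbers when the mean is large
    relative to the spread, which loses precision.
    """
    n = len(xs)
    return sum(x * x for x in xs) / n - (sum(xs) / n) ** 2
```

A usage example would split the data across clients, call `local_stats` on each shard, and pass the resulting list to `tree_aggregate`; the global variance is then the third component divided by the first. For data like that in Figure 5 (mean near $10^4$, very small spread), the merged result should match a two-pass variance on the pooled data closely, whereas `naive_variance` is the kind of formula that degrades in that regime.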

Theorems & Definitions (1)

  • Theorem 3.1