Ellipsoid fitting with the Cayley transform
Omar Melikechi, David B. Dunson
TL;DR
CTEF is an ellipsoid-specific fitting method that uses the Cayley transform to parametrize rotations, which turns the fit into a bound-constrained optimization problem in Euclidean space. Because it fits transformed data over a carefully designed feasible set, the method is invariant to translations and rotations and remains robust when data are nonuniformly distributed on the ellipsoid, including in high dimensions. The fitted ellipsoid parameters are interpretable and support dimension reduction, visualization, and clustering. CTEF outperforms the compared methods on synthetic Ellipsoid-Gaussian data and on real-world tasks such as cell-cycle visualization and circadian gene analysis. Although slower than some baselines, CTEF is deterministic and reproducible, has practical runtimes, and has a clear geometric interpretation, which makes it well suited to applications that require stable recovery of global structure.
Abstract
We introduce Cayley transform ellipsoid fitting (CTEF), an algorithm that uses the Cayley transform to fit ellipsoids to noisy data in any dimension. Unlike many ellipsoid fitting methods, CTEF is ellipsoid specific, meaning it always returns elliptic solutions, and can fit arbitrary ellipsoids. It also significantly outperforms other fitting methods when data are not uniformly distributed over the surface of an ellipsoid. Inspired by growing calls for interpretable and reproducible methods in machine learning, we apply CTEF to dimension reduction, data visualization, and clustering in the context of cell cycle and circadian rhythm data and several classical toy examples. Since CTEF captures global curvature, it extracts nonlinear features in data that other machine learning methods fail to identify. For example, on the clustering examples CTEF outperforms 10 popular algorithms.
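As a rough illustration of why the Cayley transform is useful here, the sketch below shows the standard Cayley map from skew-symmetric matrices to rotation matrices, which lets a rotation be described by d(d-1)/2 unconstrained real parameters. This is a minimal example under stated assumptions, not the CTEF algorithm itself: the helper names `skew_from_params` and `cayley`, the sign convention of the map, and the parameter range [-1, 1] are all chosen for illustration and are not taken from the paper.

```python
import numpy as np

def skew_from_params(theta, d):
    """Pack d(d-1)/2 free parameters into a d x d skew-symmetric matrix (hypothetical helper)."""
    S = np.zeros((d, d))
    S[np.triu_indices(d, k=1)] = theta  # fill the strict upper triangle
    return S - S.T                      # mirror with opposite sign so that S.T == -S

def cayley(S):
    """Cayley transform: R = (I + S)^{-1} (I - S) for skew-symmetric S (one common convention)."""
    I = np.eye(S.shape[0])
    # I + S is always invertible because the eigenvalues of S are purely imaginary.
    return np.linalg.solve(I + S, I - S)

d = 4
rng = np.random.default_rng(0)
# The bounds [-1, 1] are an arbitrary choice for this illustration.
theta = rng.uniform(-1.0, 1.0, size=d * (d - 1) // 2)
R = cayley(skew_from_params(theta, d))

# R is a rotation: orthogonal with determinant +1.
print(np.allclose(R.T @ R, np.eye(d)), np.isclose(np.linalg.det(R), 1.0))
```

Because every parameter vector yields a valid rotation, an optimizer can search over plain real vectors, optionally with box bounds, without having to enforce orthogonality explicitly.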
