Generalized Dimension Reduction Using Semi-Relaxed Gromov-Wasserstein Distance
Ranthony A. Clark, Tom Needham, Thomas Weighill
TL;DR
The paper introduces a manifold-valued extension of multidimensional scaling (MDS) built on the semi-relaxed Gromov-Wasserstein (srGW) distance, linking dimension reduction to optimal transport and to generalized Gromov-Hausdorff distances. It proves the existence of Monge maps that realize the srGW distance, shows that srGW generalizes classical MDS, and relates srGW to modified Gromov-Hausdorff distances, which enables embeddings into target spaces other than Euclidean ones. The authors develop SRGW+GD, an efficient algorithm that initializes with a discretized srGW embedding and refines it by gradient descent, and demonstrate it on MNIST, rotated MNIST, and embeddings of city data into a geodesic circle. They also present a redistricting application in which ensembles of districting plans are visualized on a circle to expose typical patterns and flag outliers, illustrating the practical use of manifold-valued dimension reduction for non-Euclidean data.
Abstract
Dimension reduction techniques typically seek an embedding of a high-dimensional point cloud into a low-dimensional Euclidean space which optimally preserves the geometry of the input data. Based on expert knowledge, one may instead wish to embed the data into some other manifold or metric space in order to better reflect the geometry or topology of the point cloud. We propose a general method for manifold-valued multidimensional scaling based on concepts from optimal transport. In particular, we establish theoretical connections between the recently introduced semi-relaxed Gromov-Wasserstein (srGW) framework and multidimensional scaling by solving the Monge problem in this setting. We also derive novel connections between srGW distance and Gromov-Hausdorff distance. We apply our computational framework to analyze ensembles of political redistricting plans for states with two Congressional districts, achieving an effective visualization of the ensemble as a distribution on a circle which can be used to characterize typical neutral plans, and to flag outliers.
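To make the idea of circle-valued multidimensional scaling concrete, the following is a minimal sketch and not the authors' implementation. It assumes the target space is the unit circle with geodesic (arc-length) distance, and it minimizes a plain squared-error stress by gradient descent with step halving. The function names (`circle_dist`, `stress`, `stress_grad`, `circle_mds`) and the synthetic data are hypothetical, and the random initialization stands in for the srGW-based initialization, which is not reproduced here.

```python
import numpy as np

def circle_dist(a, b):
    """Geodesic (arc-length) distance between angles on the unit circle."""
    diff = np.abs(a - b) % (2 * np.pi)
    return np.minimum(diff, 2 * np.pi - diff)

def stress(theta, D):
    """Squared mismatch between circle distances and target distances D."""
    d = circle_dist(theta[:, None], theta[None, :])
    return 0.5 * np.sum((d - D) ** 2)  # 0.5: each unordered pair appears twice

def stress_grad(theta, D):
    """Gradient of the stress (defined away from coincident/antipodal pairs)."""
    # Signed angular difference wrapped into [-pi, pi).
    delta = (theta[:, None] - theta[None, :] + np.pi) % (2 * np.pi) - np.pi
    d = np.abs(delta)
    return 2.0 * np.sum((d - D) * np.sign(delta), axis=1)

def circle_mds(D, theta0, n_iter=200, step=0.1):
    """Gradient descent with step halving; accepts only stress-reducing steps."""
    theta = np.asarray(theta0, dtype=float).copy()
    history = [stress(theta, D)]
    for _ in range(n_iter):
        g = stress_grad(theta, D)
        t = step
        while t > 1e-8:
            cand = (theta - t * g) % (2 * np.pi)
            s = stress(cand, D)
            if s < history[-1]:
                theta = cand
                history.append(s)
                break
            t *= 0.5
        else:
            break  # no improving step found; stop early
    return theta, history

# Synthetic example (assumption): distances come from points that lie on a circle.
rng = np.random.default_rng(0)
true_theta = rng.uniform(0, 2 * np.pi, size=12)
D = circle_dist(true_theta[:, None], true_theta[None, :])

theta0 = rng.uniform(0, 2 * np.pi, size=12)  # random start, not an srGW start
theta, history = circle_mds(D, theta0)
```

Because only stress-reducing steps are accepted, `history` is decreasing by construction. The stress is non-convex, so a random start may settle in a poor local minimum, which is one reason a good initialization matters in practice.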
