SPreV
Srivathsan Amruth
TL;DR
SPREV is a geometry-driven embedding for labeled datasets that combine high dimensionality, low sample size, and small class size. It encases the data in a convex hull and an enclosing sphere, projects same-class points onto the sphere surface, and maps them onto a 2D regular polygon using a distance-based similarity matrix. The approach aims to preserve global structure as PCA does while encoding metric relations in the manner of t-SNE and UMAP, and to produce embeddings quickly. Empirical evaluations on MNIST, Fashion-MNIST, COIL-20, CIFAR-100, and culled variants show that SPREV generally preserves non-local structure better than PCA, matches or exceeds PCA in speed and in central clustering tendency, and remains competitive with t-SNE and UMAP on local structure in many cases. The work also provides a framework for visualization and for qualitative and quantitative benchmarking, and outlines future directions: multicore optimization, scalability to larger class sizes, and unsupervised extensions that would broaden its use in real-world data exploration and decision support.
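To make the idea of mapping labeled points onto a regular polygon through distance-based similarities more concrete, the following is a minimal illustrative sketch. It is not the SPREV algorithm: the convex hull step is omitted, the enclosing sphere is approximated by a mean-centred ball, and the inverse-distance weighting, the function names `polygon_vertices` and `polygon_embed`, and all parameters are assumptions made for this example only.

```python
import numpy as np

def polygon_vertices(k):
    """Vertices of a regular k-gon inscribed in the unit circle."""
    angles = 2 * np.pi * np.arange(k) / k
    return np.column_stack([np.cos(angles), np.sin(angles)])

def polygon_embed(X, y):
    """Hypothetical sketch: place each row of X inside a regular polygon.

    One polygon vertex is assigned per class; each point is drawn as a
    convex combination of the vertices, weighted by inverse distance to
    per-class anchors on a sphere that encloses the data.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)

    # Assumed sphere: centre at the data mean, radius to the farthest point.
    centre = X.mean(axis=0)
    radius = np.linalg.norm(X - centre, axis=1).max()
    if radius == 0:
        radius = 1.0
    Z = (X - centre) / radius  # every point now lies in the unit ball

    # Push each class centroid radially onto the sphere surface.
    centroids = np.stack([Z[y == c].mean(axis=0) for c in classes])
    norms = np.linalg.norm(centroids, axis=1, keepdims=True)
    anchors = centroids / np.where(norms > 0, norms, 1.0)

    # Distance-based similarity: nearer anchors receive larger weights.
    d = np.linalg.norm(Z[:, None, :] - anchors[None, :, :], axis=2)
    sim = 1.0 / (d + 1e-12)
    weights = sim / sim.sum(axis=1, keepdims=True)

    # Rows of `weights` sum to 1, so each point lands inside the polygon.
    return weights @ polygon_vertices(len(classes)), classes
```

Calling `polygon_embed(X, y)` with an `(n_samples, n_features)` array and a label vector returns an `(n_samples, 2)` array and the ordered class labels; points of well-separated classes are drawn towards their own vertex, while ambiguous points drift towards the polygon centre.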
Abstract
SPREV, short for hyperSphere Reduced to two-dimensional Regular Polygon for Visualisation, is a novel dimensionality reduction technique developed to address the challenges of reducing dimensions and visualizing labeled datasets that exhibit a unique combination of three characteristics: small class size, high dimensionality, and low sample size. SPREV is designed not only to uncover but also to visually represent hidden patterns within such datasets. Its distinctive integration of geometric principles, adapted for discrete computational environments, makes it an indispensable tool in the modern data science toolkit, enabling users to identify trends, extract insights, and navigate complex data efficiently and effectively.
