DREAMS: Preserving both Local and Global Structure in Dimensionality Reduction
Noël Kury, Dmitry Kobak, Sebastian Damrich
TL;DR
Standard dimensionality reduction methods tend to preserve either local or global data structure, but not both. DREAMS addresses this dichotomy by adding a PCA- or MDS-based regularization term to the $t$-SNE objective, yielding a controllable spectrum of embeddings between highly local and globally coherent layouts. The regularized objective is $(1-\lambda)\, \mathcal{L}_{t\text{-SNE}}(Y) + (\lambda/n)\, \|Y - \alpha \tilde{Y}\|_F^2$, where $\tilde{Y}$ is the global reference embedding, $\lambda$ tunes the local-global trade-off, and $\alpha$ scales the global embedding to the current embedding's magnitude. The authors benchmark DREAMS against a wide range of baselines on eleven real-world datasets and show that it often achieves the best combined local-global preservation; the DREAMS-MDS variant offers an alternative global reference. A key finding is that a modest regularization strength ($\lambda \approx 0.15$) yields embeddings that retain both fine-grained clusters and broad groupings, improving interpretability for hierarchical data such as single-cell transcriptomics. The work provides open-source implementations and demonstrates the method's robustness and adaptability, though it notes a runtime trade-off and the absence of formal guarantees that the balance holds across all datasets.
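To make the regularized objective concrete, the following is a minimal NumPy sketch of the penalty term and the combined loss. It is not the authors' implementation: the function names `dreams_regularizer` and `dreams_objective` are hypothetical, the $t$-SNE loss is passed in as a precomputed number rather than evaluated here, $n$ is assumed to be the number of embedded points (rows of $Y$), and $\alpha$ is treated as a plain input because the summary only states that it rescales the global embedding.

```python
import numpy as np


def dreams_regularizer(Y, Y_global, lam, alpha):
    """Penalty (lam / n) * ||Y - alpha * Y_global||_F^2 and its gradient w.r.t. Y.

    Y        : (n, d) current low-dimensional embedding
    Y_global : (n, d) global reference embedding (e.g. PCA or MDS output)
    lam      : regularization strength in [0, 1]
    alpha    : scale applied to the global embedding (assumed to be given)
    """
    n = Y.shape[0]  # assumption: n counts the embedded points
    diff = Y - alpha * Y_global
    penalty = (lam / n) * np.sum(diff ** 2)  # squared Frobenius norm
    grad = (2.0 * lam / n) * diff            # derivative of the penalty w.r.t. Y
    return penalty, grad


def dreams_objective(tsne_loss, Y, Y_global, lam, alpha):
    """Combine a precomputed t-SNE loss value with the global penalty."""
    penalty, _ = dreams_regularizer(Y, Y_global, lam, alpha)
    return (1.0 - lam) * tsne_loss + penalty


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    Y = rng.normal(size=(5, 2))         # toy embedding
    Y_global = rng.normal(size=(5, 2))  # toy stand-in for a PCA embedding
    print(dreams_objective(tsne_loss=1.0, Y=Y, Y_global=Y_global, lam=0.15, alpha=1.0))
```

With `lam = 0` the expression reduces to the plain $t$-SNE loss, and with `lam = 1` only the pull toward the scaled global embedding remains, which matches the spectrum between local and global layouts described above.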
Abstract
Dimensionality reduction techniques are widely used for visualizing high-dimensional data in two dimensions. Existing methods are typically designed to preserve either local (e.g., $t$-SNE, UMAP) or global (e.g., MDS, PCA) structure of the data, but none of the established methods can represent both aspects well. In this paper, we present DREAMS (Dimensionality Reduction Enhanced Across Multiple Scales), a method that combines the local structure preservation of $t$-SNE with the global structure preservation of PCA via a simple regularization term. Our approach generates a spectrum of embeddings between the locally well-structured $t$-SNE embedding and the globally well-structured PCA embedding, efficiently balancing both local and global structure preservation. We benchmark DREAMS across eleven real-world datasets, showcasing qualitatively and quantitatively its superior ability to preserve structure across multiple scales compared to previous approaches.
