Pattern or Artifact? Interactively Exploring Embedding Quality with TRACE

Edith Heiter, Liesbet Martens, Ruth Seurinck, Martin Guilliams, Tijl De Bie, Yvan Saeys, Jefrey Lijffijt

TL;DR

Dimensionality reduction often trades off local against global structure, which can make embeddings misleading to interpret. TRACE provides a scalable, interactive framework to evaluate and compare embedding quality by computing both global structural metrics and local neighborhood preservation, with a precomputation pipeline and a browser-based UI. The approach supports global structure analysis via distance-rank and triplet-based measures, local distortion analysis through neighborhood preservation visuals, and cross-embedding comparisons by tracking point stability, so that analysts can choose a method on evidence. Built on a Python backend and a regl-scatterplot frontend, TRACE scales to large datasets using approximate nearest neighbors and Numba, offering practical insights for choosing suitable dimensionality reduction strategies in real-world analyses.
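
To make the idea of a triplet-based global measure concrete, here is a minimal sketch of a random triplet accuracy: the fraction of sampled triplets (i, j, k) for which the ordering of d(i, j) and d(i, k) is the same in the original space and in the embedding. The function name, its parameters, and the use of Euclidean distance are assumptions made for illustration; this is not TRACE's own code or API.

```python
import numpy as np

def random_triplet_accuracy(X_high, X_low, n_triplets=10_000, seed=0):
    """Hypothetical helper: share of random triplets (i, j, k) whose
    distance ordering d(i, j) < d(i, k) agrees between the
    high-dimensional data and the 2D embedding."""
    rng = np.random.default_rng(seed)
    n = X_high.shape[0]
    # Sample three index arrays at once; shape (3, n_triplets)
    i, j, k = rng.integers(0, n, size=(3, n_triplets))
    # Drop degenerate triplets in which two indices coincide
    keep = (i != j) & (i != k) & (j != k)
    i, j, k = i[keep], j[keep], k[keep]
    # Euclidean distances (an assumption; other metrics are possible)
    d_high_ij = np.linalg.norm(X_high[i] - X_high[j], axis=1)
    d_high_ik = np.linalg.norm(X_high[i] - X_high[k], axis=1)
    d_low_ij = np.linalg.norm(X_low[i] - X_low[j], axis=1)
    d_low_ik = np.linalg.norm(X_low[i] - X_low[k], axis=1)
    agree = (d_high_ij < d_high_ik) == (d_low_ij < d_low_ik)
    return float(np.mean(agree))
```

An embedding that preserves every distance ordering scores 1.0, while an embedding unrelated to the data is expected to score close to 0.5, since each ordering then agrees only by chance.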

Abstract

This paper presents TRACE, a tool to analyze the quality of 2D embeddings generated through dimensionality reduction techniques. Dimensionality reduction methods often prioritize preserving either local neighborhoods or global distances, but insights from visual structures can be misleading if the objective has not been achieved uniformly. TRACE addresses this challenge by providing a scalable and extensible pipeline for computing both local and global quality measures. The interactive browser-based interface allows users to explore various embeddings while visually assessing the pointwise embedding quality. The interface also facilitates in-depth analysis by highlighting high-dimensional nearest neighbors for any group of points and displaying high-dimensional distances between points. By showing how well and where structure is preserved in the reduced space, TRACE enables analysts to make informed decisions about the most suitable dimensionality reduction method for their specific use case.
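
As an illustration of a pointwise local quality measure, the sketch below computes, for every point, the fraction of its k nearest neighbors in the original space that are also among its k nearest neighbors in the embedding. It uses a brute-force exact neighbor search in NumPy for clarity, whereas the source states that TRACE relies on approximate nearest neighbors and Numba for scale. The function names and the default k are hypothetical.

```python
import numpy as np

def knn_indices(X, k):
    """Exact k nearest neighbors per row (brute force, O(n^2) memory).
    Requires k < number of points."""
    sq = np.sum(X ** 2, axis=1)
    # Squared Euclidean distances via ||a||^2 + ||b||^2 - 2 a.b
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbor
    # The first k columns after partitioning hold the k smallest distances
    return np.argpartition(d2, k, axis=1)[:, :k]

def neighborhood_preservation(X_high, X_low, k=10):
    """Hypothetical helper: per-point overlap, in [0, 1], between the
    k-NN sets in the original data and in the embedding."""
    nn_high = knn_indices(X_high, k)
    nn_low = knn_indices(X_low, k)
    return np.array(
        [len(np.intersect1d(a, b)) / k for a, b in zip(nn_high, nn_low)]
    )
```

The returned array has one score per point, so it can be used directly as a color channel for a scatterplot of the embedding: low values mark regions where the local structure seen in 2D should not be trusted.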

Paper Structure

This paper contains 9 sections and 3 figures.

Figures (3)

  • Figure 1: The TRACE user interface showing an embedding where points are colored according to neighborhood preservation.
  • Figure 2: Implementation overview of TRACE where embeddings and quality measures are precomputed before loading them with the backend.
  • Figure 3: The time to compute all quality measures for five embeddings is comparable to running t-SNE (perplexity 30, 800 iterations). Timings were measured on an Intel Xeon Gold 6136 CPU using at most 12 cores.