Uncovering Temporal Patterns in Visualizations of High-Dimensional Data

Pavlin G. Poličar, Blaž Zupan

TL;DR

Temporal data visualization benefits from embeddings that reveal progression, which standard dimensionality reduction (DR) methods often miss. The authors extend t-SNE with two losses, Directional Coherence Loss ($L_{\text{DCL}}$) and Edge Length Loss ($L_{\text{ELL}}$), resulting in a combined objective $L = L_{\text{t-SNE}} + \lambda L_{\text{DCL}} + \mu L_{\text{ELL}}$ that emphasizes temporal trajectories while preserving data topology. Across synthetic and real-world datasets, the approach improves temporal coherence metrics and reveals cyclic and directional patterns, and the authors give guidelines for setting parameters to balance fidelity and readability. The work provides a practical, temporally aware DR framework that can enhance interpretation and communication of dynamic data, along with quantitative evaluation methodologies anchored in both the dimensionality reduction and graph-visualization literature.
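To make the shape of the combined objective concrete, here is a minimal sketch of how a weighted sum of this form could be evaluated for a 2D embedding. This summary does not give the definitions of $L_{\text{DCL}}$ and $L_{\text{ELL}}$, so the two temporal terms below are hypothetical stand-ins rather than the authors' formulas: `toy_directional_term` penalizes sharp turns between consecutive edges, and `toy_edge_length_term` penalizes long edges between consecutive time points. The function names, and the choice to pass the t-SNE loss in as a precomputed number, are likewise assumptions made for illustration.

```python
import math

def consecutive_edges(points):
    """Vectors between temporally consecutive embedded points."""
    return [(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in zip(points, points[1:])]

def toy_edge_length_term(points):
    """ASSUMED stand-in for L_ELL: mean squared length of consecutive edges."""
    edges = consecutive_edges(points)
    return sum(dx * dx + dy * dy for dx, dy in edges) / len(edges)

def toy_directional_term(points, eps=1e-12):
    """ASSUMED stand-in for L_DCL: 1 - mean cosine similarity of adjacent edges.

    Needs at least three points so that two adjacent edges exist.
    """
    edges = consecutive_edges(points)
    pairs = list(zip(edges, edges[1:]))
    total = 0.0
    for (ax, ay), (bx, by) in pairs:
        norm = math.hypot(ax, ay) * math.hypot(bx, by) + eps
        total += (ax * bx + ay * by) / norm
    return 1.0 - total / len(pairs)

def combined_objective(tsne_loss, points, lam, mu):
    """L = L_tSNE + lambda * L_DCL + mu * L_ELL, using the toy terms above."""
    return (tsne_loss
            + lam * toy_directional_term(points)
            + mu * toy_edge_length_term(points))

# Two toy trajectories with identical edge lengths but different directions.
straight = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
zigzag = [(0.0, 0.0), (1.0, 0.0), (0.0, 0.0), (1.0, 0.0)]
print(combined_objective(2.0, straight, lam=0.5, mu=0.1))
print(combined_objective(2.0, zigzag, lam=0.5, mu=0.1))
```

With the same t-SNE term and the same weights, the zigzag trajectory scores higher than the straight one under these toy terms. This illustrates how $\lambda$ controls the price of directional incoherence, while $\mu$ does the same for edge length.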

Abstract

With the increasing availability of high-dimensional data, analysts often rely on exploratory data analysis to understand complex data sets. A key approach to exploring such data is dimensionality reduction, which embeds high-dimensional data in two dimensions to enable visual exploration. However, popular embedding techniques, such as t-SNE and UMAP, typically assume that data points are independent. When this assumption is violated, as in time-series data, the resulting visualizations may fail to reveal important temporal patterns and trends. To address this, we propose a formal extension to existing dimensionality reduction methods that incorporates two temporal loss terms that explicitly highlight temporal progression in the embedded visualizations. Through a series of experiments on both synthetic and real-world datasets, we demonstrate that our approach effectively uncovers temporal patterns and improves the interpretability of the visualizations. Furthermore, the method improves temporal coherence while preserving the fidelity of the embeddings, providing a robust tool for dynamic data analysis.

Paper Structure

This paper contains 20 sections, 12 equations, 15 figures, 1 table.

Figures (15)

  • Figure 1: A simple example of a two-dimensional embedding of a cyclic data set, with arrows connecting consecutive data points. The left panel shows a visualization constructed by embedding the data first and then adding arrows to connect consecutive time points. The right panel shows a visualization constructed with our proposed embedding, which jointly optimizes point positions with respect to both the data and the temporal information. The cyclic structure is clearly visible in the right panel but is not discernible in the left panel.
  • Figure 2: The three-dimensional synthetic data sets used throughout the evaluation.
  • Figure 3: The effects of different DCL strengths $\lambda$ on embedding quality. The metrics in the top row relate to the fidelity of the embedding, while the metrics in the bottom row quantify the visual temporal coherence of the resulting visualizations.
  • Figure 4: The effects of different DCL strengths $\lambda$ on resulting visualizations. The top row shows a standard t-SNE embedding for reference, while the rows below show the effects of increasing $\lambda$. Point colors indicate temporal progression, except in the COIL-20 data set, in which colors correspond to the 20 different classes. The scale of the embedding is indicated with the red line in the bottom left of each panel. As we increase the DCL strength $\lambda$, many inherently synthetic, two-dimensional manifolds become more apparent. For instance, the Cyclic Groups and Swiss Roll data sets suddenly unravel, and the underlying structure is uncovered. However, using larger values of $\lambda$ can overpower the t-SNE loss and lead to nonsensical visualizations, e.g., the bottom panel of the COIL-20 data set.
  • Figure 5: The effects of different DCL scales $s$ on embedding quality.
  • ...and 10 more figures