Feature Space Topology Control via Hopkins Loss

Einari Vaaras, Manu Airaksinen

TL;DR

This work introduces Hopkins loss $L_H$, a differentiable loss based on the Hopkins statistic $H$ that can steer feature space topology toward a user-defined target $H_T$, enabling regularly spaced, randomly spaced, or clustered arrangements of features. By combining $L_H$ with standard classification losses, the authors demonstrate that model performance is largely preserved across speech, text, and image tasks while the feature topology is guided toward the chosen structure. In autoencoder-based dimensionality reduction, topology control via $L_H$ produces larger shifts in $H$ with modest drops in downstream classification accuracy, highlighting its potential for visualization and data compression. The results suggest practical utility for topology-aware representations in applications such as generative modeling, transfer learning, and robustness, with future work focusing on broader architectures, additional distance metrics, and more diverse domains.

Abstract

Feature space topology refers to the organization of samples within the feature space. Modifying this topology can be beneficial in machine learning applications, including dimensionality reduction, generative modeling, transfer learning, and robustness to adversarial attacks. This paper introduces a novel loss function, Hopkins loss, which leverages the Hopkins statistic to enforce a desired feature space topology. This contrasts with existing topology-related methods, which aim to preserve the topology of the input features. We evaluate the effectiveness of Hopkins loss on speech, text, and image data in two scenarios: classification and dimensionality reduction using nonlinear bottleneck autoencoders. Our experiments show that integrating Hopkins loss into classification or dimensionality reduction has only a small impact on classification performance while providing the benefit of modifying feature topology.
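
For orientation, the Hopkins statistic is commonly written as below. This is the standard textbook form, not a formula quoted from the paper: the sample count $m$, the distances $u_i$ and $w_i$, the exponent $d$ (feature dimensionality), and the absolute-difference form of the loss are all assumptions made for illustration, and the paper's exact variant may differ.

$$
H = \frac{\sum_{i=1}^{m} u_i^{d}}{\sum_{i=1}^{m} u_i^{d} + \sum_{i=1}^{m} w_i^{d}},
\qquad
L_H = \lvert H - H_T \rvert
$$

Here $u_i$ is the distance from the $i$-th uniformly drawn random point to its nearest feature vector, and $w_i$ is the distance from the $i$-th sampled feature vector to its nearest neighboring feature vector. Under this convention, clustered features give small $w_i$ and therefore $H$ close to 1, randomly spaced features give $H \approx 0.5$, and regularly spaced features give $H$ below 0.5.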

Paper Structure

This paper contains 11 sections, 3 equations, 4 figures, 1 table.

Figures (4)

  • Figure 1: An example 2D visualization of regularly-spaced (left, $H \in [0.01,0.3]$), randomly-spaced (middle, $H \approx 0.5$), and clustered data (right, $H \in [0.7,0.99]$) using the distance metric $D = \text{Chebyshev}$.
  • Figure 2: The results of the classification experiments. The Hopkins loss target values $H_\text{T} = 0.01$, $H_\text{T} = 0.5$, and $H_\text{T} = 0.99$ aim at learning regularly-spaced, randomly-spaced, and clustered features, respectively. Statistically significant differences from the baseline, determined using the Mann-Whitney U test (Mann and Whitney, 1947), are marked with $*$ ($p < 0.05$), $**$ ($p < 0.01$), or $***$ ($p < 0.001$). Results significantly higher than the baseline are marked in blue, and results significantly lower in red. The number below each box plot is the mean $H$ value ($\pm$ 95% confidence interval).
  • Figure 3: The results of the AE experiments (left: bottleneck feature dimensionality $B=32$, middle: $B=8$, right: $B=2$). The Hopkins loss target values $H_\text{T} = 0.01$, $H_\text{T} = 0.5$, and $H_\text{T} = 0.99$ aim at learning regularly-spaced, randomly-spaced, and clustered features, respectively. Statistically significant differences from the baseline, determined using the Mann-Whitney U test (Mann and Whitney, 1947), are marked with $*$ ($p < 0.05$), $**$ ($p < 0.01$), or $***$ ($p < 0.001$). Results significantly higher than the baseline are marked in blue, and results significantly lower in red. The number below each box plot is the mean $H$ value ($\pm$ 95% confidence interval).
  • Figure 4: Example of a 0.10 difference in $H$ value when bottleneck feature dimensionality $B = 2$ for the RAVDESS dataset. The bottleneck features of a randomly initialized model (left, $H=0.89$), a trained model with $L_\text{H}$ and $H_\text{T}=0.5$ (middle, $H=0.79$), and a trained model with $L_\text{H}$ and $H_\text{T}=0.99$ (right, $H=0.99$) are shown.
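
The three regimes described in Figure 1 can be reproduced numerically with a small sketch. The code below is an illustration under stated assumptions, not the authors' implementation: it uses the standard Hopkins statistic with nearest-neighbor distances raised to the data dimensionality, the Chebyshev distance mentioned in Figure 1, and a hypothetical `hopkins_loss_value` helper that assumes the loss is the absolute gap to the target $H_\text{T}$. It is plain Python and only evaluates values; in training, the same computation would have to run on tensors in an automatic-differentiation framework.

```python
import random


def chebyshev(a, b):
    """Chebyshev (L-infinity) distance between two equal-length points."""
    return max(abs(x - y) for x, y in zip(a, b))


def hopkins_statistic(points, m, rng, dist=chebyshev):
    """Estimate the Hopkins statistic H of `points` using m probes.

    Assumed convention: H near 1 means clustered, H near 0.5 means
    randomly spaced, and H below 0.5 means regularly spaced.
    """
    dim = len(points[0])
    # Bounding box of the data, used to draw uniform probe points.
    lows = [min(p[k] for p in points) for k in range(dim)]
    highs = [max(p[k] for p in points) for k in range(dim)]

    # w: distance from m sampled data points to their nearest OTHER point.
    sampled = rng.sample(range(len(points)), m)
    w = [min(dist(points[i], q) for j, q in enumerate(points) if j != i)
         for i in sampled]

    # u: distance from m uniform probe points to their nearest data point.
    probes = [[rng.uniform(lo, hi) for lo, hi in zip(lows, highs)]
              for _ in range(m)]
    u = [min(dist(p, q) for q in points) for p in probes]

    su = sum(x ** dim for x in u)
    sw = sum(x ** dim for x in w)
    return su / (su + sw)


def hopkins_loss_value(h, h_target):
    """Hypothetical loss form (an assumption): |H - H_T|."""
    return abs(h - h_target)


rng = random.Random(0)

# Regularly spaced: a 15 x 15 grid on the unit square.
n_side = 15
grid = [[i / (n_side - 1), j / (n_side - 1)]
        for i in range(n_side) for j in range(n_side)]

# Randomly spaced: 225 points drawn uniformly from the unit square.
uniform = [[rng.random(), rng.random()] for _ in range(225)]

# Clustered: three tight Gaussian blobs of 75 points each.
centers = [(0.2, 0.2), (0.8, 0.3), (0.5, 0.8)]
clustered = [[rng.gauss(cx, 0.02), rng.gauss(cy, 0.02)]
             for cx, cy in centers for _ in range(75)]

h_grid = hopkins_statistic(grid, 40, rng)
h_uniform = hopkins_statistic(uniform, 40, rng)
h_clustered = hopkins_statistic(clustered, 40, rng)

for name, h in [("grid", h_grid), ("uniform", h_uniform),
                ("clustered", h_clustered)]:
    print(f"{name:10s} H = {h:.3f}  |H - 0.5| = {hopkins_loss_value(h, 0.5):.3f}")
```

The probes are drawn from the bounding box of the data, which is a simple choice that introduces mild edge effects. The brute-force nearest-neighbor search is quadratic in the number of points and is only meant for small illustrative datasets.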