Squintability and Other Metrics for Assessing Projection Pursuit Indexes, and Guiding Optimization Choices

H. Sherry Zhang, Dianne Cook, Nicolas Langrené, Jessica Wai Yin Leung

TL;DR

The paper addresses the optimization of projection pursuit (PP) indexes in the PP guided tour, which is difficult when the index is a non-smooth function or when the optimum has a small squint angle. It defines two quantitative index properties: smoothness, captured via a Gaussian process with a Matérn covariance and summarized by the smoothness parameter $\nu$, and squintability, defined as a normalized gain along projection distance using a parametric logistic model. These properties are used to analyze optimizer performance. The recently introduced Jellyfish Search Optimizer (JSO) is evaluated against existing PP optimizers such as Creeping Random Search (CRS) across multiple indexes, data dimensions $d$, and hyper-parameters. Empirically, higher squintability and more jellyfish improve the success rate, while smoothness has no significant effect. JSO is implemented in the R package `tourr`, and the smoothness and squintability measures are implemented in `ferrn`, giving a practical toolkit for choosing PP indexes and optimizers and for computing these properties for new indexes.

Abstract

The projection pursuit (PP) guided tour optimizes a criterion function, known as the PP index, to gradually reveal projections of interest from high-dimensional data through animation. Optimization of some PP indexes can be non-trivial, if they are non-smooth functions, or when the optimum has a small "squint angle", detectable only from close proximity. Here, measures for calculating the smoothness and squintability properties of the PP index are defined. These are used to investigate the performance of a recently introduced swarm-based algorithm, Jellyfish Search Optimizer (JSO), for optimizing PP indexes. The performance of JSO in detecting the target pattern (pipe shape) is compared with existing optimizers in PP. Additionally, JSO's performance on detecting the sine-wave shape is evaluated using different PP indexes (hence different smoothness and squintability) across various data dimensions (d = 4, 6, 8, 10, 12) and JSO hyper-parameters. We observe empirically that higher squintability improves the success rate of the PP index optimization, while smoothness has no significant effect. The JSO algorithm has been implemented in the R package, `tourr`, and functions to calculate smoothness and squintability measures are implemented in the `ferrn` package.

Paper Structure

This paper contains 20 sections, 8 equations, 10 figures, 2 tables.

Figures (10)

  • Figure 1: Examples of PP indexes with large (top row) and small (bottom row) squint angles, shown with a Huber plot, and histogram of the projected data corresponding to the optimal projection. A Huber plot shows the PP index values for all 1D data projections in polar coordinates. The skewness index on the trimodal data is also smoother than the binned normality index on RANDU.
  • Figure 2: Five random simulations from a Gaussian Process defined on $\mathbb{R}$ with zero mean and Matérn-$\nu$ covariance function, with $\nu=1$ (left), $\nu=2$ (middle), and $\nu=4$ (right), showing that higher values of $\nu$ produce smoother curves.
  • Figure 3: One random simulation from a Gaussian Process defined on $\mathbb{R}^2$ with zero mean and Matérn-$\nu$ covariance function, with $\nu=1$ (left), $\nu=2$ (middle), and $\nu=4$ (right), showing that higher values of $\nu$ produce smoother surfaces.
  • Figure 4: Simulated traces of index value as a function of projection distance for four PP index functions: MIC, splines2d, skinny, and stringy2. The index functions MIC and splines2d make early progress towards the optimal index value, indicating a large squint angle, whereas the traces from skinny, and especially stringy2, show improvement only near the optimal projection, suggesting low squintability.
  • Figure 5: How success rate is calculated, illustrated using the optimal projections from 50 optimizations of 8D pipe data, optimized by CRS, sorted by index value. The pipe shape is recognizable in projections with index values between 0.933 and 0.969. Of the 50 simulations, 43 achieved an index value within 0.05 of the best, resulting in a success rate of 43/50 = 0.86.
  • ...and 5 more figures
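
The success-rate calculation described in the Figure 5 caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation (which lives in the R packages): the function name `success_rate`, the `tolerance` argument, and the reading of "the best" as the best value observed across the runs are assumptions, and the input values are synthetic.

```python
def success_rate(index_values, tolerance=0.05):
    """Fraction of runs whose final index value lies within
    `tolerance` of the best value observed across all runs.

    Assumption: "best" means the maximum over the supplied runs,
    not a theoretical optimum of the index.
    """
    if not index_values:
        raise ValueError("need at least one optimization run")
    best = max(index_values)
    hits = sum(1 for value in index_values if best - value <= tolerance)
    return hits / len(index_values)


# Synthetic values for illustration only (not the paper's data):
# 43 runs finish near the best value, 7 runs stall well below it.
runs = [0.969] + [0.94] * 42 + [0.60] * 7
print(success_rate(runs))  # 43 of 50 runs are within 0.05 of the best
```

With 43 of 50 runs inside the tolerance band, the function returns the same ratio, 0.86, that the caption reports.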

Theorems & Definitions (6)

  • Definition 1: Sobolev space
  • Definition 2: Matérn covariance function
  • Definition 3
  • Definition 4: squint angle
  • Definition 5: projection distance
  • Definition 6: squintability
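
For orientation on Definition 2, one commonly used parametrization of the Matérn covariance is sketched below. The symbols $\sigma^2$ (variance), $\ell$ (length scale), and $r$ (distance between two inputs) are assumptions introduced here, and the paper's own definition may use a different scaling; $K_\nu$ denotes the modified Bessel function of the second kind.

$$
k_\nu(r) = \sigma^2 \, \frac{2^{1-\nu}}{\Gamma(\nu)} \left( \frac{\sqrt{2\nu}\, r}{\ell} \right)^{\nu} K_\nu\!\left( \frac{\sqrt{2\nu}\, r}{\ell} \right)
$$

Under this parametrization, $\nu = 1/2$ gives the exponential kernel $\sigma^2 e^{-r/\ell}$, and as $\nu \to \infty$ the kernel approaches the squared-exponential form $\sigma^2 e^{-r^2/(2\ell^2)}$, which is consistent with larger $\nu$ producing smoother sample paths, as in Figures 2 and 3.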