Learning Locally Adaptive Metrics that Enhance Structural Representation with $\texttt{LAMINAR}$

Christian Kleiber, William H. Oliver, Tobias Buck

TL;DR

$\texttt{LAMINAR}$ is a novel unsupervised machine learning pipeline that enhances the representation of structure within data by learning a locally adaptive metric, which yields structurally informative, density-based distances.

Abstract

We present $\texttt{LAMINAR}$, a novel unsupervised machine learning pipeline designed to enhance the representation of structure within data by producing a more informative distance metric. Analysis methods in the physical sciences often rely on standard metrics to define geometric relationships in data, and these metrics may fail to capture the underlying structure of complex data sets. $\texttt{LAMINAR}$ addresses this by using a continuous normalising flow and inverse transform sampling to define a Riemannian manifold in the data space, without requiring the user to specify a metric over the data a priori. The result is a locally adaptive metric that produces structurally informative, density-based distances. We demonstrate the utility of $\texttt{LAMINAR}$ by comparing its output to the Euclidean metric for structured data sets.
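
To make the idea of a locally adaptive Riemannian metric concrete, the following is a minimal sketch and not the $\texttt{LAMINAR}$ implementation. It assumes that a smooth invertible map `f` from data space to a latent space is already available (in the paper this role is played by the learned flow; here a hand-written `toy_map` stands in for it) and uses the standard pullback construction $G(x) = J(x)^\top J(x)$, where $J$ is the Jacobian of `f`. The function names, the finite-difference Jacobian, and the midpoint rule for path length are all illustrative assumptions.

```python
import numpy as np


def jacobian(f, x, eps=1e-6):
    """Central-difference Jacobian of f: R^d -> R^d evaluated at x."""
    x = np.asarray(x, dtype=float)
    d = x.size
    J = np.empty((d, d))
    for j in range(d):
        step = np.zeros(d)
        step[j] = eps
        J[:, j] = (f(x + step) - f(x - step)) / (2.0 * eps)
    return J


def pullback_metric(f, x):
    """Metric tensor G(x) = J^T J induced in data space by the map f."""
    J = jacobian(f, x)
    return J.T @ J


def curve_length(f, points):
    """Approximate length of a polyline under the pullback metric.

    Each segment contributes sqrt(delta^T G delta), with G evaluated
    at the segment midpoint.
    """
    points = np.asarray(points, dtype=float)
    total = 0.0
    for a, b in zip(points[:-1], points[1:]):
        delta = b - a
        G = pullback_metric(f, 0.5 * (a + b))
        total += float(np.sqrt(delta @ G @ delta))
    return total


def toy_map(x):
    """Hypothetical stand-in for a learned map: stretches the first axis by 3."""
    return np.array([3.0 * x[0], x[1]])


# Two straight paths of equal Euclidean length but different metric length.
along_x = np.linspace([0.0, 0.0], [1.0, 0.0], 11)
along_y = np.linspace([0.0, 0.0], [0.0, 1.0], 11)
length_x = curve_length(toy_map, along_x)
length_y = curve_length(toy_map, along_y)
```

Under this toy map the two paths have the same Euclidean length, yet the path along the stretched axis is longer under the induced metric. That is the sense in which such a metric adapts distances to local structure.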

Paper Structure

This paper contains 8 sections, 5 equations, 6 figures.

Figures (6)

  • Figure 1: The original (uniform) distribution and the transformations applied to it.
  • Figure 2: Comparison of ground-truth and LAMINAR metric tensors produced using the data in Fig. 1.
  • Figure 3: A reference colour wheel for the visualisation of the metric tensor in Fig. 2. To assign a colour to a data point, we first create an ellipse by transforming a circle with that point's metric tensor. The colour (angle) assigned is given by the orientation of this ellipse, i.e. red (blue) if its major axis aligns horizontally (vertically). The saturation (radius) of this colour is determined by the ratio between the lengths of the major and minor axes, so that a more circular ellipse is lighter in colour. Visualising the metric in this way shows, for each data point, the direction in which the distance function increases most and the degree to which it does so.
  • Figure 4: The distance distribution from a query point (red cross) on four toy data sets found with the LAMINAR (top) and Euclidean (bottom) metrics. The points are coloured according to the viridis colour map. The grey-scale contours show an estimate for the out-of-distribution distances.
  • Figure 5: A comparison of distances from query points (marked with a green cross) produced via the LAMINAR and Euclidean metrics (as shown in Fig. 4). Here blue points mark those with smaller LAMINAR distances compared to their Euclidean counterparts, and vice versa for red points.
  • ...and 1 more figure
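
The colouring rule described in the Figure 3 caption can be sketched as follows. This is an illustrative reading of the caption, not the authors' plotting code. It assumes a symmetric positive-definite 2×2 metric tensor, takes the ellipse to be the image of the unit circle under that tensor (so the semi-axis lengths are its eigenvalues), and defines the saturation as `1 - minor / major`. The function name and the saturation formula are assumptions. Mapping the returned angle onto the paper's specific colour wheel is left out, since the exact wheel is not given here.

```python
import numpy as np


def ellipse_orientation_and_saturation(G):
    """Return (theta, saturation) for a symmetric positive-definite 2x2 tensor G.

    theta      : orientation of the major axis in [0, pi), measured from the
                 horizontal axis (0 = horizontal, pi/2 = vertical).
    saturation : 1 - minor/major, so a circle gives 0 and a strongly
                 elongated ellipse approaches 1 (assumed definition).
    """
    G = np.asarray(G, dtype=float)
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    evals, evecs = np.linalg.eigh(G)
    minor, major = evals[0], evals[1]
    v = evecs[:, 1]  # direction of the major axis
    theta = np.arctan2(v[1], v[0]) % np.pi  # orientation is defined modulo pi
    saturation = 1.0 - minor / major
    return theta, saturation


# Horizontally elongated, vertically elongated, and isotropic examples.
theta_h, sat_h = ellipse_orientation_and_saturation([[4.0, 0.0], [0.0, 1.0]])
theta_v, sat_v = ellipse_orientation_and_saturation([[1.0, 0.0], [0.0, 4.0]])
theta_i, sat_i = ellipse_orientation_and_saturation(np.eye(2))
```

Following the caption, a horizontal major axis would be drawn in red and a vertical one in blue, with the isotropic case (saturation 0) appearing lightest.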