Table of Contents
Fetching ...

LORE: Jointly Learning the Intrinsic Dimensionality and Relative Similarity Structure From Ordinal Data

Vivek Anand, Alec Helbling, Mark Davenport, Gordon Berman, Sankar Alagapan, Christopher Rozell

TL;DR

LORE addresses the problem of learning subjective perceptual spaces from ordinal triplets by jointly inferring the embedding and its intrinsic dimensionality. It couples a smoothed triplet loss with a nonconvex Schatten-$p$ quasi-norm regularizer, optimized via an iteratively reweighted scheme that provably converges to a stationary point. Across synthetic, simulated, and real crowdsourced data, LORE achieves high triplet accuracy while recovering a low, interpretable intrinsic rank, outperforming baselines in rank recovery and producing semantically meaningful axes. This approach enables scalable, data-efficient perceptual modeling with interpretable latent structure for psychophysics and related domains.

Abstract

Learning the intrinsic dimensionality of subjective perceptual spaces such as taste, smell, or aesthetics from ordinal data is a challenging problem. We introduce LORE (Low Rank Ordinal Embedding), a scalable framework that jointly learns both the intrinsic dimensionality and an ordinal embedding from noisy triplet comparisons of the form, "Is A more similar to B than C?". Unlike existing methods that require the embedding dimension to be set apriori, LORE regularizes the solution using the nonconvex Schatten-$p$ quasi norm, enabling automatic joint recovery of both the ordinal embedding and its dimensionality. We optimize this joint objective via an iteratively reweighted algorithm and establish convergence guarantees. Extensive experiments on synthetic datasets, simulated perceptual spaces, and real world crowdsourced ordinal judgements show that LORE learns compact, interpretable and highly accurate low dimensional embeddings that recover the latent geometry of subjective percepts. By simultaneously inferring both the intrinsic dimensionality and ordinal embeddings, LORE enables more interpretable and data efficient perceptual modeling in psychophysics and opens new directions for scalable discovery of low dimensional structure from ordinal data in machine learning.

LORE: Jointly Learning the Intrinsic Dimensionality and Relative Similarity Structure From Ordinal Data

TL;DR

LORE addresses the problem of learning subjective perceptual spaces from ordinal triplets by jointly inferring the embedding and its intrinsic dimensionality. It couples a smoothed triplet loss with a nonconvex Schatten- quasi-norm regularizer, optimized via an iteratively reweighted scheme that provably converges to a stationary point. Across synthetic, simulated, and real crowdsourced data, LORE achieves high triplet accuracy while recovering a low, interpretable intrinsic rank, outperforming baselines in rank recovery and producing semantically meaningful axes. This approach enables scalable, data-efficient perceptual modeling with interpretable latent structure for psychophysics and related domains.

Abstract

Learning the intrinsic dimensionality of subjective perceptual spaces such as taste, smell, or aesthetics from ordinal data is a challenging problem. We introduce LORE (Low Rank Ordinal Embedding), a scalable framework that jointly learns both the intrinsic dimensionality and an ordinal embedding from noisy triplet comparisons of the form, "Is A more similar to B than C?". Unlike existing methods that require the embedding dimension to be set apriori, LORE regularizes the solution using the nonconvex Schatten- quasi norm, enabling automatic joint recovery of both the ordinal embedding and its dimensionality. We optimize this joint objective via an iteratively reweighted algorithm and establish convergence guarantees. Extensive experiments on synthetic datasets, simulated perceptual spaces, and real world crowdsourced ordinal judgements show that LORE learns compact, interpretable and highly accurate low dimensional embeddings that recover the latent geometry of subjective percepts. By simultaneously inferring both the intrinsic dimensionality and ordinal embeddings, LORE enables more interpretable and data efficient perceptual modeling in psychophysics and opens new directions for scalable discovery of low dimensional structure from ordinal data in machine learning.
Paper Structure (29 sections, 1 theorem, 12 equations, 17 figures, 4 tables, 1 algorithm)

This paper contains 29 sections, 1 theorem, 12 equations, 17 figures, 4 tables, 1 algorithm.

Key Result

Theorem 1

The sequence of OEs generated by the LORE algorithm $\{\mathbf{Z}^k\}_{k=1,2,3,\dots}$ converges. i.e.

Figures (17)

  • Figure 1: LORE jointly learns both the intrinsic dimensionality and relative similarities by balancing dimensionality with similarity constraints: Other methods require the embedding dimension to be chosen in advance, making them less data driven and often suboptimal.
  • Figure 2: LORE has high test triplet accuracy and intrinsic rank recovery across varying number of queries. (Left) Mean test triplet accuracy vs $\lambda$ for LORE as Fraction of Queries varies. (Right) Mean measured rank vs $\lambda$ for LORE as Fraction of Queries varies.
  • Figure 3: Only LORE can recover the intrinsic rank while maintaining comparable test triplet accuracy as number of queries varies: (Left) Mean test triplet accuracy vs fraction of queries used. (Center) Mean measured rank vs fraction of queries used. (Right) Mean Measured Rank vs Intrinsic Rank. The gray dotted line indicates the ideal case where the measured rank is equal to the intrinsic rank. Shaded Areas indicate $\pm 2$ Standard Deviations.
  • Figure 4: LORE outperforms baselines for both test triplet accuracy and intrinsic rank for a simulated LLM perceptual experiment. (Left) Mean test triplet accuracy vs intrinsic rank. (Center) Mean measured rank vs intrinsic rank. The gray dotted line is the ideal case where the measured rank is equal to the intrinsic rank. (Right) Time taken for processing vs intrinsic rank. Shaded Areas indicate $\pm 2$ Standard Deviations.
  • Figure 5: LORE's learned axes are semantically interpretable: Food groups as axis value varies for the first three learned axes of the LORE embedding learned on the Food-100 dataset. (Same embedding as one learned for Table \ref{['table:real_datasets']}).
  • ...and 12 more figures

Theorems & Definitions (2)

  • Theorem : LORE converges to a stationary point
  • proof