Beyond Cosine Similarity

Xinbo Ai

TL;DR

This paper reexamines semantic similarity beyond the conventional cosine metric by deriving a tighter dot-product bound via the Rearrangement Inequality and introducing recos, a similarity measure that normalizes by the sorted–and–reordered dot product $|\mathbf{u}^{\uparrow} \cdot \mathbf{v}^{\updownarrow}|$, thereby capturing ordinal concordance in addition to angular information. The authors formalize a hierarchy of bounds, define three metrics (recos, cos, decos) with distinct saturation conditions, and prove that recos has the broadest capture range. Empirically, recos consistently improves correlation with human judgments on seven STS benchmarks across 11 embedding models, with statistically robust gains, especially for complex and universal embeddings like CLIP-ViT, DPR, and SPECTER. The work demonstrates that ordinal patterns across embedding dimensions carry meaningful semantic signals and can complement traditional angular metrics, offering a principled alternative with practical retrieval implications while noting computational overhead and avenues for scalable approximations.

Abstract

Cosine similarity, the standard metric for measuring semantic similarity in vector spaces, is mathematically grounded in the Cauchy-Schwarz inequality, which inherently limits it to capturing linear relationships--a constraint that fails to model the complex, nonlinear structures of real-world semantic spaces. We advance this theoretical underpinning by deriving a tighter upper bound for the dot product than the classical Cauchy-Schwarz bound. This new bound leads directly to recos, a similarity metric that normalizes the dot product by the sorted vector components. recos relaxes the condition for perfect similarity from strict linear dependence to ordinal concordance, thereby capturing a broader class of relationships. Extensive experiments across 11 embedding models--spanning static, contextualized, and universal types--demonstrate that recos consistently outperforms traditional cosine similarity, achieving higher correlation with human judgments on standard Semantic Textual Similarity (STS) benchmarks. Our work establishes recos as a mathematically principled and empirically superior alternative, offering enhanced accuracy for semantic analysis in complex embedding spaces.
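The normalization described in the abstract can be illustrated with a small NumPy sketch. It is not the paper's reference implementation. It assumes that the normalizer $|\mathbf{u}^{\uparrow} \cdot \mathbf{v}^{\updownarrow}|$ pairs the ascending-sorted components of both vectors when the dot product is non-negative and pairs ascending with descending otherwise, which is the reading suggested by the Rearrangement Inequality. The function names `cos_sim` and `recos` and the zero-denominator convention are illustrative choices.

```python
import numpy as np


def cos_sim(u, v):
    """Classical cosine similarity: dot product over the product of norms."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


def recos(u, v):
    """Sketch of recos: dot product over the rearrangement bound.

    Assumption (not confirmed by this section): the bound is
    u_sorted . v_sorted when u . v >= 0, and
    u_sorted . reversed(v_sorted) when u . v < 0.
    """
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    dot = float(u @ v)
    u_up = np.sort(u)          # u in ascending order
    v_up = np.sort(v)          # v in ascending order
    if dot >= 0:
        bound = float(u_up @ v_up)        # same-order pairing maximizes the dot product
    else:
        bound = float(u_up @ v_up[::-1])  # opposite-order pairing minimizes it
    if bound == 0.0:
        return 0.0             # illustrative convention for a degenerate bound
    return dot / abs(bound)


# Components are ordered identically, but the relationship is nonlinear.
u = [1.0, 2.0, 3.0]
v = [1.0, 4.0, 9.0]
print(recos(u, v))    # 1.0: ordinal concordance saturates recos
print(cos_sim(u, v))  # about 0.972: the vectors are not linearly dependent
```

In this toy pair the components of `v` are a monotone but nonlinear function of those of `u`, so recos reaches 1 while cosine similarity stays below 1. The extra cost relative to cosine is the two sorts, which is consistent with the computational overhead noted in the TL;DR.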

Paper Structure (34 sections, 4 theorems, 11 equations, 2 figures, 9 tables)

Key Result

Theorem 1

For vectors $\mathbf{u}$ and $\mathbf{v}$ we have and the equality conditions are as follows:

Figures (2)

  • Figure 1: Similarity metrics derived from different bounds and their effective capture ranges. A tighter bound (smaller denominator) leads to a more permissive metric with a wider capture range.
  • Figure 2: Performance gains of $\mathrm{recos}$ over $\mathrm{cos}$ and $\mathrm{cos}$ over $\mathrm{decos}$
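
The remark in the Figure 1 caption that a smaller denominator gives a more permissive metric can be sketched for the two bounds named in the abstract. The sketch assumes nonzero denominators, and it assumes that $\mathbf{u}^{\uparrow}$ denotes $\mathbf{u}$ sorted in ascending order and that $\mathbf{v}^{\updownarrow}$ denotes $\mathbf{v}$ sorted in whichever direction makes the bound hold. This reading of the notation is an assumption. decos is left out because its normalizer is not stated in this summary.

```latex
% Two-link chain: Rearrangement Inequality, then Cauchy-Schwarz on the
% sorted vectors (sorting permutes components and so preserves norms).
\[
  |\mathbf{u}\cdot\mathbf{v}|
  \;\le\; |\mathbf{u}^{\uparrow}\cdot\mathbf{v}^{\updownarrow}|
  \;\le\; \|\mathbf{u}\|\,\|\mathbf{v}\|
\]
% Dividing |u . v| by the smaller (tighter) bound yields the larger ratio.
\[
  |\cos(\mathbf{u},\mathbf{v})|
  = \frac{|\mathbf{u}\cdot\mathbf{v}|}{\|\mathbf{u}\|\,\|\mathbf{v}\|}
  \;\le\;
  \frac{|\mathbf{u}\cdot\mathbf{v}|}{|\mathbf{u}^{\uparrow}\cdot\mathbf{v}^{\updownarrow}|}
  = |\mathrm{recos}(\mathbf{u},\mathbf{v})|
  \;\le\; 1
\]
% Worked instance: u = (1,2,3), v = (1,4,9).
\[
  \mathbf{u}\cdot\mathbf{v} = 36, \qquad
  \mathbf{u}^{\uparrow}\cdot\mathbf{v}^{\uparrow} = 36, \qquad
  \|\mathbf{u}\|\,\|\mathbf{v}\| = \sqrt{14}\,\sqrt{98} \approx 37.04
\]
\[
  \mathrm{recos}(\mathbf{u},\mathbf{v}) = 1, \qquad
  \cos(\mathbf{u},\mathbf{v}) \approx 0.972
\]
```

Under these assumptions, every pair that saturates cosine also saturates recos, while pairs that are only ordinally concordant saturate recos alone, which matches the wider capture range described in the caption.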

Theorems & Definitions (15)

  • Definition 1: Similar Vectors
  • Definition 2: Discordant Vectors
  • Definition 3: Vector Ordering
  • Theorem 1: Chain of Inequalities
  • Definition 4: $\mathrm{recos}$
  • Definition 5: $\cos$
  • Definition 6: $\mathrm{decos}$
  • Definition 7: Tanimoto Similarity
  • Corollary 1: Bounds and Saturation Conditions
  • Corollary 2: Metric Hierarchy
  • ...and 5 more