Taxes Are All You Need: Integration of Taxonomical Hierarchy Relationships into the Contrastive Loss

Kiran Kokilepersaud, Yavuz Yarici, Mohit Prabhushankar, Ghassan AlRegib

TL;DR

The paper addresses the limitation of conventional supervised contrastive learning by incorporating taxonomic hierarchy into the loss function. It introduces TaxCL, which splits negatives into taxonomy-based and regular sets and applies a weighting scheme to emphasize taxonomic negatives, with optional combination with the standard SupCon loss. Experiments on CIFAR-100, OLIVES, and Cure-OR show TaxCL and especially the combined loss outperform baselines, with notable improvements in noisy or medically relevant data. The analysis links representation-space structure, including dimensional collapse and cosine similarity patterns, to the benefits of taxonomy-aware regularization, underscoring the method's generality across domains.

Abstract

In this work, we propose a novel supervised contrastive loss that enables the integration of taxonomic hierarchy information during the representation learning process. A supervised contrastive loss operates by enforcing that images with the same class label (positive samples) project closer to each other than images with differing class labels (negative samples). The advantage of this approach is that it directly penalizes the structure of the representation space itself. This enables greater flexibility with respect to encoding semantic concepts. However, the standard supervised contrastive loss only enforces semantic structure based on the downstream task (i.e., the class label). In reality, the class label is only one level of a hierarchy of different semantic relationships known as a taxonomy. For example, the class label is oftentimes the species of an animal, but between different classes there are higher-order relationships, such as all animals with wings being "birds". We show that by explicitly accounting for these relationships with a weighting penalty in the contrastive loss, we can outperform the supervised contrastive loss. Additionally, we demonstrate the adaptability of the notion of a taxonomy by integrating our loss into medical and noise-based settings that show performance improvements by as much as 7%.
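
The weighting idea described above can be illustrated with a minimal NumPy sketch. This is not the paper's exact formulation: the function name, the `tax_weight` and `temperature` parameters, and the choice to scale taxonomic negatives inside the softmax denominator are all assumptions made for illustration. Taxonomic negatives are taken to be samples that share the anchor's superclass but not its class label.

```python
import numpy as np

def taxonomy_weighted_contrastive_loss(z, labels, superlabels,
                                       tax_weight=2.0, temperature=0.1):
    """Illustrative supervised contrastive loss with up-weighted taxonomic negatives.

    Assumption: taxonomic negatives are scaled by `tax_weight` in the softmax
    denominator. With tax_weight=1.0 this reduces to a SupCon-style loss.
    """
    z = np.asarray(z, dtype=float)
    labels = np.asarray(labels)
    superlabels = np.asarray(superlabels)

    # L2-normalize so that dot products are cosine similarities.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = z @ z.T / temperature

    same_class = labels[:, None] == labels[None, :]
    same_super = superlabels[:, None] == superlabels[None, :]
    positives = same_class.copy()
    np.fill_diagonal(positives, False)   # an anchor is not its own positive
    tax_negatives = same_super & ~same_class

    # Denominator weights: taxonomic negatives count more, the anchor itself is excluded.
    weights = np.where(tax_negatives, tax_weight, 1.0)
    np.fill_diagonal(weights, 0.0)

    # Numerically stable weighted log-sum-exp over each row.
    row_max = sim.max(axis=1, keepdims=True)
    log_denominator = np.log((weights * np.exp(sim - row_max)).sum(axis=1, keepdims=True)) + row_max
    log_prob = sim - log_denominator

    # Average the log-probability over positives, skipping anchors without any.
    pos_count = positives.sum(axis=1)
    valid = pos_count > 0
    per_anchor = -(log_prob * positives).sum(axis=1)[valid] / pos_count[valid]
    return per_anchor.mean()
```

Raising `tax_weight` above 1 increases the penalty whenever an anchor sits close to a sample from the same superclass but a different class, which is the behavior the abstract attributes to the weighting penalty.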

Paper Structure

This paper contains 10 sections, 4 equations, 9 figures, 2 tables.

Figures (9)

  • Figure 1: Every dataset has semantic dependencies that exist beyond just the basic task label. Examples are shown for the OLIVES [prabhushankar2022olives] and Cure-OR [temel2017cure] datasets and for natural images.
  • Figure 2: a) When contrasting against negatives within the same taxonomy, the model is forced to learn more fine-grained features in order to distinguish the two. b) Traditional approaches treat all negatives equally and do not consider the importance of the taxonomic negatives. We address this through additional importance weighting on taxonomic negatives.
  • Figure 3: The SVD spectrum of the covariance matrix for subsets of the CIFAR-100 test set, computed from a model trained with SupCon [khosla2020supervised]. We show the spectrum for the whole test set, for a subset of images belonging to the vehicle superclass, and for an equally sized random assortment of images.
  • Figure 4: The average pairwise cosine similarity between the anchor image and its taxonomic negatives, and between the anchor image and its regular negatives.
  • Figure 5: For each anchor image from CIFAR-100, we retrieve the images with the highest cosine similarity in the same batch. We show the taxonomic grouping label for each image.
  • ...and 4 more figures
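
The two diagnostics named in the captions for Figures 3 and 4 can be sketched as follows. This is a hedged illustration, not the authors' analysis code: the function names and the definition of a taxonomic negative (same superclass, different class) are assumptions, and the embeddings are expected to come from whatever encoder is being studied.

```python
import numpy as np

def covariance_spectrum(embeddings):
    """Singular values (descending) of the embedding covariance matrix.

    A spectrum that drops to near zero after a few components suggests
    dimensional collapse in the representation space.
    """
    x = np.asarray(embeddings, dtype=float)
    centered = x - x.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / (len(x) - 1)
    return np.linalg.svd(cov, compute_uv=False)

def mean_negative_similarity(embeddings, labels, superlabels):
    """Mean cosine similarity to taxonomic negatives and to regular negatives.

    Assumption: a taxonomic negative shares the anchor's superclass but not
    its class; a regular negative shares neither.
    """
    z = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)
    superlabels = np.asarray(superlabels)

    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = z @ z.T

    different_class = labels[:, None] != labels[None, :]
    same_super = superlabels[:, None] == superlabels[None, :]
    taxonomic = different_class & same_super
    regular = different_class & ~same_super
    return sim[taxonomic].mean(), sim[regular].mean()
```

Running `covariance_spectrum` on a superclass subset and on an equally sized random subset gives the kind of comparison described for Figure 3, while `mean_negative_similarity` yields the two averages compared in Figure 4.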