Rank-based loss for learning hierarchical representations

Ines Nolasco, Dan Stowell

TL;DR

The paper addresses learning audio embeddings that encode hierarchical label structure, especially when labels are incomplete. It introduces a rank-based loss that uses the hierarchical rank ordering of example pairs to set target distances between their embeddings. The loss generalizes to taxonomies of arbitrary depth and tolerates missing fine-level labels, and it is compared against a hierarchical quadruplet loss on three audio datasets. The core contribution is the rank-based loss formulation and its data-driven target distances, together with an experimental analysis showing that it learns hierarchical representations robustly and has some advantages over quadruplet loss, particularly when the initial representations are weaker. The approach is relevant to audio tasks that require structured, multi-level predictions and must cope with incomplete annotations.

Abstract

Hierarchical taxonomies are common in many contexts, and they are a very natural structure humans use to organise information. In machine learning, the family of methods that use this 'extra' information is called hierarchical classification. However, applied to audio classification, this remains relatively unexplored. Here we focus on how to integrate the hierarchical information of a problem to learn embeddings representative of the hierarchical relationships. Previously, triplet loss has been proposed to address this problem; however, it presents some issues, such as requiring careful construction of the triplets and being limited in the extent of hierarchical information it uses at each iteration. In this work we propose a rank-based loss function that uses hierarchical information and translates it into a rank ordering of target distances between the examples. We show that rank-based loss is suitable for learning hierarchical representations of the data. By testing on unseen fine-level classes, we show that this method is also capable of learning hierarchically correct representations of the new classes. Rank-based loss has two promising aspects: it is generalisable to hierarchies with any number of levels, and it is capable of dealing with data with incomplete hierarchical labels.
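To make the idea of translating a label hierarchy into a rank ordering of target distances more concrete, here is a minimal NumPy sketch. It is not the paper's exact formulation, which this summary does not reproduce. Everything specific is an assumption made for illustration: the function names `pair_ranks` and `rank_based_loss`, the use of `-1` to mark a missing label, the choice of the mean current distance per rank as the data-driven target, and the decision to skip pairs whose rank cannot be determined because a label is missing.

```python
import numpy as np

def pair_ranks(labels, missing=-1):
    """Hierarchical rank of every pair of examples (illustrative assumption).

    labels: (n, levels) integer array, column 0 = coarsest level, last = finest.
    Rank 0 means the pair shares all levels; rank == levels means the pair
    differs already at the coarsest level. Returns (ranks, valid), where
    valid[i, j] is False when a missing label makes the rank ambiguous.
    """
    labels = np.asarray(labels)
    levels = labels.shape[1]
    a, b = labels[:, None, :], labels[None, :, :]
    known = (a != missing) & (b != missing)
    same = known & (a == b)
    differ = known & (a != b)
    # Number of consecutive levels, starting from the coarsest, that agree.
    shared = np.cumprod(same.astype(int), axis=2).sum(axis=2)
    # The rank is certain if all levels agree, or if the first level that
    # does not agree is a known mismatch rather than a missing label.
    idx = np.minimum(shared, levels - 1)[..., None]
    first_differs = np.take_along_axis(differ, idx, axis=2)[..., 0]
    valid = (shared == levels) | first_differs
    return levels - shared, valid

def rank_based_loss(embeddings, labels):
    """Squared deviation of pairwise distances from per-rank target distances."""
    emb = np.asarray(embeddings, dtype=float)
    ranks, valid = pair_ranks(labels)
    iu = np.triu_indices(len(emb), k=1)  # each unordered pair once
    dist = np.linalg.norm(emb[:, None] - emb[None], axis=-1)[iu]
    keep = valid[iu]
    dist, r = dist[keep], ranks[iu][keep]
    present = np.unique(r)
    # Assumed data-driven targets: mean current distance of each rank,
    # forced to be non-decreasing as the rank grows.
    means = np.array([dist[r == k].mean() for k in present])
    targets = np.maximum.accumulate(means)
    per_pair_target = targets[np.searchsorted(present, r)]
    loss = float(np.mean((dist - per_pair_target) ** 2))
    return loss, dict(zip(present.tolist(), targets.tolist()))

# Toy two-level hierarchy: the last example has no fine-level label.
toy_labels = [[0, 0], [0, 0], [0, 1], [1, 2], [1, -1]]
toy_emb = np.random.default_rng(0).normal(size=(5, 8))
toy_loss, toy_targets = rank_based_loss(toy_emb, toy_labels)
```

Because the rank is computed from however many label columns are supplied, the same sketch applies to deeper hierarchies, and pairs that differ at a coarse level still contribute even when their fine labels are missing. In a training setting the same computation would be written in an automatic-differentiation framework so that the loss can be back-propagated.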

Paper Structure

This paper contains 10 sections, 2 equations, 1 table.