Angular Distance Distribution Loss for Audio Classification

Antonio Almudévar, Romain Serizel, Alfonso Ortega

TL;DR

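The Angular Distance Distribution (ADD) Loss is proposed. It aims to jointly enhance four properties of classification embeddings (low intra-class distance, high inter-class distance, and the absence of hierarchies within and among classes) by imposing conditions on the first- and second-order statistical moments of the angular distance between embeddings.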

Abstract

Classification is a pivotal task in deep learning, not only because of its intrinsic importance but also because it provides embeddings with desirable properties for other tasks. To optimize these properties, a wide variety of loss functions have been proposed that attempt to minimize the intra-class distance and maximize the inter-class distance in the embedding space. In this paper we argue that, in addition to these two, eliminating hierarchies within classes and eliminating hierarchies among classes are two other desirable properties for classification embeddings. Furthermore, we propose the Angular Distance Distribution (ADD) Loss, which aims to enhance these four properties jointly. To this end, it imposes conditions on the first- and second-order statistical moments of the angular distance between embeddings. Finally, we perform experiments showing that our loss function improves all four properties and, consequently, performs better than other loss functions in audio classification tasks.
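
As a rough illustration of the quantities the abstract refers to, the sketch below computes the angular distance between embeddings and its first- and second-order moments (mean and variance), separately for intra-class and inter-class pairs. The function names (`angular_distance`, `distance_moments`), the choice of the arccosine of the cosine similarity as the angular distance, and the toy data are all assumptions made for illustration. This is not the paper's loss formulation.

```python
import math
from itertools import combinations
from statistics import mean, pvariance


def angular_distance(u, v):
    """Angle in radians between two vectors (assumed: arccos of cosine similarity)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cos = max(-1.0, min(1.0, dot / (norm_u * norm_v)))  # clamp against rounding error
    return math.acos(cos)


def distance_moments(embeddings, labels):
    """Return (mean, variance) of angular distances for intra- and inter-class pairs."""
    intra, inter = [], []
    for i, j in combinations(range(len(embeddings)), 2):
        d = angular_distance(embeddings[i], embeddings[j])
        (intra if labels[i] == labels[j] else inter).append(d)
    return {
        "intra": (mean(intra), pvariance(intra)),
        "inter": (mean(inter), pvariance(inter)),
    }


# Hypothetical 2-D embeddings: two classes clustered around orthogonal directions.
embeddings = [(1.0, 0.0), (1.0, 0.1), (0.0, 1.0), (0.1, 1.0)]
labels = [0, 0, 1, 1]
moments = distance_moments(embeddings, labels)
```

In this toy setup the mean intra-class angle is much smaller than the mean inter-class angle, which is the behavior that distance-based classification losses try to encourage.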

Paper Structure

This paper contains 14 sections, 8 equations, 1 figure, 2 tables.

Figures (1)

  • Figure 5: Mean (top row) and coefficient of variation (bottom row) of the $d_c$ values between embeddings of 10 classes of ESC-50. The accuracy given is calculated for the 50 classes. The coefficient of variation is defined as $\frac{\sigma}{\mu}$ and is used here instead of $\sigma$ because it normalizes the variation by the mean, which changes depending on $\bm{\lambda}$, so it better represents intra-class equidistance.
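
The reasoning in the caption can be checked with a small numeric example: scaling all distances by a constant changes $\sigma$ but leaves $\frac{\sigma}{\mu}$ unchanged. The helper name `coefficient_of_variation` and the distance values below are made up for illustration and are not results from the paper.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    return pstdev(values) / mean(values)


# Hypothetical distances with the same relative spread but different means.
distances_small = [0.9, 1.0, 1.1]
distances_large = [2 * d for d in distances_small]

sigma_small, sigma_large = pstdev(distances_small), pstdev(distances_large)
cv_small = coefficient_of_variation(distances_small)
cv_large = coefficient_of_variation(distances_large)
# sigma doubles with the scale, while the coefficient of variation stays the same.
```

Comparing $\sigma$ directly across settings with different mean distances would therefore confound spread with scale, which is the issue the coefficient of variation avoids.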