
Noise-to-mask Ratio Loss for Deep Neural Network based Audio Watermarking

Martin Moritz, Toni Olán, Tuomas Virtanen

TL;DR

Both objective and subjective tests show that models trained with NMR loss generate more transparent watermarks than models trained with the conventionally used MSE loss.

Abstract

Digital audio watermarking inserts a message into an audio signal in a transparent way; it can be used for automatic recognition of audio material and for copyright management. We propose a perceptual loss function for deep neural network based audio watermarking systems. The loss is based on the noise-to-mask ratio (NMR), a model of the psychoacoustic masking effect characteristic of the human ear. We train the deep neural models with the NMR loss between marked and host signals, and we evaluate objective quality with PEAQ and subjective quality with a MUSHRA test. Both objective and subjective tests show that models trained with NMR loss generate more transparent watermarks than models trained with the conventionally used MSE loss.
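To make the contrast between an NMR-style loss and MSE concrete, here is a minimal NumPy sketch. It is not the paper's formulation: the equal-width bands, the fixed `mask_offset_db`, and the helper names `band_energies` and `nmr_like_db` are all assumptions made for illustration. A full NMR model would typically use critical bands, a spreading function across bands, and the absolute threshold of hearing, and a training loss would be written in a differentiable framework. The sketch only shows the core idea: the watermark "noise" (marked minus host) is measured relative to a masking threshold derived from the host, band by band.

```python
import numpy as np

def band_energies(x, frame_len=512, hop=256, n_bands=16):
    """Per-frame energy in n_bands equal-width frequency bands.

    Equal-width bands are a simplifying assumption standing in for
    the critical bands of a real psychoacoustic model.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2  # (n_frames, n_bins)
    bands = np.array_split(power, n_bands, axis=1)
    return np.stack([b.sum(axis=1) for b in bands], axis=1)  # (n_frames, n_bands)

def nmr_like_db(host, marked, mask_offset_db=-10.0, eps=1e-12):
    """Simplified noise-to-mask ratio in dB (illustrative only).

    The mask is assumed to be the host band energy lowered by a fixed
    offset; no spreading or absolute threshold is modelled.
    """
    noise_e = band_energies(marked - host)
    mask = band_energies(host) * 10.0 ** (mask_offset_db / 10.0)
    ratio = noise_e / (mask + eps)
    return 10.0 * np.log10(ratio.mean() + eps)

def mse(host, marked):
    """Plain mean squared error between marked and host signals."""
    return np.mean((marked - host) ** 2)

# Toy comparison: two distortions with the same MSE.
fs = 16000
t = np.arange(fs) / fs
host = np.sin(2 * np.pi * 440.0 * t)

# Distortion shaped like the host (sits under the host's own spectrum).
shaped = 0.01 * host
# White noise rescaled to the same mean squared value as `shaped`.
rng = np.random.default_rng(0)
white = rng.standard_normal(len(host))
white *= np.sqrt(np.mean(shaped ** 2) / np.mean(white ** 2))

marked_shaped = host + shaped
marked_white = host + white

print("MSE  shaped / white:", mse(host, marked_shaped), mse(host, marked_white))
print("NMR-like dB shaped / white:",
      nmr_like_db(host, marked_shaped), nmr_like_db(host, marked_white))
```

In this toy setup both distortions have the same MSE, yet the NMR-like measure is much lower for the distortion that follows the host spectrum than for white noise, which puts energy in bands where the host offers little masking. An MSE loss cannot tell the two cases apart, whereas a masking-based loss can.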

Paper Structure

This paper contains 11 sections, 3 equations, 8 figures, 2 tables.

Figures (8)

  • Figure 1: Embedder
  • Figure 2: Extractor
  • Figure 4: The computation steps of the
  • Figure 5: Pitch and masking patterns for a host signal, and the noise patterns for marked signals generated by two (NMR256 and MSE256) of our models
  • Figure 6: The embedder network
  • ...and 3 more figures