Audio-based Kinship Verification Using Age Domain Conversion

Qiyang Sun, Alican Akman, Xin Jing, Manuel Milling, Björn W. Schuller

TL;DR

This paper introduces an “age-standardised domain”: an optimised CycleGAN-VC3 network performs age-audio conversion to generate in-domain audio, which markedly improves kinship verification accuracy and offers new insights for future kinship verification research.

Abstract

Audio-based kinship verification (AKV) is important in many domains, such as home security monitoring, forensic identification, and social network analysis. A key challenge in the task is the difference in age between samples from different individuals, which can be interpreted as a domain bias in a cross-domain verification task. To address this issue, we introduce the notion of an "age-standardised domain", in which an optimised CycleGAN-VC3 network performs age-audio conversion to generate in-domain audio. A range of features is extracted from the generated audio dataset and fed into a metric learning architecture to verify kinship. Experiments are conducted on the KAN_AV audio dataset, which contains age and kinship labels. The results demonstrate that the method markedly improves the accuracy of kinship verification, while also offering new insights for future kinship verification research.
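
The final verification step can be illustrated with a minimal sketch. This is not the paper's metric learning architecture, which the abstract does not specify; it assumes that each recording has already been reduced to a fixed-length feature vector, and it swaps in plain cosine similarity with a hypothetical threshold as the decision rule. The names `cosine_similarity`, `verify_kinship`, and `threshold` are illustrative only.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # A zero vector has no direction, so treat it as dissimilar to everything.
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def verify_kinship(
    emb_a: Sequence[float],
    emb_b: Sequence[float],
    threshold: float = 0.5,  # hypothetical value, would be tuned on held-out pairs
) -> bool:
    """Declare a pair 'kin' when the embeddings are similar enough."""
    return cosine_similarity(emb_a, emb_b) >= threshold
```

In a learned system the embeddings would come from a trained network and the threshold from validation data. The sketch only shows how a similarity score over age-standardised features turns into a binary kinship decision.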

Paper Structure

This paper contains 11 sections, 1 equation, 2 figures, 1 table.

Figures (2)

  • Figure 1: Framework of kinship verification using age voice conversion.
  • Figure 2: t-SNE distributions of different features from original and generated datasets.