On the application of Visibility Graphs in the Spectral Domain for Speaker Recognition

Hernan Bocaccio; Sergio Iglesias-Pérez; Miguel Romance; Regino Criado; Gabriel B. Mindlin

TL;DR

This paper addresses speaker recognition in the spectral domain by converting LPC-based spectral profiles into visibility graphs that capture topological patterns in speech spectra. Four graph-based features (link density, average shortest path length, clustering coefficient, and modularity) are computed from the spectra of five vowels for each of seven speakers, and a Random Forest ensemble classifies speaker identity from them. SHAP analysis identifies modularity as a highly discriminative feature. Macro-averaged precision, recall, and F1 on an independent test set approach 0.95, while random-label baselines perform near chance. The spectral envelope is modeled as $H(f) = \frac{d_{0}}{1-\sum_{k=1}^{m} d_{k} e^{i k 2 \pi f \Delta}}$, computed with an LPC order of $m=13$ over 0–5512 Hz with 512 bins, and the method is reported to be robust to the choice of LPC order and threshold. The results indicate that spectral-domain topology captures speaker-specific vocal-tract features and remains robust under degradation, suggesting practical applicability to biometric systems and expanding the toolbox for speech processing.
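To make the envelope formula concrete, the following minimal sketch evaluates $|H(f)|$ on a 512-bin grid spanning 0 to roughly 5512 Hz. It is an illustration under stated assumptions, not the authors' implementation: the sampling rate of 11025 Hz (so that $\Delta = 1/11025$ s) is inferred from the quoted frequency range rather than stated in the text, the function name `lpc_envelope` is hypothetical, and the toy coefficient list has one entry where the paper's setting $m=13$ would have thirteen. Estimating the coefficients $d_k$ from audio is outside the scope of the sketch.

```python
import cmath
import math


def lpc_envelope(d0, d, freqs_hz, delta):
    """Return |H(f)| for H(f) = d0 / (1 - sum_k d_k * exp(i*k*2*pi*f*delta)).

    d holds the coefficients d_1..d_m in order; delta is the sampling interval.
    """
    magnitudes = []
    for f in freqs_hz:
        denom = 1.0 - sum(
            dk * cmath.exp(1j * k * 2.0 * math.pi * f * delta)
            for k, dk in enumerate(d, start=1)
        )
        magnitudes.append(abs(d0 / denom))
    return magnitudes


# ASSUMPTION: 11025 Hz sampling rate, chosen so the Nyquist frequency is ~5512 Hz.
SAMPLE_RATE_HZ = 11025.0
DELTA = 1.0 / SAMPLE_RATE_HZ
N_BINS = 512

# 512 equally spaced frequencies from 0 Hz up to the Nyquist frequency.
freqs = [i * (SAMPLE_RATE_HZ / 2.0) / (N_BINS - 1) for i in range(N_BINS)]

# Toy filter with a single coefficient (illustrative values, not fitted to speech).
toy_envelope = lpc_envelope(1.0, [0.5], freqs, DELTA)
```

For this toy filter the envelope falls from $1/(1-0.5)$ at 0 Hz to $1/(1+0.5)$ at the Nyquist frequency, which gives a quick sanity check on the implementation before real coefficients are substituted.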

Abstract

In this study, we explore the potential of visibility graphs in the spectral domain for speaker recognition. Adult participants were instructed to record vocalizations of the five Spanish vowels. For each vocalization, we computed the frequency spectrum considering the source-filter model of speech production, where formants are shaped by the vocal tract acting as a passive filter with resonant frequencies. Spectral profiles exhibited consistent intra-speaker characteristics, reflecting individual vocal tract anatomies, while showing variation between speakers. We then constructed visibility graphs from these spectral profiles and extracted various graph-theoretic metrics to capture their topological features. These metrics were assembled into feature vectors representing the five vowels for each speaker. Using an ensemble of decision trees trained on these features, we achieved high accuracy in speaker identification. Our analysis identified key topological features that were critical in distinguishing between speakers. This study demonstrates the effectiveness of visibility graphs for spectral analysis and their potential in speaker recognition. We also discuss the robustness of this approach, offering insights into its applicability for real-world speaker recognition systems. This research contributes to expanding the feature extraction toolbox for speaker recognition by leveraging the topological properties of speech signals in the spectral domain.
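The graph construction step described in the abstract can be sketched as follows. This is a hedged illustration rather than the authors' code: it assumes the natural visibility criterion (two samples are linked when the straight line between them passes above every intermediate sample), treats the spectral profile as equally spaced in frequency, and computes only three of the four topological features named in the TL;DR. Modularity is omitted because it additionally requires a community-detection step. All function names are hypothetical.

```python
from collections import deque


def natural_visibility_graph(y):
    """Build a natural visibility graph from equally spaced samples y.

    Nodes are sample indices; a and b are linked if every sample c between
    them lies strictly below the straight line joining (a, y[a]) and (b, y[b]).
    """
    n = len(y)
    adj = {i: set() for i in range(n)}
    for a in range(n - 1):
        for b in range(a + 1, n):
            visible = all(
                y[c] < y[b] + (y[a] - y[b]) * (b - c) / (b - a)
                for c in range(a + 1, b)
            )
            if visible:
                adj[a].add(b)
                adj[b].add(a)
    return adj


def link_density(adj):
    """Fraction of possible undirected edges that are present."""
    n = len(adj)
    edges = sum(len(nb) for nb in adj.values()) // 2
    return 2.0 * edges / (n * (n - 1))


def average_shortest_path_length(adj):
    """Mean hop distance over all node pairs, via breadth-first search.

    Assumes a connected graph, which holds here because neighbouring
    samples always see each other.
    """
    n = len(adj)
    total = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
    return total / (n * (n - 1))


def average_clustering(adj):
    """Mean local clustering coefficient (0 for nodes of degree below 2)."""
    coeffs = []
    for nb in adj.values():
        k = len(nb)
        if k < 2:
            coeffs.append(0.0)
            continue
        links = sum(1 for v in nb for w in nb if v < w and w in adj[v])
        coeffs.append(2.0 * links / (k * (k - 1)))
    return sum(coeffs) / len(coeffs)


# Toy five-point "spectral profile" (illustrative values, not real data).
profile = [1.0, 3.0, 2.0, 4.0, 1.0]
graph = natural_visibility_graph(profile)
features = (
    link_density(graph),
    average_shortest_path_length(graph),
    average_clustering(graph),
)
```

The pairwise visibility test is cubic in the number of samples in the worst case, which is acceptable for a sketch on a 512-bin spectrum. In a pipeline like the one described, one such feature tuple per vowel would be concatenated into the per-speaker feature vector passed to the classifier.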
