SecureSpectra: Safeguarding Digital Identity from Deep Fake Threats via Intelligent Signatures
Oguzhan Baser, Kaan Kale, Sandeep P. Chinchali
TL;DR
DeepFake (DF) audio threatens voice-verified security and information integrity. The authors propose SecureSpectra, which uses a private key $\kappa_i$ to sign audio $a_i$ with an irreversible, orthogonal high-frequency signature, $a_i^* = S(a_i, \kappa_i; \theta_\mathcal{S})$, and detects that signature with a public verifier $\phi$ trained with a joint loss $\mathcal{L}_\mathcal{S}+\mathcal{L}_\phi$. Differential privacy (DP) is applied to the keys to prevent reverse engineering. The method improves detection accuracy over baselines by up to 71–81% and reduces the equal error rate (EER) across multiple datasets (Mozilla Common Voice, LibriSpeech, VoxCeleb), with a modest ~4% accuracy loss when DP is enabled. By exploiting the difficulty DF models have in reproducing high-frequency content, SecureSpectra provides a model-agnostic, open-source defense for digital voice identity in security-critical contexts such as banking and political communication.
Abstract
Advancements in DeepFake (DF) audio models pose a significant threat to voice authentication systems, leading to unauthorized access and the spread of misinformation. We introduce a defense mechanism, SecureSpectra, addressing DF threats by embedding orthogonal, irreversible signatures within audio. SecureSpectra leverages the inability of DF models to replicate high-frequency content, which we empirically identify across diverse datasets and DF models. Integrating differential privacy into the pipeline protects signatures from reverse engineering and strikes a delicate balance between enhanced security and minimal performance compromises. Our evaluations on Mozilla Common Voice, LibriSpeech, and VoxCeleb datasets showcase SecureSpectra's superior performance, outperforming recent works by up to 71% in detection accuracy. We open-source SecureSpectra to benefit the research community.
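To make the core idea concrete, the following is a minimal sketch of a key-dependent high-frequency signature. It is not the SecureSpectra implementation: the paper's signer $S$ and verifier $\phi$ are learned models and the verifier is public, whereas this toy version swaps in a fixed spread-spectrum pattern and a detector that reuses the key. The sample rate, band edge, signature strength, and all function names (`sign`, `score`, `strip_high_band`) are assumptions chosen for illustration, and the low-pass step is only a crude stand-in for a DF model that fails to reproduce high-frequency content.

```python
import numpy as np

SAMPLE_RATE = 16_000    # Hz (assumed for this sketch)
BAND_START_HZ = 6_000   # assumed lower edge of the "high-frequency" band


def _band_mask(n_samples: int) -> np.ndarray:
    """Boolean mask selecting the rFFT bins at or above BAND_START_HZ."""
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / SAMPLE_RATE)
    return freqs >= BAND_START_HZ


def _key_pattern(key: int, n_bins: int) -> np.ndarray:
    """Pseudo-random +/-1 pattern derived from the (hypothetical) private key."""
    rng = np.random.default_rng(key)
    return rng.choice([-1.0, 1.0], size=n_bins)


def sign(audio: np.ndarray, key: int, strength: float = 1e-3) -> np.ndarray:
    """Add a key-dependent pattern to the high-frequency part of the spectrum."""
    spectrum = np.fft.rfft(audio)
    band = _band_mask(len(audio))
    pattern = _key_pattern(key, int(band.sum()))
    # Scale the signature relative to the strongest spectral component.
    spectrum[band] += strength * np.abs(spectrum).max() * pattern
    return np.fft.irfft(spectrum, n=len(audio))


def score(audio: np.ndarray, key: int) -> float:
    """Normalized correlation between the high band and the key pattern."""
    spectrum = np.fft.rfft(audio)
    band = _band_mask(len(audio))
    pattern = _key_pattern(key, int(band.sum()))
    high = spectrum[band].real
    denom = np.linalg.norm(high) * np.linalg.norm(pattern) + 1e-12
    return float(np.dot(high, pattern) / denom)


def strip_high_band(audio: np.ndarray) -> np.ndarray:
    """Crude stand-in for a DF model that loses high-frequency content."""
    spectrum = np.fft.rfft(audio)
    spectrum[_band_mask(len(audio))] = 0.0
    return np.fft.irfft(spectrum, n=len(audio))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.arange(SAMPLE_RATE) / SAMPLE_RATE
    # Synthetic one-second "voice": a 200 Hz tone plus weak broadband noise.
    clean = np.sin(2 * np.pi * 200 * t) + 0.01 * rng.standard_normal(SAMPLE_RATE)
    signed = sign(clean, key=1234)
    print("signed, right key:", score(signed, key=1234))
    print("unsigned audio:   ", score(clean, key=1234))
    print("signed, wrong key:", score(signed, key=9999))
    print("high band removed:", score(strip_high_band(signed), key=1234))
```

In this sketch, a score near 1 indicates that the signature is present, while unsigned audio, a mismatched key, or audio whose high band has been removed should all score near 0. A real system would threshold this score, which is where a metric such as EER becomes relevant.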
