Voice Clone Guard: Few-Shot Deepfake Detection with XLS-R Embeddings and Gaussian Process Calibration

Benjamin J Gilbert, College of the Mainland, Robotic Process Automation

We present Voice Clone Guard, a few-shot voice deepfake detector that combines XLS-R embeddings with Gaussian Process (GP) classification to improve both detection accuracy and probability calibration. Our approach uses self-supervised speech representations from Wav2Vec2-XLS-R as input features and applies GP inference to produce well-calibrated uncertainty estimates. Evaluation across synthetic and real-world datasets shows 95.6% AUC (vs. 78.2% for the MFCC baseline), 4.2% Equal Error Rate (vs. 18.5%), and an Expected Calibration Error of 0.032 (vs. 0.127). The method performs well in few-shot scenarios, reaching 85.2% accuracy with only 4 examples per class, which makes it practical for deployment when labeled training data is limited.
Index Terms—Voice deepfake detection, few-shot learning,
XLS-R embeddings, Gaussian processes, probability calibration,
speech forensics
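
The abstract reports Equal Error Rate (EER) and Expected Calibration Error (ECE) but does not say how they are computed. The sketch below shows one common formulation of each, assuming binary labels (1 = fake, 0 = real) and a predicted fake probability per utterance. The function names, the equal-width binning with 10 bins, and the threshold sweep are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def equal_error_rate(scores, labels):
    """Approximate EER: the error rate at the threshold where the
    false-positive rate (real flagged as fake) is closest to the
    false-negative rate (fake missed). Assumes labels: 1 = fake, 0 = real."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    # Candidate thresholds: every observed score, plus +inf (flag nothing).
    thresholds = np.concatenate([np.unique(scores), [np.inf]])
    real, fake = scores[labels == 0], scores[labels == 1]
    fpr = np.array([(real >= t).mean() for t in thresholds])
    fnr = np.array([(fake < t).mean() for t in thresholds])
    i = np.argmin(np.abs(fpr - fnr))
    return float((fpr[i] + fnr[i]) / 2.0)


def expected_calibration_error(p_fake, labels, n_bins=10):
    """ECE with equal-width confidence bins (an assumed binning scheme):
    the sample-weighted mean of |accuracy - mean confidence| per bin."""
    p_fake = np.asarray(p_fake, dtype=float)
    labels = np.asarray(labels, dtype=int)
    pred = (p_fake >= 0.5).astype(int)
    # Confidence is the probability assigned to the predicted class.
    conf = np.where(pred == 1, p_fake, 1.0 - p_fake)
    correct = (pred == labels).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Right-closed bins (lo, hi]; interior edges only, then clip to range.
    idx = np.clip(np.digitize(conf, edges[1:-1], right=True), 0, n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = idx == b
        if mask.any():
            ece += mask.mean() * abs(correct[mask].mean() - conf[mask].mean())
    return float(ece)


if __name__ == "__main__":
    # Toy scores, not data from the paper.
    toy_scores = [0.1, 0.4, 0.35, 0.8]
    toy_labels = [0, 0, 1, 1]
    print("EER:", equal_error_rate(toy_scores, toy_labels))
    print("ECE:", expected_calibration_error(toy_scores, toy_labels))
```

A lower ECE means the predicted probabilities track observed accuracy more closely, which is the property the GP classifier is meant to improve over the baseline. The exact values depend on the bin count and threshold sweep, so results computed this way may differ slightly from those of other implementations.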
