Voice Clone Guard: Few-Shot Deepfake Detection with XLS-R Embeddings and Gaussian Process Calibration
We present Voice Clone Guard, a few-shot voice deepfake detector that
combines XLS-R embeddings with Gaussian Process (GP) classification to
improve both detection accuracy and probability calibration. Our approach
uses self-supervised speech representations from Wav2Vec2-XLS-R as input
features and applies GP inference to produce well-calibrated uncertainty
estimates. On synthetic and real-world datasets, it reaches 95.6% AUC
(vs. 78.2% for an MFCC baseline), a 4.2% Equal Error Rate (vs. 18.5%),
and an Expected Calibration Error of 0.032 (vs. 0.127). The method also
performs well in few-shot settings, reaching 85.2% accuracy with only
4 examples per class, which makes it practical to deploy when training
data are limited.
Index Terms—Voice deepfake detection, few-shot learning,
XLS-R embeddings, Gaussian processes, probability calibration,
speech forensics
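
The abstract reports Equal Error Rate (EER) and Expected Calibration Error (ECE) without stating how they are computed. The sketch below illustrates one common way to compute both from detector scores. It is not the authors' evaluation code: the 10 equal-width confidence bins, the 0.5 decision threshold, the convention that label 1 means "fake", and the function names are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def expected_calibration_error(
    probs: Sequence[float], labels: Sequence[int], n_bins: int = 10
) -> float:
    """ECE for a binary detector.

    probs  : predicted probability that each clip is fake (assumed convention)
    labels : 1 = fake, 0 = genuine (assumed convention)
    n_bins : number of equal-width confidence bins (assumed; not from the paper)
    """
    n = len(probs)
    bins: List[List[Tuple[float, float]]] = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        pred = 1 if p >= 0.5 else 0              # hard decision at 0.5
        conf = p if pred == 1 else 1.0 - p       # confidence in that decision
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, 1.0 if pred == y else 0.0))

    ece = 0.0
    for b in bins:
        if not b:
            continue
        avg_conf = sum(c for c, _ in b) / len(b)
        accuracy = sum(a for _, a in b) / len(b)
        # Weight each bin's |accuracy - confidence| gap by its share of samples.
        ece += (len(b) / n) * abs(accuracy - avg_conf)
    return ece


def equal_error_rate(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Approximate EER by sweeping every observed score as a threshold.

    Returns the mean of the false-positive and false-negative rates at the
    threshold where the two rates are closest to each other.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    best_gap, best_eer = 2.0, 1.0
    for t in sorted(set(scores)):
        fpr = sum(s >= t for s in neg) / len(neg)   # genuine flagged as fake
        fnr = sum(s < t for s in pos) / len(pos)    # fake passed as genuine
        gap = abs(fpr - fnr)
        if gap < best_gap:
            best_gap, best_eer = gap, (fpr + fnr) / 2.0
    return best_eer


if __name__ == "__main__":
    # Toy scores, purely illustrative; not data from the paper.
    toy_scores = [0.9, 0.6, 0.4, 0.1]
    toy_labels = [1, 0, 1, 0]
    print("EER:", equal_error_rate(toy_scores, toy_labels))
    print("ECE:", expected_calibration_error(toy_scores, toy_labels))
```

A lower ECE means the predicted probabilities track the observed accuracy more closely, which is the property the GP classifier is claimed to improve. The EER is a threshold-independent summary of the trade-off between the two error types.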