Dr. Mourad Samir abdallah Mohmmed Semary :: Publications:

Title:
An Efficient Speaker Diarization Pipeline for Conversational Speech
Authors: Wael Ali Sultan, Mourad Samir Semary, Sherif Mahdy Abdou
Year: 2024
Keywords: Not Available
Journal: Benha Journal of Applied Sciences
Volume: Not Available
Issue: Not Available
Pages: Not Available
Publisher: Not Available
Local/International: Local
Paper Link: Not Available
Full paper: Not Available
Supplementary materials: Not Available
Abstract:

In the domain of audio signal processing, the accurate and efficient diarization of conversational speech remains a challenging task, particularly in environments with significant speaker overlap and diverse acoustic scenarios. This paper introduces a comprehensive speaker diarization pipeline that substantially improves both performance and efficiency in processing conversational speech. Our pipeline comprises several key components: Voice Activity Detection (VAD), Speaker Overlap Detection (SOD), Speaker Separation models, robust speaker embedding, clustering algorithms, and sophisticated post-processing techniques. Beginning with Voice Activity Detection (VAD), the pipeline efficiently discriminates between speech and non-speech segments, effectively reducing processing overhead. Following VAD, the Speaker Overlap Detection (SOD) component identifies segments featuring speaker overlap.
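
To make the role of the first stage concrete, the following is a minimal sketch of a voice activity detector that separates speech from non-speech so that later stages only process speech spans. It is an illustration only and not the method used in the paper: the abstract does not say how VAD is implemented, so a simple frame-energy threshold is swapped in here. The function names `frame_energies` and `energy_vad`, the frame length of 160 samples, and the threshold of 0.01 are all assumptions chosen for the example.

```python
from typing import List, Tuple


def frame_energies(samples: List[float], frame_len: int) -> List[float]:
    """Mean squared amplitude of each complete, non-overlapping frame."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def energy_vad(
    samples: List[float],
    frame_len: int = 160,      # assumed: 10 ms at 16 kHz
    threshold: float = 0.01,   # assumed energy threshold, not from the paper
) -> List[Tuple[int, int]]:
    """Return (start_sample, end_sample) spans whose frame energy exceeds the threshold."""
    energies = frame_energies(samples, frame_len)
    segments: List[Tuple[int, int]] = []
    start = None  # sample index where the current speech span began
    for idx, energy in enumerate(energies):
        if energy > threshold and start is None:
            start = idx * frame_len                    # speech begins
        elif energy <= threshold and start is not None:
            segments.append((start, idx * frame_len))  # speech ends
            start = None
    if start is not None:
        # Close a span that runs to the last complete frame.
        segments.append((start, len(energies) * frame_len))
    return segments
```

In a pipeline of the kind the abstract describes, only the returned spans would be handed to overlap detection, embedding extraction, and clustering, which is where the reduction in processing overhead comes from. A practical system would more likely use a trained VAD model than a fixed energy threshold.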