The paralinguistic information in a speech signal includes cues to the ethnic and social background of the speaker. In this paper, we propose a hybrid approach to dialect and accent recognition for spoken Arabic that builds phonotactic and spectral systems separately and then combines them with a decision-fusion technique. We extract speech-attribute features that represent acoustic cues of a speaker's dialect, and the resulting feature streams are modeled with a Gaussian Mixture Model with a Universal Background Model (GMM-UBM) as well as an Identity Vector (i-vector) classifier. Moreover, this paper introduces our proposed dataset, SARA, a Modern Colloquial Arabic (MCA) dataset containing three Arabic dialects and their common accents; it serves as the master dataset for this work. We find that our proposed technique with acoustic features achieves a significant performance improvement over state-of-the-art systems on the Arabic dialects in the dataset.
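The decision-fusion step mentioned above can be illustrated with a minimal sketch. This is not the paper's implementation: the dialect labels, the scores, the z-normalization, and the equal fusion weight are all illustrative assumptions; it only shows the general idea of combining per-dialect scores from a phonotactic stream and a spectral (GMM-UBM / i-vector) stream at the score level.

```python
import numpy as np

# Hypothetical per-dialect scores from two independent subsystems.
# All names and numbers are illustrative, not taken from the paper.
DIALECTS = ["Egyptian", "Gulf", "Levantine"]

phonotactic_scores = np.array([-1.2, -0.4, -2.1])  # e.g. log-likelihoods
spectral_scores = np.array([-0.9, -1.5, -2.8])


def fuse_and_decide(scores_a, scores_b, weight=0.5):
    """Score-level fusion: z-normalize each stream so the two systems
    are on a comparable scale, take a weighted sum, and pick the
    highest-scoring dialect."""
    def znorm(s):
        return (s - s.mean()) / s.std()
    fused = weight * znorm(scores_a) + (1 - weight) * znorm(scores_b)
    return DIALECTS[int(np.argmax(fused))], fused


label, fused = fuse_and_decide(phonotactic_scores, spectral_scores)
print(label)
```

In practice the fusion weight would be tuned on a held-out set, and normalization schemes other than z-norm (e.g. min-max or t-norm) are common in speaker- and dialect-recognition pipelines.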