
Dr. Hamada Ali Mohamed Ali Nayel :: Publications:

Title:
Machine Learning-Based Approach for Arabic Dialect Identification
Authors: Mahmoud S. Ali;Ahmed H. Ali;Ahmed A. El-Sawy;Hamada A. Nayel
Year: 2021
Keywords: Arabic Dialect Identification; Arabic NLP
Journal: Proceedings of the Sixth Arabic Natural Language Processing Workshop
Volume: Not Available
Issue: Not Available
Pages: Not Available
Publisher: Association for Computational Linguistics
Local/International: International
Paper Link:
Full paper Not Available
Supplementary materials Not Available
Abstract:

This paper describes our systems submitted to the Second Nuanced Arabic Dialect Identification Shared Task (NADI 2021). Dialect identification is the task of automatically detecting the source variety of a given text or speech segment. The shared task has four subtasks: two for country-level identification and two for province-level identification. The data cover a total of 100 provinces from all 21 Arab countries and come from the Twitter domain. The proposed systems rely on five machine-learning approaches, namely Complement Naïve Bayes, Support Vector Machine, Decision Tree, Logistic Regression and Random Forest classifiers. The Naïve Bayes classifier achieved a higher macro-averaged F1 score than all other classifiers on both the development and test data.
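
The abstract compares classifiers by their macro-averaged F1 score. As a minimal sketch of how that metric is computed, the plain-Python function below averages the per-class F1 scores with equal weight for every class. The function name `macro_f1` and the toy country labels are illustrative assumptions, not the authors' code or the NADI data.

```python
from collections import Counter


def macro_f1(gold, predicted):
    """Return the unweighted mean of per-class F1 scores."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("at least one example is required")

    true_pos, false_pos, false_neg = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            true_pos[g] += 1
        else:
            false_pos[p] += 1  # p was predicted but is wrong
            false_neg[g] += 1  # g was the correct label and was missed

    # Every label seen in either list counts as one class.
    labels = sorted(set(gold) | set(predicted))
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); the denominator is never zero
        # because each label occurs at least once in gold or predicted.
        denom = 2 * true_pos[label] + false_pos[label] + false_neg[label]
        scores.append(2 * true_pos[label] / denom)
    return sum(scores) / len(scores)


# Hypothetical toy labels for illustration only.
gold = ["Egypt", "Egypt", "Iraq", "Iraq", "Morocco"]
predicted = ["Egypt", "Iraq", "Iraq", "Iraq", "Egypt"]
print(round(macro_f1(gold, predicted), 4))
```

Because every class contributes equally to the average, a rare dialect that is never predicted correctly lowers the score as much as a frequent one would, which makes the metric a natural choice for class-imbalanced label sets.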
