Authorship forensic analysis of political articles has recently become very important. It is the process in which a linguist attempts to identify the author of an anonymous text from the vocabulary used and the writer's linguistic style. Most existing studies of authorship forensic analysis focus on English, while research on Arabic is rare. In this research, we present a new methodology that enhances authorship forensic analysis for the Arabic language. The basic idea is to extract the unique vocabulary terms that identify an author (or a political group) and to use them to recognize unknown authors. We propose Term Frequency-Inverse Group Frequency (TF-IGF), a modification of the traditional TF-IDF method. We test our approach on a large political dataset to determine the performance of vocabulary-based authorship forensic analysis. The experimental results show that the average accuracy of group recognition increases from 89.33% with TF-IDF to 92% with the proposed TF-IGF. Further improvement is achieved when the vocabulary terms are represented in their Arabic lemma form rather than their root form; the results show that the accuracy improves from 89.33% to 92%.
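
To make the weighting idea concrete, the following is a minimal sketch of one plausible reading of TF-IGF. The abstract does not give the formula, so this sketch assumes it mirrors TF-IDF with groups in place of documents: a term's frequency within a group is multiplied by log(G / gf), where G is the number of groups and gf is the number of groups that use the term. The function names `tf_igf` and `predict_group`, the scoring rule (sum of matching term weights), and the toy tokens are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def tf_igf(group_docs):
    """Compute assumed TF-IGF weights.

    group_docs maps a group name to a list of tokenized documents.
    Returns a dict: group name -> {term: weight}.
    """
    # Pool all tokens written by each group into one term-count table.
    counts = {
        group: Counter(tok for doc in docs for tok in doc)
        for group, docs in group_docs.items()
    }
    n_groups = len(counts)
    # Group frequency: in how many groups does each term occur at all?
    gf = Counter(term for table in counts.values() for term in table)

    weights = {}
    for group, table in counts.items():
        total = sum(table.values())
        # Assumed formula: relative term frequency * log(G / gf).
        # Terms shared by every group receive weight 0.
        weights[group] = {
            term: (n / total) * math.log(n_groups / gf[term])
            for term, n in table.items()
        }
    return weights


def predict_group(tokens, weights):
    """Assign a text to the group whose weighted vocabulary it matches best."""
    scores = {
        group: sum(w.get(tok, 0.0) for tok in tokens)
        for group, w in weights.items()
    }
    return max(scores, key=scores.get)


# Toy data with placeholder tokens (a real pipeline would first reduce
# Arabic words to lemma or root form with a morphological analyzer).
groups = {
    "group_a": [["economy", "reform", "vote"]],
    "group_b": [["security", "border", "vote"]],
}
weights = tf_igf(groups)
predicted = predict_group(["border", "vote"], weights)
```

Under this assumed formula, a term used by every group (here `vote`) carries no weight, so only group-specific vocabulary influences the decision, which matches the stated goal of extracting terms unique to an author or group.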