This work presents a new alignment word-space approach for measuring the similarity between
two text snippets. The approach combines two similarity measurement methods: alignment-based
and vector space-based. The vector space-based method depends on a semantic net that represents
the meaning of words as vectors. The words underlying these vectors are lemmatized to enrich the
search space. The alignment-based method generates an alignment word space matrix (AWSM) for the
two text snippets according to the generated semantic word spaces. Finally, the degree of sentence
semantic similarity is measured using a set of proposed alignment rules. Four experiments were carried out to evaluate
the performance of the proposed approach, using two different datasets. The experimental results
showed that applying lemmatization to both the input text and the vector model improves
performance. The degree of correctness of the results reaches 0.7212, which ranks among the
two best published results for Arabic semantic similarity.
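
The general idea of combining a word-vector space with an alignment matrix can be illustrated with a minimal sketch. This is not the proposed alignment rules, which are not specified here; it assumes cosine similarity between word vectors and a simple best-match averaging rule as a stand-in. The names `TOY_VECTORS`, `alignment_matrix`, and `similarity`, as well as the toy words and vectors, are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical toy vectors for already-lemmatized words (illustration only).
TOY_VECTORS = {
    "book": [1.0, 0.0, 0.0],
    "novel": [0.8, 0.6, 0.0],
    "car": [0.0, 0.0, 1.0],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def alignment_matrix(words_a, words_b, vectors):
    """Matrix whose entry [i][j] is the similarity of words_a[i] and words_b[j]."""
    return [[cosine(vectors[a], vectors[b]) for b in words_b] for a in words_a]


def similarity(words_a, words_b, vectors):
    """Assumed scoring rule: average the best match of every word in both texts."""
    # Words without a vector are skipped; this is a simplifying assumption.
    words_a = [w for w in words_a if w in vectors]
    words_b = [w for w in words_b if w in vectors]
    if not words_a or not words_b:
        return 0.0
    matrix = alignment_matrix(words_a, words_b, vectors)
    row_best = [max(row) for row in matrix]
    col_best = [max(row[j] for row in matrix) for j in range(len(words_b))]
    return (sum(row_best) + sum(col_best)) / (len(row_best) + len(col_best))


if __name__ == "__main__":
    print(similarity(["book", "car"], ["novel"], TOY_VECTORS))
```

In this sketch each row of the matrix corresponds to a word of the first text and each column to a word of the second, so the score depends on how well every word finds a counterpart in the other text. Any real alignment rules would replace the averaging step in `similarity`.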