With the rise in popularity of social networks and programs that let users connect instantaneously,
communication has become more dynamic. As a result, new words and spelling errors appear regularly and
degrade the quality of representation models. Because natural language processing applications depend on vector
representations of text, out-of-vocabulary (OOV) terms, which are unfamiliar to the models, must be handled
without degrading model quality. For
this, we present an OOV handling approach based on four swarm intelligence techniques: ant colony optimization
(ACO), chicken swarm optimization (CSO), gray wolf optimization (GWO), and particle swarm optimization (PSO).
In this study, three word
embedding models have been used to obtain the representation of words. The performance of the proposed methods
is evaluated on three tasks: dialect identification, sentiment analysis, and sarcasm detection. The results show that
the suggested methods are promising for handling OOV terms and demonstrate high performance in all experiments.
GWO-OOV-SVM achieved a 53.43% F1-score for dialect identification, while CSO-OOV-SVM achieved F1-scores of
75.66% and 57.68% for sentiment analysis and sarcasm detection, respectively, exceeding the other models.