The volume of scientific text continues to grow rapidly, and the same word can carry different meanings in different fields. Machine translation is a fundamental problem in natural language processing, with applications in both general and specialized domains, including scientific, medical, literary, and commercial translation. Domain-specific translation requires subject-matter expertise in addition to linguistic proficiency in both the source and target languages. A lack of high-quality resources is one of the biggest obstacles to scientific translation, especially for Arabic, where it hinders a task of considerable importance. This paper introduces an Arabic-English dataset of scientific text, built from thesis titles collected from several sources. To the best of our knowledge, it is the first dataset available for this purpose.