
Dr. Hamada Ali Mohamed Ali Nayel :: Publications:

Title:
Improving NER for Clinical Texts by Ensemble Approach using Segment Representations
Authors: Hamada A. Nayel; Shashirekha, H L
Year: 2017
Keywords: Clinical NLP; Ensemble Learning; Named Entity Recognition
Journal: Proceedings of the 14th International Conference on Natural Language Processing (ICON-2017)
Volume: 2017
Issue: Not Available
Pages: 197-204
Publisher: NLP Association of India
Local/International: International
Paper Link:
Full paper Hamada Ali Mohamed Ali Nayel_W17-7525.pdf
Supplementary materials Not Available
Abstract:

Clinical Named Entity Recognition (Clinical-NER), which aims at identifying clinical named entities and classifying them into predefined categories, is a critical pre-processing task in health information systems. Different machine learning approaches have been used to extract and classify clinical named entities. Each approach has its own strengths and weaknesses when considered individually. Ensemble techniques use the strengths of one approach to offset the weaknesses of another by combining the outputs of different classifiers into a single decision, thereby improving the results. Segment representation is a technique that assigns a tag to each token in a given text, marking the token's position relative to an entity. In this paper, we propose an ensemble approach that combines the outputs of four different base classifiers in two different ways, namely majority voting and stacking. We have used support vector machines to train the base classifiers, each with a different segment representation model: IOB2, IOE2, IOBE and IOBES. The proposed algorithm is evaluated on a well-known clinical dataset, the i2b2 2010 corpus, and the results obtained show that the proposed approach outperforms each of the base classifiers.
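
As an illustration of the two ideas named in the abstract, the following is a minimal Python sketch, not the authors' implementation. It shows how the same entity spans could be tagged under the four segment representation models, and how per-token majority voting might combine several tag sequences. The function names `encode` and `majority_vote`, the example sentence, and the span format are hypothetical. Two assumptions are made here that the abstract does not state: under IOBE a single-token entity is tagged B (descriptions of this model vary), and all sequences passed to the voter have already been mapped to one common tag set, with ties going to the earliest classifier in the list. Stacking is not shown.

```python
from collections import Counter

def encode(n_tokens, spans, scheme):
    """Tag n_tokens tokens given entity spans (start, end_inclusive, type)."""
    tags = ["O"] * n_tokens
    for start, end, etype in spans:
        for i in range(start, end + 1):
            first, last = i == start, i == end
            if scheme == "IOB2":      # B on the first token of every entity
                prefix = "B" if first else "I"
            elif scheme == "IOE2":    # E on the last token of every entity
                prefix = "E" if last else "I"
            elif scheme == "IOBE":    # assumption: single-token entity gets B
                prefix = "B" if first else ("E" if last else "I")
            elif scheme == "IOBES":   # S marks a single-token entity
                if first and last:
                    prefix = "S"
                elif first:
                    prefix = "B"
                elif last:
                    prefix = "E"
                else:
                    prefix = "I"
            else:
                raise ValueError(f"unknown scheme: {scheme}")
            tags[i] = f"{prefix}-{etype}"
    return tags

def majority_vote(predictions):
    """Per-token majority vote over tag sequences in one common tag set.

    Assumption: ties are broken in favour of the earliest classifier.
    """
    voted = []
    for token_tags in zip(*predictions):
        counts = Counter(token_tags)
        best = max(counts.values())
        voted.append(next(t for t in token_tags if counts[t] == best))
    return voted

# Hypothetical example sentence with two "problem" entities.
tokens = ["chest", "pain", "and", "fever"]
spans = [(0, 1, "problem"), (3, 3, "problem")]
for scheme in ("IOB2", "IOE2", "IOBE", "IOBES"):
    print(scheme, encode(len(tokens), spans, scheme))
```

Because each base classifier emits tags in its own representation, a real system would need a conversion step before voting; the sketch leaves that step out and only votes over sequences that already share a tag set.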
