Assist. Ibrahim Reyad Ibrahim El-sayed El-fawal :: Publications:

Title:
Topic Modelling with Bag-of-concepts Document Representation
Authors: Metwally Rashad; Ibrahim Reyad; Mohamed Abdelfatah
Year: 2022
Keywords: Latent Dirichlet Allocation, Topic Modeling, Document Representation
Journal: 2022 4th Novel Intelligent and Leading Emerging Sciences Conference (NILES), Giza, Egypt, 2022
Volume: Not Available
Issue: Not Available
Pages: 216-220
Publisher: IEEE
Local/International: International
Paper Link:
Full paper Ibrahim Reyad Ibrahim El-sayed El-fawal_2022186554.pdf
Supplementary materials Not Available
Abstract:

Text mining tasks have traditionally been carried out with topic models such as Latent Dirichlet Allocation (LDA). These models sometimes assign high probability to noisy words, which produces incoherent topics. The underlying problem is that the document representations used by topic-model-based approaches are sparse, use binary term weighting, and lack semantic information. To address these problems, the topic model is combined with a document representation technique called Bag-of-Concepts. The Bag-of-Concepts approach clusters word2vec word vectors to form concepts, then represents each document as a vector of how often those concept clusters occur in it. With a suitable weighting formula, concept frequency-inverse document frequency (CF-IDF), Bag-of-Concepts also preserves document proximity. LDA is adapted for document clustering and topic quality tasks. The results are compared across different LDA frameworks on text documents and on the Bag-of-Concepts representation of those documents. LDA with the Bag-of-Concepts representation generates more coherent topics than the other techniques.
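
The Bag-of-Concepts pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The toy 2-D word vectors stand in for trained word2vec embeddings, the small k-means routine stands in for whatever clustering the paper uses, and the weighting shown is the usual TF-IDF analogue (concept share within the document times the log of the inverse document frequency), which is an assumption about the exact formula. All names (`WORD_VECTORS`, `DOCS`, `kmeans`, `cf_idf`) are hypothetical, and the LDA step is omitted.

```python
import math
from collections import Counter

# Hypothetical toy embeddings; in practice these would be word2vec vectors.
WORD_VECTORS = {
    "cat": (0.9, 0.1), "stock": (0.1, 0.9),
    "dog": (1.0, 0.2), "market": (0.0, 1.0),
    "pet": (0.8, 0.0), "trade": (0.2, 0.8),
}

# Hypothetical tokenised documents.
DOCS = [
    ["cat", "dog", "pet", "cat"],
    ["stock", "market", "trade"],
    ["dog", "market", "pet"],
]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iters=10):
    """Cluster word vectors into k concepts; returns {word: concept_id}.

    Initial centroids are simply the first k vectors (an illustrative choice).
    """
    words = list(vectors)
    centroids = [vectors[w] for w in words[:k]]
    assignment = {}
    for _ in range(iters):
        # Assign every word to its nearest centroid.
        assignment = {
            w: min(range(k), key=lambda c: squared_distance(vectors[w], centroids[c]))
            for w in words
        }
        # Move each centroid to the mean of its members (skip empty clusters).
        for c in range(k):
            members = [vectors[w] for w in words if assignment[w] == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return assignment


def cf_idf(docs, word_to_concept, k):
    """Return one CF-IDF weighted concept vector per document."""
    counts = [
        Counter(word_to_concept[w] for w in doc if w in word_to_concept)
        for doc in docs
    ]
    n_docs = len(docs)
    # Number of documents in which each concept occurs at least once.
    doc_freq = [sum(1 for c in counts if c[concept] > 0) for concept in range(k)]
    matrix = []
    for c in counts:
        total = sum(c.values())
        row = []
        for concept in range(k):
            if total == 0 or doc_freq[concept] == 0:
                row.append(0.0)
            else:
                cf = c[concept] / total
                row.append(cf * math.log(n_docs / doc_freq[concept]))
        matrix.append(row)
    return matrix


word_to_concept = kmeans(WORD_VECTORS, k=2)
doc_vectors = cf_idf(DOCS, word_to_concept, k=2)

for doc, vec in zip(DOCS, doc_vectors):
    print(doc, [round(v, 3) for v in vec])
```

Each row of `doc_vectors` is a dense, low-dimensional document vector over concepts rather than a sparse vector over individual words. In the setting the abstract describes, vectors of this kind would then be supplied to an LDA model.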