Ahmed Hassan Ahmed Abu El Atta|Publications:Two-class support vector machine with new kernel function based on paths of features for predicting chemical activity

Dr. Ahmed Hassan Ahmed Abu El Atta :: Publications:

Title:	Two-class support vector machine with new kernel function based on paths of features for predicting chemical activity
Authors:	Ahmed H Abu El-Atta, Aboul Ella Hassanien
Year:	2017
Keywords:	Chemoinformatics; Graph kernel; Paths of features; Relationship between features; Activity prediction
Journal:	Information Sciences
Volume:	403-404
Issue:	Not Available
Pages:	42-54
Publisher:	Elsevier
Local/International:	International
Paper Link:	https://www.sciencedirect.com/science/article/pii/S0020025517306448
Full paper	Not Available
Supplementary materials	Not Available

Abstract:

Techniques from information and computer science, such as machine learning and graph theory, are applied in chemoinformatics to discover the properties of chemical compounds. This paper presents a new algorithm based on the two-class support vector machine (SVM) model, with new kernel functions defined on paths of features, for predicting the activity of chemical compounds. First, we extract all paths of features (star subgraphs) of certain lengths and encode them according to their structure in the graphs. We then use these codes to construct two relationship matrices between the paths; these matrices contain the common and the different sub-paths between paths of stars. The number of sub-paths/paths for each compound is passed to the proposed kernel functions in the two-class SVM to predict the activity of chemical compounds. The relationship matrices created by the proposed algorithm help to reduce the number of features, which improves prediction accuracy. We apply the proposed algorithm with and without feature selection to two benchmark datasets: the monoamine oxidase (MAO) dataset and the AIDS antiviral screen dataset of active compounds, which contain 68 and 2000 chemical compounds, respectively. We compare the proposed kernel functions with many other two-class SVM prediction methods. Before feature selection, the prediction accuracies are 94% for MAO and 99.5% for AIDS; after feature selection, they are 96% and 99.5%, respectively.
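
To give a rough feel for what a path-based graph kernel computes, the following is a minimal sketch of a generic path-count kernel on labelled molecular graphs. It is not the kernel proposed in the paper: the star-subgraph encoding and the two relationship matrices are not reproduced here. The function names `label_paths` and `path_kernel`, the toy molecules, and the choice of `max_len` are all assumptions made for illustration only.

```python
from collections import Counter


def label_paths(labels, edges, max_len):
    """Count label sequences of simple paths with at most max_len edges.

    labels: dict mapping integer vertex id -> atom label (hypothetical encoding)
    edges:  list of undirected (u, v) pairs
    """
    adj = {v: set() for v in labels}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    counts = Counter()

    def extend(path):
        # A path is reached from both of its ends; count it only once,
        # from the end with the smaller vertex id.
        if len(path) == 1 or path[0] < path[-1]:
            seq = tuple(labels[v] for v in path)
            # Store one canonical orientation of the label sequence.
            counts[min(seq, seq[::-1])] += 1
        if len(path) - 1 < max_len:
            for nxt in adj[path[-1]]:
                if nxt not in path:
                    extend(path + [nxt])

    for v in labels:
        extend([v])
    return counts


def path_kernel(c1, c2):
    """Dot product of two path-count vectors (a generic count kernel)."""
    return sum(n * c2[p] for p, n in c1.items())


# Toy heavy-atom skeletons (illustrative only, not taken from MAO or AIDS).
ethanol = label_paths({0: "C", 1: "C", 2: "O"}, [(0, 1), (1, 2)], max_len=2)
methanol = label_paths({0: "C", 1: "O"}, [(0, 1)], max_len=2)

print(path_kernel(ethanol, methanol))  # shared paths: C (2x1), O (1x1), C-O (1x1)
```

A Gram matrix built from such pairwise kernel values could then be handed to any two-class SVM implementation that accepts a precomputed kernel, with the compounds' active/inactive labels as targets.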
