Data are rarely perfect, whether the problem is data entry errors or rare events. Outliers have
two opposing properties. They can be noise that disturbs regression and classification tasks. On
the other hand, they can provide valuable information about rare phenomena, which can lead to
knowledge discovery. This paper proposes a hybrid algorithm, KSVM, that combines K Nearest
Neighbor (KNN) and Support Vector Machine (SVM) to detect outliers by taking advantage of
both intelligent techniques. A global efficiency measure is also introduced to compare
different methods. Finally, a comparison
between KNN, SVM, and KSVM is conducted using detection rate, accuracy rate, false alarm
rate, true negative rate and the proposed global efficiency measure on a benchmark data set
called Milk data.
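
As an illustration of the four standard rates named above, the following minimal sketch computes them from binary labels. It assumes the conventional confusion-matrix definitions (detection rate = TP/(TP+FN), false alarm rate = FP/(FP+TN)), which may differ from the exact definitions used in the paper. The function name `outlier_detection_rates`, the label encoding (1 = outlier, 0 = normal) and the toy labels are hypothetical. The proposed global efficiency measure is not defined in this abstract, so it is not reproduced here.

```python
def outlier_detection_rates(y_true, y_pred):
    """Compute standard rates for outlier detection.

    Assumed encoding (hypothetical): 1 = outlier, 0 = normal observation.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # outliers correctly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # outliers missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # normal points wrongly flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normal points correctly passed

    def ratio(num, den):
        # Return NaN instead of raising when a class is absent.
        return num / den if den else float("nan")

    return {
        "detection_rate": ratio(tp, tp + fn),
        "accuracy_rate": ratio(tp + tn, tp + tn + fp + fn),
        "false_alarm_rate": ratio(fp, fp + tn),
        "true_negative_rate": ratio(tn, fp + tn),
    }


# Toy labels, invented for illustration only (not taken from the Milk data).
y_true = [1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 0, 0, 0, 1, 0, 1, 0]
rates = outlier_detection_rates(y_true, y_pred)
for name, value in rates.items():
    print(f"{name}: {value:.3f}")
```

Under these definitions the false alarm rate and the true negative rate always sum to one, so reporting both mainly serves readability when methods are compared side by side.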