
Prof. Lamiaa Abdallah Ahmed Elrefaei :: Publications:

Title:
M. H. Habeb, M. Salama and L. A. Elrefaei, "An Attention-Based Unsupervised Framework for Video Anomaly Detection in Large Heterogeneous Environments," 2024 International Mobile, Intelligent, and Ubiquitous Computing Conference (MIUCC), Cairo, Egypt, 2024, pp. 113-120, doi: 10.1109/MIUCC62295.2024.10783549.
Authors: M. H. Habeb, M. Salama and L. A. Elrefaei
Year: 2024
Keywords: Not Available
Journal: Not Available
Volume: Not Available
Issue: Not Available
Pages: 113-120
Publisher: IEEE
Local/International: International
Paper Link:
Full paper: Not Available
Supplementary materials: Not Available
Abstract:

Although modern video anomaly detection systems frequently include automated detection capabilities, these algorithms may still suffer from high false-positive rates and difficulty understanding complex scenes. Handling large heterogeneous datasets captured under varying lighting and environmental conditions, with differing resolution and quality, poses further obstacles for video anomaly detection systems. This paper proposes an end-to-end vision transformer (ViT)-based framework for video anomaly detection in surveillance systems. The proposed framework combines the ViT model with a spatio-temporal attention model augmented with Long Short-Term Memory (LSTM) layers to extract both the global context and the spatio-temporal features of the video frames. The framework is designed primarily for large, heterogeneous datasets that represent different environments, which remains challenging for current video anomaly detection efforts. To evaluate the proposed framework, three video anomaly benchmark datasets are used: the large ShanghaiTech dataset, the CUHK Avenue dataset, and the UCSD Ped2 dataset. The proposed framework achieved AUC-ROC values of 79.6%, 83.4%, and 93.8% on the three datasets, respectively. This demonstrates the superior performance of the proposed framework in identifying video anomalies compared to state-of-the-art methods, especially on large heterogeneous datasets such as ShanghaiTech.
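
The architecture described in the abstract (a ViT backbone for global frame context, LSTM layers for temporal modelling, and attention over the spatio-temporal features) could be prototyped roughly as below. This is a minimal PyTorch-style sketch assuming a timm ViT backbone, an LSTM over per-frame features, and a simple additive temporal-attention pooling; all module names, dimensions, and the scoring head are illustrative assumptions and do not reproduce the authors' implementation.

# Illustrative sketch of a ViT + LSTM + temporal-attention anomaly scorer.
# Assumes PyTorch and the timm library; layer names, sizes, and the scoring
# head are hypothetical and are not the authors' code.
import torch
import torch.nn as nn
import timm


class ViTLSTMAnomalyScorer(nn.Module):
    def __init__(self, vit_name="vit_base_patch16_224", hidden_dim=256, pretrained=False):
        super().__init__()
        # ViT backbone extracts one global feature vector per frame.
        self.backbone = timm.create_model(vit_name, pretrained=pretrained, num_classes=0)
        feat_dim = self.backbone.num_features
        # LSTM captures temporal dependencies across the frame features.
        self.lstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        # Additive attention weights pool the per-frame hidden states.
        self.attn = nn.Linear(hidden_dim, 1)
        # Scoring head maps the pooled representation to an anomaly score in [0, 1].
        self.head = nn.Sequential(nn.Linear(hidden_dim, 1), nn.Sigmoid())

    def forward(self, clip):
        # clip: (batch, frames, 3, 224, 224)
        b, t = clip.shape[:2]
        frame_feats = self.backbone(clip.flatten(0, 1)).view(b, t, -1)
        hidden, _ = self.lstm(frame_feats)                  # (b, t, hidden_dim)
        weights = torch.softmax(self.attn(hidden), dim=1)   # temporal attention weights
        pooled = (weights * hidden).sum(dim=1)              # (b, hidden_dim)
        return self.head(pooled).squeeze(-1)                # per-clip anomaly score


if __name__ == "__main__":
    model = ViTLSTMAnomalyScorer()
    scores = model(torch.randn(2, 8, 3, 224, 224))  # two clips of 8 frames each
    print(scores.shape)  # torch.Size([2])

In practice the anomaly score would be computed per clip (or per frame with a sliding window) and thresholded or ranked to flag abnormal events; the training objective and any reconstruction- or prediction-based components used in the paper are not shown here.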
