Improving the speed of the intrusion detection system by reducing data volume using kernel-based DBSCAN

Article type: Research article

Authors

1 Associate Professor, Yazd University, Yazd, Iran

2 Master's degree, Yazd University, Yazd, Iran

3 Assistant Professor, Yazd University, Yazd, Iran

Abstract

The Internet of Things is a rapidly evolving technology that connects physical devices through networked systems. As it continues to expand, however, it creates a range of security challenges that call for suitable solutions to protect sensitive information and user privacy. This paper focuses on improving the speed of the intrusion detection system as a vital solution for Internet of Things security. In intrusion detection systems, a large volume of data slows down learning. In this paper, the DBSCAN clustering algorithm is modified by adding a minimum-neighborhood parameter for targeted sample reduction, with the aim of increasing the speed of the intrusion detection system and reducing the time and cost of learning. The parameters of the modified DBSCAN are tuned with a genetic algorithm. Experimental results on the Kaggle and NSL_KDD datasets show that, while reducing the data volume by up to 80%, the proposed model keeps classification accuracy above 96% for the Kaggle dataset and above 92.51% for the NSL_KDD dataset. In addition, computation time fell from 458.09 ms to 47.21 ms for the Kaggle dataset and from 995.2 ms to 223.60 ms for the NSL_KDD dataset. Thus, despite the gain in speed and the reduction in time and cost, the model's satisfactory performance is preserved.


Article title [English]

Improving the speed of the intrusion detection system performance by reducing the data volume using kernel-based DBSCAN

Authors [English]

  • Seyed Abolfazl Shahzadeh Fazeli 1
  • Azam Ghoveh Nodoushan 2
  • Jamal Zarepour-Ahmadabadi 3
1 Associate Professor, Yazd University, Yazd, Iran
2 Master's degree, Yazd University, Yazd, Iran
3 Assistant Professor, Yazd University, Yazd, Iran
Abstract [English]

The Internet of Things (IoT) is a rapidly evolving technology that connects physical devices through networked systems. As IoT continues to expand, however, it poses various security challenges that require appropriate solutions to protect sensitive information and user privacy. This paper focuses on improving the speed of intrusion detection systems (IDS) as a critical solution for IoT security. In an IDS, a large volume of data slows down the learning process. In this paper, the DBSCAN clustering algorithm is modified by adding a minimum neighborhood parameter to reduce data samples in a targeted manner, aiming to increase the speed of the IDS and reduce learning time and cost. The parameters of the modified DBSCAN are tuned using a genetic algorithm. Experimental results on the Kaggle and NSL_KDD datasets demonstrate that the proposed model maintains classification accuracy above 96% for the Kaggle dataset and above 92.51% for the NSL_KDD dataset, even with up to an 80% reduction in data volume. Additionally, computation time decreased from 458.09 ms to 47.21 ms for the Kaggle dataset and from 995.2 ms to 223.60 ms for the NSL_KDD dataset. Thus, despite the gain in speed and the reduction in time and cost, the model's performance remains satisfactory.
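The idea of density-based sample reduction can be illustrated with a minimal sketch. This is not the authors' modified DBSCAN, whose details the abstract does not give; it is one hypothetical reading in which a sample that has at least `min_neighbors` same-class samples within radius `eps` absorbs them, while samples in sparse regions are always kept. The function name `reduce_by_density`, the toy data, and the parameter values are all assumptions made for illustration.

```python
import math


def reduce_by_density(points, labels, eps, min_neighbors):
    """Return indices of the samples kept after density-based thinning.

    Hypothetical rule (an assumption, not the paper's algorithm): a kept
    sample with at least `min_neighbors` same-label, not-yet-processed
    samples within distance `eps` removes those samples. Brute force, O(n^2).
    """
    n = len(points)
    removed = [False] * n
    kept = []
    for i in range(n):
        if removed[i]:
            continue
        kept.append(i)  # every sample that reaches this line survives
        # Same-label samples after position i that lie within eps of sample i.
        neighbors = [
            j for j in range(i + 1, n)
            if not removed[j]
            and labels[j] == labels[i]
            and math.dist(points[i], points[j]) <= eps
        ]
        # Only dense neighborhoods are thinned; sparse ones are left intact.
        if len(neighbors) >= min_neighbors:
            for j in neighbors:
                removed[j] = True
    return kept


# Toy data (illustrative only): one tight 5x5 grid per class,
# plus one isolated sample per class.
grid = [(0.1 * x, 0.1 * y) for x in range(5) for y in range(5)]
points = grid + [(px + 5.0, py + 5.0) for px, py in grid] + [(2.5, 2.5), (10.0, 0.0)]
labels = [0] * 25 + [1] * 25 + [0, 1]

kept = reduce_by_density(points, labels, eps=0.25, min_neighbors=3)
print(f"kept {len(kept)} of {len(points)} samples")
```

In a pipeline like the one the abstract describes, the kept indices would select the training subset passed to the classifier, and `eps` and `min_neighbors` would be the values searched by the genetic algorithm rather than fixed by hand as they are here.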

Keywords [English]

  • Internet of Things
  • Intrusion detection system
  • Clustering
  • Classification and regression tree algorithm
  • DBSCAN
  • Data reduction
  • RN_DBSCAN
  • Genetic algorithm
  • Kaggle dataset
  • NSL_KDD dataset

