Improving the speed of the intrusion detection system performance by reducing the data volume using kernel-based DBSCAN

Document Type: Original Article

Authors

1 Associate Professor, Yazd University, Yazd, Iran

2 Master's degree, Yazd University, Yazd, Iran

3 Assistant Professor, Yazd University, Yazd, Iran

Abstract

The Internet of Things (IoT) is a rapidly evolving technology that connects physical devices through networked systems. As it expands, however, it raises security challenges that call for appropriate solutions to protect sensitive information and user privacy. This paper focuses on speeding up intrusion detection systems (IDS), a critical component of IoT security. In an IDS, the large volume of data can slow down the learning process. Here, the DBSCAN clustering algorithm is modified by adding a minimum neighborhood parameter so that data samples are reduced in a targeted manner, with the aim of increasing IDS speed and reducing learning time and cost. The parameters of the modified DBSCAN are tuned with a genetic algorithm. Experimental results on the Kaggle and NSL_KDD datasets show that the proposed model maintains classification accuracy above 96% on the Kaggle dataset and above 92.51% on the NSL_KDD dataset, even with up to an 80% reduction in data volume. In addition, computation time decreased from 458.09 ms to 47.21 ms on the Kaggle dataset and from 995.2 ms to 223.60 ms on the NSL_KDD dataset. Thus, the gains in speed and the reductions in time and cost are achieved while the model's performance is preserved.
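
The idea of thinning dense regions before training can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not specify how the minimum neighborhood parameter is used, so the inner radius `min_eps`, the function names, and the toy data below are all assumptions made for illustration. The sketch applies the usual DBSCAN core-point test (`eps`, `min_pts`) and, around each kept core point, drops same-class neighbors that lie within `min_eps`. Parameters are fixed by hand here; the genetic-algorithm tuning described above is not shown.

```python
import math


def region_query(points, i, radius):
    """Indices of all points within `radius` of points[i] (including i)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= radius]


def density_reduce(points, labels, eps, min_pts, min_eps):
    """Thin dense regions and keep sparse ones.

    eps, min_pts: the standard DBSCAN core-point test.
    min_eps: hypothetical inner radius (an assumption, not from the paper);
             same-class neighbours this close to a kept core point are
             treated as redundant and dropped.
    Returns the indices of the kept samples in ascending order.
    """
    removed = set()
    kept = []
    for i in range(len(points)):
        if i in removed:
            continue
        kept.append(i)
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            continue  # not a core point: sparse region, nothing is dropped
        for j in neighbours:
            # Only later points can be dropped; earlier ones are already decided.
            if (j > i and labels[j] == labels[i]
                    and math.dist(points[i], points[j]) <= min_eps):
                removed.add(j)
    return kept


# Toy data (assumed): a dense 10 x 10 grid of "normal" samples (label 0)
# plus three isolated "attack" samples (label 1).
points = [(a * 0.1, b * 0.1) for a in range(10) for b in range(10)]
labels = [0] * len(points)
points += [(5.0, 5.0), (7.0, 1.0), (9.0, 9.0)]
labels += [1, 1, 1]

kept = density_reduce(points, labels, eps=0.25, min_pts=5, min_eps=0.15)
print(f"kept {len(kept)} of {len(points)} samples")
```

In this toy setting the isolated samples are never dropped, because they fail the core-point test, while the dense grid is thinned. The reduced set `[points[i] for i in kept]` would then be passed to the classifier in place of the full data.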




Volume 12, Issue 4 - Serial Number 48
Winter
February 2025
Pages 89-102
  • Receive Date: 05 October 2024
  • Revise Date: 12 December 2024
  • Accept Date: 12 January 2025
  • Publish Date: 01 February 2025