Learning on Imbalanced Data Using a DEKF-Based Few-Shot Learning Method to Improve Intrusion Detection of Cyber Attacks

Saleh, Fatat; Dadashtabar Ahmadi, Kourosh; Keyvanrad, Mohammad Ali


Article type: Research article

Authors

¹ PhD student, Malek Ashtar University of Technology, Tehran, Iran

² Assistant Professor, Malek Ashtar University of Technology, Tehran, Iran

Abstract

While machine learning techniques are widely used to improve intrusion detection systems for cyber attacks, numerous challenges remain, particularly in handling imbalanced datasets and in detecting rare attacks such as R2L and U2R, which have too few samples in the training datasets. Imbalanced datasets are a common and persistent challenge when evaluating the performance of intrusion detection systems, and they often bias the model toward the majority class; this in turn hinders the identification of minority-class attacks. Machine learning classifiers, which are mainly accuracy-driven, are likewise unable to identify rare cyber attacks. In addition, class overlap complicates feature selection and hinders accurate intrusion detection. To address these challenges, this article presents a solution rooted in few-shot learning, in particular Model-Agnostic Meta-Learning (MAML). The traditional MAML algorithm has limitations, including slow convergence and computational requirements. To improve the performance of MAML, this article introduces the Node Decoupled Extended Kalman Filter (NDEKF) as a replacement for gradient descent in the inner loop. The NDEKF algorithm optimizes MAML training, accelerates convergence, and improves generalization. This filter simplifies the computations and makes them suitable for deep neural networks. To solve the imbalanced data problem in intrusion detection systems, a combination of the MAML and NDEKF algorithms, termed MAML-NDEKF, is used. The proposed approach is evaluated on the NSL-KDD dataset. With this approach, convergence becomes faster, generalization ability increases, and higher accuracy is achieved than with the original MAML algorithm on a sparse and unstable dataset such as NSL-KDD. In particular, the proposed framework shows significant improvements in accurately detecting R2L and U2R attacks. Despite a reduced number of training epochs, the detection accuracy for R2L and U2R attacks is higher than that of the original MAML, rising from 61% to 75% and from 51% to 66%, respectively.
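The point that accuracy-driven classifiers miss rare attacks can be illustrated with a minimal sketch. The class counts below are hypothetical and chosen only to show the effect; they are not NSL-KDD figures, and the helper names `accuracy` and `per_class_recall` are illustrative, not taken from the article.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of all samples that are predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def per_class_recall(y_true, y_pred):
    """Return {class: recall}, the fraction of each class's samples predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: hits[c] / totals[c] for c in totals}

# Hypothetical, heavily imbalanced labels (illustrative counts only).
y_true = ["normal"] * 950 + ["R2L"] * 40 + ["U2R"] * 10

# A degenerate classifier that always predicts the majority class.
y_pred = ["normal"] * len(y_true)

acc = accuracy(y_true, y_pred)
recall = per_class_recall(y_true, y_pred)
print("accuracy:", acc)
print("per-class recall:", recall)
```

Under these assumed counts the overall accuracy is 0.95 while the recall for both rare classes is 0.0, which is why per-class results for R2L and U2R are reported separately rather than a single accuracy figure.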

Keywords

DOR: 20.1001.1.23224347.1403.12.3.10.2

Subjects

Cyber defense

Article title [English]

DEKF-Driven Few-Shot Class Imbalance Learning to Enhance Cyber Attack Detection

Authors [English]

Fatat Saleh ¹
Kourosh Dadashtabar Ahmadi ²
Mohammad Ali Keyvanrad ²

¹ PhD student, Malek Ashtar University of Technology, Tehran, Iran

² Assistant Professor, Malek Ashtar University of Technology, Tehran, Iran

Abstract [English]

The escalating use of networks and the internet has led to a surge in cyber threats, making it imperative to develop sophisticated intrusion detection systems (IDS) capable of safeguarding against malicious intrusions. While machine learning techniques have been extensively employed to enhance IDS, challenges persist, notably in handling imbalanced datasets and in detecting rare attacks such as R2L and U2R, which have few samples in the training dataset. Imbalanced datasets, a common challenge in IDS evaluation, skew classifiers toward the majority classes and hinder the detection of minority-class attacks. Existing machine learning classifiers, being primarily accuracy-driven, struggle to identify rare attacks, which are often the more catastrophic ones. Moreover, overlapping classes complicate feature selection, further impeding accurate detection. To tackle these challenges, this article proposes a solution rooted in few-shot learning, specifically Model-Agnostic Meta-Learning (MAML). Traditional MAML has limitations, including slow convergence and heavy computational demands. To enhance MAML's performance, the article introduces the Node Decoupled Extended Kalman Filter (NDEKF) as an alternative to gradient descent in the inner loop. NDEKF optimizes MAML training, offering faster convergence and improved generalization. The Decoupled Extended Kalman Filter (DEKF) variant simplifies the calculations, making it suitable for deep neural networks. The combination of MAML and NDEKF, termed NDEKF-based MAML, is applied to the imbalanced data problem in IDS. The proposed approach is evaluated on the NSL-KDD dataset, demonstrating its potential to improve rare attack detection in intrusion detection systems. With this approach, we achieved faster convergence, better generalization, and higher accuracy than the original MAML algorithm on a sparse and unstable dataset such as NSL-KDD. In particular, our framework showed significant improvements in accurately detecting rare U2R and R2L attacks: compared with the original MAML, accuracy rose from 61% to 75% for R2L attacks and from 51% to 66% for U2R attacks, even with a reduced number of training epochs.
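To make the idea of replacing the inner-loop gradient step with a node-decoupled EKF update more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the tiny network (tanh hidden nodes, one sigmoid output), the noise parameters `r` and `q`, the initial covariance `p0`, and the toy support set are all assumptions made for illustration, and the MAML outer loop (the meta-update of the initial weights) is omitted. Each node keeps its own weight vector and its own covariance block, and all nodes share one scalar gain factor.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(P, v):
    return [dot(row, v) for row in P]

def forward(nodes, x):
    """Tiny net: tanh hidden nodes and one sigmoid output node (bias stored last)."""
    xb = x + [1.0]
    h = [math.tanh(dot(w, xb)) for w in nodes["hidden"]]
    hb = h + [1.0]
    y = 1.0 / (1.0 + math.exp(-dot(nodes["out"], hb)))
    return xb, h, hb, y

def ndekf_step(nodes, covs, x, target, r=1.0, q=1e-4):
    """One node-decoupled EKF update on a single (x, target) pair."""
    xb, h, hb, y = forward(nodes, x)
    dy = y * (1.0 - y)  # sigmoid derivative at the output node
    # H_i: derivative of the network output w.r.t. the weights of node i.
    H = [[dy * nodes["out"][j] * (1.0 - h[j] ** 2) * v for v in xb]
         for j in range(len(h))]
    H.append([dy * v for v in hb])
    weights = nodes["hidden"] + [nodes["out"]]  # same inner lists, updated in place
    PH = [matvec(P, Hi) for P, Hi in zip(covs, H)]
    # Shared scalar factor A = 1 / (r + sum_i H_i^T P_i H_i).
    a = 1.0 / (r + sum(dot(Hi, v) for Hi, v in zip(H, PH)))
    err = target - y
    for w, P, v in zip(weights, covs, PH):
        for k in range(len(w)):
            w[k] += a * v[k] * err              # w_i += K_i * err, K_i = P_i H_i A
            for l in range(len(w)):
                P[k][l] -= a * v[k] * v[l]      # P_i -= K_i H_i^T P_i (P_i symmetric)
            P[k][k] += q                        # small artificial process noise
    return err

def mse(nodes, data):
    return sum((t - forward(nodes, x)[3]) ** 2 for x, t in data) / len(data)

# Hypothetical setup: 2 inputs, 3 hidden nodes, small random initial weights.
random.seed(0)
n_in, n_hidden, p0 = 2, 3, 10.0
nodes = {
    "hidden": [[random.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
               for _ in range(n_hidden)],
    "out": [random.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)],
}
# One covariance block per node, initialised to p0 * I.
covs = [[[p0 if k == l else 0.0 for l in range(len(w))] for k in range(len(w))]
        for w in nodes["hidden"] + [nodes["out"]]]

# Toy support set standing in for a few-shot task (label equals the first feature).
support = [([0.0, 0.0], 0.0), ([0.0, 1.0], 0.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]

before = mse(nodes, support)
for _ in range(20):                  # a few inner-loop passes over the support set
    for x, t in support:
        ndekf_step(nodes, covs, x, t)
after = mse(nodes, support)
print("support MSE before:", before, "after:", after)
```

Keeping a separate covariance block per node avoids storing one full covariance matrix over all weights, which is the simplification that makes this family of filters practical for larger networks.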

Keywords [English]

Intrusion detection system
Class Imbalance Learning
Few-Shot Learning MAML
Decoupled Extended Kalman Filter
