DEKF-Driven Few-Shot Class Imbalance Learning to Enhance Cyber Attack Detection

Document Type: Original Article

Authors

1 PhD Student, Malek Ashtar University of Technology, Tehran, Iran

2 Assistant Professor, Malek Ashtar University of Technology, Tehran, Iran

Abstract

The escalating use of networks and the internet has led to a surge in cyber threats, making it imperative to develop sophisticated intrusion detection systems (IDS) capable of safeguarding against malicious intrusions. While machine learning techniques have been widely employed to enhance IDS, challenges persist, notably handling imbalanced datasets and detecting rare attacks such as R2L and U2R, which have very few samples in the training data. Imbalanced datasets, a common challenge in IDS evaluation, skew learning toward the majority classes and hinder the detection of minority-class attacks. Existing machine learning classifiers, being primarily accuracy-driven, struggle to identify rare attacks, which are often the most damaging. Moreover, overlapping classes complicate feature selection, further impeding accurate detection. To tackle these challenges, this article proposes a solution rooted in Few-Shot Learning, particularly Model-Agnostic Meta-Learning (MAML). Traditional MAML has limitations, including slow convergence and high computational demands. To improve MAML's performance, the article introduces the Node-Decoupled Extended Kalman Filter (NDEKF) as an alternative to gradient descent in the inner loop. NDEKF optimizes MAML training, offering faster convergence and improved generalization, while the decoupled formulation (DEKF) simplifies the covariance calculations, making it tractable for deep neural networks. The combination of MAML and NDEKF, termed NDEKF-based MAML, is applied to the imbalanced-data problem in IDS. The proposed approach is evaluated on the NSL-KDD dataset, demonstrating its potential to improve rare-attack detection in intrusion detection systems. With this approach, we achieved faster convergence, better generalization, and higher accuracy than the original MAML algorithm on a sparse and imbalanced dataset such as NSL-KDD. In particular, our framework showed significant gains in detecting the rare U2R and R2L attacks: compared with the original MAML, the accuracy for R2L attacks rose from 61% to 75% and for U2R attacks from 51% to 66%, even with a reduced number of training epochs.
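To make the core idea concrete, the following minimal sketch (not the authors' released code; the names ndekf_update, Q, and R are illustrative assumptions) shows a node-decoupled EKF weight update of the kind that, in the proposed framework, replaces the gradient-descent step inside MAML's inner loop. Each node's incoming weights keep their own small covariance matrix, which is what makes the decoupled variant tractable for deep networks.

    import numpy as np

    def ndekf_update(w, P, H, innovation, Q=1e-4, R=1.0):
        # w: (n,) weights feeding one node; P: (n, n) covariance of those weights
        # H: (m, n) Jacobian of the network outputs w.r.t. w at the current sample
        # innovation: (m,) target minus prediction on a support-set sample
        S = H @ P @ H.T + R * np.eye(H.shape[0])        # innovation covariance
        K = P @ H.T @ np.linalg.inv(S)                  # Kalman gain for this node
        w_new = w + K @ innovation                      # measurement update of the weights
        P_new = P - K @ H @ P + Q * np.eye(P.shape[0])  # covariance update plus process noise
        return w_new, P_new

Under this reading of the abstract, a few such per-node updates on each task's support set take the place of the inner-loop gradient steps, while the outer meta-update of the shared initialization is left unchanged.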




Volume 12, Issue 3 - Serial Number 47
November 2024
Pages 119-137
  • Receive Date: 23 May 2024
  • Revise Date: 06 September 2024
  • Accept Date: 07 October 2024
  • Publish Date: 22 October 2024