Malware detection using federated learning and incremental learning

Article type: Research article

Authors

1 Master's student, Bu-Ali Sina University, Hamedan, Iran

2 Assistant Professor, Bu-Ali Sina University, Hamedan, Iran

3 Associate Professor, Bu-Ali Sina University, Hamedan, Iran

Abstract

Android-based mobile devices have a very large user base because they are convenient to use. People do many things on their phones, including banking, social networking, and working with a wide variety of business systems, and as a result a great deal of their personal information is put at risk by vulnerabilities in the Android operating system. Because Android malware develops quickly, many traditional malware detection methods have lost their accuracy. Research shows that machine learning is an effective way to detect malware. However, the rapid development of malware causes the accuracy of trained models to decline after a while. In addition, collecting malware-related data from Android devices endangers users' privacy. To solve these problems, this paper uses incremental and federated learning. Federated learning was recently introduced for training machine learning models on decentralized devices with the goal of preserving privacy. This paper uses a neural network (MLP) within the federated learning framework. For incremental learning it uses stacking, which is a type of ensemble learning. The study uses the CICMalDroid 2020 dataset and builds the final model from static data. The result is a model with an accuracy of 96.49%, and comparison with existing methods indicates a considerable improvement in computational time complexity while preserving learning quality and model accuracy.
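The abstract does not say how the per-device models are combined. As a minimal sketch, assuming the common sample-size-weighted averaging rule (FedAvg-style), the server-side aggregation step could look like the following. The function name `federated_average`, the flat per-layer weight layout, and the two example devices are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

# Hypothetical layout: a model is a list of layers, each a flat list of floats.
Weights = List[List[float]]


def federated_average(client_weights: Sequence[Weights],
                      client_sizes: Sequence[int]) -> Weights:
    """Average per-client parameters, weighting each client by its sample count."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    averaged: Weights = []
    for layer_idx in range(len(client_weights[0])):
        layer = [0.0] * len(client_weights[0][layer_idx])
        for weights, n in zip(client_weights, client_sizes):
            for j, value in enumerate(weights[layer_idx]):
                layer[j] += value * n / total
        averaged.append(layer)
    return averaged


# Two hypothetical devices holding 100 and 300 local samples.
device_a = [[0.0, 4.0], [8.0]]
device_b = [[4.0, 0.0], [0.0]]
global_weights = federated_average([device_a, device_b], [100, 300])
print(global_weights)  # → [[3.0, 1.0], [2.0]]
```

Only model parameters and sample counts cross the network in this scheme. The raw app data stays on each device, which is the privacy property the abstract appeals to.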

Article title [English]

Malware detection using federated learning and incremental learning

Authors [English]

  • Mohammadali Eftekhari 1
  • Morteza Yousef Sanati 2
  • Muharram Mansoorizadeh 3
1 Master's student, Bu-Ali Sina University, Hamedan, Iran
2 Assistant Professor, Bu-Ali Sina University, Hamedan, Iran
3 Associate Professor, Bu-Ali Sina University, Hamedan, Iran
Abstract [English]

Android-based mobile devices are widely used because they are easy to use. People carry out many tasks on their phones, including banking, social networking, and a wide range of business applications, so a large amount of personal information is exposed to risk through vulnerabilities in the Android operating system. Because Android malware evolves rapidly, many traditional malware detection methods have lost accuracy. Research indicates that machine learning is an effective approach to detecting malware, but the same rapid evolution degrades the accuracy of trained models over time. Moreover, collecting malware-related data from Android devices jeopardizes users' privacy. To address these issues, this paper employs federated and incremental learning. Federated learning was recently introduced for training machine learning models on decentralized devices with the aim of preserving privacy. This study uses a Multi-Layer Perceptron (MLP) within a federated learning framework. Stacking, a type of ensemble learning, is employed for incremental learning. The study uses the CICMalDroid 2020 dataset and builds the final model from static data. The resulting model reaches an accuracy of 96.49%, and comparison with existing methods shows a significant improvement in computational time complexity while maintaining learning quality and model accuracy.
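The abstract names stacking as the mechanism for incremental learning but gives no details. The sketch below shows one way such a scheme can work, under stated assumptions: each new batch of data gets its own base model, earlier base models are frozen, and only a small meta-learner over the base models' outputs is refit. The nearest-centroid base learner, the logistic-regression meta-learner, the two-feature toy batches, and the reuse of all seen samples to refit the meta-learner are illustrative choices, not the authors' method.

```python
import math
from typing import Callable, List, Sequence, Tuple

Sample = Sequence[float]
BaseModel = Callable[[Sample], float]


def train_base(X: List[Sample], y: List[int]) -> BaseModel:
    """Fit a nearest-centroid classifier on one batch (needs both classes)."""
    def centroid(label: int) -> List[float]:
        rows = [x for x, t in zip(X, y) if t == label]
        return [sum(col) / len(rows) for col in zip(*rows)]

    c0, c1 = centroid(0), centroid(1)

    def model(x: Sample) -> float:
        d0 = sum((a - b) ** 2 for a, b in zip(x, c0))
        d1 = sum((a - b) ** 2 for a, b in zip(x, c1))
        return 1.0 if d1 < d0 else 0.0

    return model


def fit_meta(models, X, y, epochs=200, lr=0.5) -> Tuple[List[float], float]:
    """Logistic-regression meta-learner trained on the base models' outputs."""
    feats = [[m(x) for m in models] for x in X]
    w, b = [0.0] * len(models), 0.0
    for _ in range(epochs):
        for f, t in zip(feats, y):
            z = b + sum(wi * fi for wi, fi in zip(w, f))
            g = 1.0 / (1.0 + math.exp(-z)) - t  # gradient of log loss w.r.t. z
            w = [wi - lr * g * fi for wi, fi in zip(w, f)]
            b -= lr * g
    return w, b


def predict(models, meta, x) -> int:
    w, b = meta
    return 1 if b + sum(wi * m(x) for wi, m in zip(w, models)) > 0 else 0


# Hypothetical two-feature batches: label 1 = malware, 0 = benign.
batch1 = ([(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (0.9, 0.0)], [0, 0, 1, 1])
batch2 = ([(0.0, 0.0), (0.0, 0.1), (0.0, 1.0), (0.0, 0.9)], [0, 0, 1, 1])

models: List[BaseModel] = []
seen_X: List[Sample] = []
seen_y: List[int] = []
for X, y in (batch1, batch2):
    models.append(train_base(X, y))          # earlier base models stay frozen
    seen_X += X
    seen_y += y
    meta = fit_meta(models, seen_X, seen_y)  # only the meta-learner is refit

print(predict(models, meta, (0.95, 0.0)))   # malware resembling batch 1
print(predict(models, meta, (0.0, 0.95)))   # malware resembling batch 2
print(predict(models, meta, (0.05, 0.05)))  # benign-looking sample
```

A production stacking setup would normally train the meta-learner on held-out predictions rather than on the data the base models were fitted on. The sketch skips that step to stay short.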

Keywords [English]

  • malware detection
  • machine learning
  • federated learning
  • incremental learning
  • distribution

