Malware detection using federated learning and incremental learning

Document Type : Original Article

Authors

1 Master's student, Bu-Ali Sina University, Hamedan, Iran

2 Assistant Professor, Bu-Ali Sina University, Hamedan, Iran

3 Associate Professor, Bu-Ali Sina University, Hamedan, Iran

Abstract

Android-based mobile devices are widely used because they are easy to use. People rely on their phones for banking, social networking, and a range of business applications, so vulnerabilities in the Android operating system put a considerable amount of personal information at risk. The rapid development of Android malware has made many traditional detection methods less accurate over time. Research indicates that machine learning is an effective approach for detecting malware, but the rapid evolution of malware also degrades the accuracy of trained models over time. Moreover, collecting malware-related data from Android devices jeopardizes users' privacy. To address these issues, this paper employs federated learning and incremental learning. Federated learning has recently been introduced for training machine learning models on decentralized devices with the aim of preserving privacy. This study uses a Multi-Layer Perceptron (MLP) within the federated learning framework and applies stacking, a type of ensemble learning, for incremental learning. The CICMalDroid 2020 dataset is used, with static data serving to develop the final model. The resulting model reaches an accuracy of 96.49% and shows a significant improvement in computational time complexity while maintaining learning quality and model accuracy compared with existing methods.
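
To make the federated setting concrete, the following is a minimal sketch of the server-side aggregation step used in FedAvg-style federated learning, where each device trains locally and only its model weights are sent to the server. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not say which aggregation rule is used, and the function name `federated_average`, the flat weight vectors, and the sample counts are all hypothetical.

```python
from typing import List, Sequence, Tuple

# Hypothetical type: one client's flattened model weights plus the number
# of local training samples behind them. Raw data never leaves the device.
ClientUpdate = Tuple[List[float], int]


def federated_average(client_updates: Sequence[ClientUpdate]) -> List[float]:
    """Sample-weighted average of client weight vectors (FedAvg-style).

    Assumption: every client trains the same architecture, so all weight
    vectors have the same length.
    """
    if not client_updates:
        raise ValueError("need at least one client update")

    total_samples = sum(n for _, n in client_updates)
    if total_samples <= 0:
        raise ValueError("total sample count must be positive")

    dim = len(client_updates[0][0])
    averaged = [0.0] * dim
    for weights, n in client_updates:
        if len(weights) != dim:
            raise ValueError("all clients must share the same model shape")
        # Clients with more local data contribute proportionally more.
        for i, w in enumerate(weights):
            averaged[i] += w * n / total_samples
    return averaged


# Two illustrative clients with made-up weights and sample counts.
updates = [([1.0, 2.0], 100), ([3.0, 4.0], 300)]
global_weights = federated_average(updates)
print(global_weights)
```

With these made-up inputs, the second client holds three quarters of the samples, so the aggregated weights land closer to its values. In a real system the vectors would be the flattened parameters of each device's locally trained MLP.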



Volume 13, Issue 1 - Serial Number 49
Spring
April 2025
Pages 117-130
  • Receive Date: 02 December 2024
  • Revise Date: 24 January 2025
  • Accept Date: 03 March 2025
  • Publish Date: 21 April 2025