Evaluating Deep Learning Models for Test Data Generation in File-Based Fuzzers

Document Type : Original Article

Authors

1 PhD student, Imam Hossein University (AS), Tehran, Iran

2 Assistant Professor, Imam Hossein University (AS), Tehran, Iran

Abstract

Fuzzing means repeatedly running the program under test with modified inputs in order to find its vulnerabilities. If the program has a complex input structure, generating modified inputs for fuzzing is not an easy task. The best solution in such cases is to use the input structure of the program under test to produce accurate test data. The problem is that documentation of that input structure may not be available, and human analysis of such complex structures is hard to achieve, costly, time-consuming, and error-prone. To overcome these problems, this research proposes the use of machine learning and deep neural networks, which automatically learn the complex structure of program inputs and generate test data tailored to that structure. One of the main challenges in this field is choosing a deep learning model that suits the intended application. In this paper, deep learning models suitable for learning input structure and generating test data in file-based fuzzers are studied, and their performance is evaluated by introducing and applying appropriate evaluation parameters. The recurrent neural network and its derivatives are identified as the best deep learning models for textual data. The evaluation parameters considered are training time, loss value during training, and evaluation time. The loss value, as the main parameter, is measured first across different deep learning models with the same structure and then across the same deep learning models with varying structures, and on this basis the best deep learning model is selected and proposed.
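As a lightweight illustration of the idea of learning an input format from sample files and generating structurally similar test data, the sketch below uses a byte-level n-gram model in place of the recurrent networks studied in the paper. All function names and the toy "HDR;...;END" file format are hypothetical, chosen only to make the example self-contained.

```python
import random
from collections import Counter, defaultdict

def learn_ngram_model(samples, n=3):
    """Count which byte follows each (n-1)-byte context in the sample files."""
    model = defaultdict(Counter)
    for data in samples:
        padded = b"\x00" * (n - 1) + data  # pad so the file's first byte has a context
        for i in range(len(data)):
            ctx = bytes(padded[i:i + n - 1])
            model[ctx][padded[i + n - 1]] += 1
    return model

def generate_test_input(model, n=3, length=64, seed=0):
    """Sample a new input byte-by-byte from the learned distribution."""
    rng = random.Random(seed)
    out = bytearray(b"\x00" * (n - 1))
    for _ in range(length):
        ctx = bytes(out[-(n - 1):])
        counter = model.get(ctx)
        if not counter:  # unseen context: fall back to a uniformly random byte
            out.append(rng.randrange(256))
            continue
        next_bytes, weights = zip(*counter.items())
        out.append(rng.choices(next_bytes, weights=weights)[0])
    return bytes(out[n - 1:])  # strip the padding prefix

# Train on a few tiny "files" that share a simple header/footer structure.
samples = [b"HDR;key=value;END", b"HDR;foo=bar;END"]
model = learn_ngram_model(samples)
fuzz_input = generate_test_input(model)
```

A neural model such as an LSTM plays the same role as the n-gram table here, but generalizes to much longer-range structure; the trade-offs between such models (training time, loss, evaluation time) are what the paper compares.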

Keywords



  • Receive Date: 11 May 2021
  • Revise Date: 04 December 2021
  • Accept Date: 09 August 2022
  • Publish Date: 23 September 2022