Automatic XSS Exploit Generation Using Grammatical Evolution

Document Type: Original Article

Authors

Assistant Professor, Computer Group, Imam Hussein Comprehensive University, Tehran, Iran

Abstract

Fuzzers reveal software vulnerabilities by generating test inputs and feeding them to the software under test. Grammar-based fuzzers search the space of test data that a grammar can generate for an attack vector capable of exploiting a vulnerability. The main challenge is that this search space is very large or even infinite, which makes finding such a vector a hard problem. Grammatical evolution (GE) is an evolutionary algorithm that uses a grammar to solve search problems. This research introduces a new approach that uses grammatical evolution to generate fuzz test inputs for exploiting cross-site scripting (XSS) vulnerabilities. To this end, a grammar for generating XSS attack vectors is presented, and a fitness function is proposed to guide GE in its search for an exploit. The method achieves automatic exploitation of vulnerabilities with a black-box approach. In the experiments, it discovered 19% more vulnerabilities than the white-box method NAVEX and the black-box tool ZAP, without any false positives.
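
The general idea of grammatical evolution described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the article: the toy grammar `GRAMMAR`, the stand-in application `toy_target`, the reflection-similarity `fitness`, and the mutation-only loop `evolve` are all hypothetical. The article's actual grammar and fitness function are not given in the abstract.

```python
import random
from difflib import SequenceMatcher

# Hypothetical toy grammar (NOT the grammar from the article).
# Keys are nonterminals; any symbol that is not a key is a terminal string.
GRAMMAR = {
    "VECTOR": [["BREAK", "PAYLOAD"], ["PAYLOAD"]],
    "BREAK": [['">'], ["'>"], ["</textarea>"]],
    "PAYLOAD": [
        ["<script>", "JS", "</script>"],
        ["<img src=x onerror=", "JS", ">"],
        ["<svg onload=", "JS", ">"],
    ],
    "JS": [["alert(1)"], ["confirm(1)"]],
}


def map_genome(genome, start="VECTOR", max_wraps=2):
    """Map integer codons to a string by leftmost derivation (GE mapping)."""
    symbols = [start]          # pending symbols, leftmost first
    output = []
    used = 0                   # codons consumed so far
    limit = len(genome) * (max_wraps + 1)
    while symbols:
        sym = symbols.pop(0)
        if sym not in GRAMMAR:             # terminal: emit as-is
            output.append(sym)
            continue
        choices = GRAMMAR[sym]
        if len(choices) == 1:              # no choice, no codon consumed
            production = choices[0]
        else:
            if used >= limit:              # ran out of codons: invalid
                return None
            codon = genome[used % len(genome)]   # wrap around the genome
            used += 1
            production = choices[codon % len(choices)]
        symbols = list(production) + symbols
    return "".join(output)


def toy_target(vector):
    """Hypothetical application under test: reflects input, strips script tags."""
    return vector.replace("<script>", "").replace("</script>", "")


def fitness(genome):
    """Assumed fitness: similarity between the sent vector and its reflection."""
    vector = map_genome(genome)
    if vector is None:
        return 0.0
    return SequenceMatcher(None, vector, toy_target(vector)).ratio()


def evolve(generations=20, pop_size=10, genome_len=6, seed=0):
    """Tiny mutation-only evolutionary loop that keeps the better half."""
    rng = random.Random(seed)
    population = [[rng.randrange(256) for _ in range(genome_len)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        children = []
        for parent in parents:
            child = parent[:]
            child[rng.randrange(genome_len)] = rng.randrange(256)  # point mutation
            children.append(child)
        population = parents + children
    best = max(population, key=fitness)
    return map_genome(best), fitness(best)


if __name__ == "__main__":
    print(map_genome([1, 0, 0]))
    print(evolve())
```

In this sketch the search only ever sees the reflected response, which mirrors the black-box setting described in the abstract. A real fitness function would have to judge whether the reflected vector can still execute, which string similarity alone does not establish.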

Keywords


Volume 9, Issue 2 - Serial Number 34, Summer Quarterly
June 2021
Pages 101-119
  • Receive Date: 07 September 2020
  • Revise Date: 04 October 2020
  • Accept Date: 26 October 2020
  • Publish Date: 22 June 2021