Malicious Domain Detection using DNS Records

Document Type : Original Article

Authors

1 Assistant Professor, Faculty of Computer Engineering, Shahroud University of Technology, Shahroud, Iran

2 Master's student, Faculty of Computer Engineering, Shahroud University of Technology, Shahroud, Iran

3 Assistant Professor, Shahroud University of Technology, Shahroud, Iran

Abstract

Phishing is one of the most important security challenges accompanying the advance of technology in cyberspace. It is a type of cyber-attack that tries to obtain information such as usernames, passwords, and bank account details by forging a website or email address and convincing the user to enter this information. Because these attacks are growing in both number and complexity, current phishing detection systems often cannot adapt to new attacks and have low detection accuracy. Graph-based methods are one family of techniques for identifying malicious domains; they use the connections between domains and IP addresses for detection. In this paper, a graph-based phishing detection system using deep learning is presented. The main steps of the proposed method are extracting the IP addresses of each domain, defining the relationships between domains, determining the weights, and converting the data to vectors with the Node2vec algorithm. Classification and identification are then performed using CNN and dense deep learning models. Experimental results on three different datasets show that the proposed method achieves an accuracy of about 99% in identifying malicious domains, which is an acceptable improvement over the state of the art in this context.
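The first steps of this pipeline can be illustrated with a minimal sketch. The abstract does not say how relationships and weights are defined, so the code below assumes that two domains are connected when they resolve to a common IP address and that the weight is the number of shared addresses. The domain names, IP addresses, function names, and the `p`, `q`, and walk-length values are all hypothetical. The walk follows the second-order biased random walk described in the node2vec paper; its output would normally be fed to a skip-gram model to obtain the vectors, which is not shown here.

```python
import random
from collections import defaultdict
from itertools import combinations


def build_domain_graph(domain_ips):
    """Connect two domains when they resolve to at least one common IP.

    Edge weight = number of shared IPs (an assumed weighting scheme,
    not necessarily the one used in the paper).
    """
    ip_to_domains = defaultdict(set)
    for domain, ips in domain_ips.items():
        for ip in ips:
            ip_to_domains[ip].add(domain)

    graph = {domain: {} for domain in domain_ips}
    for domains in ip_to_domains.values():
        for a, b in combinations(sorted(domains), 2):
            graph[a][b] = graph[a].get(b, 0) + 1
            graph[b][a] = graph[b].get(a, 0) + 1
    return graph


def biased_walk(graph, start, length, p=1.0, q=1.0, rng=random):
    """Generate one node2vec-style second-order random walk."""
    walk = [start]
    while len(walk) < length:
        current = walk[-1]
        neighbors = sorted(graph[current])
        if not neighbors:
            break  # isolated node: nowhere to go
        if len(walk) == 1:
            # First step: only the edge weights matter.
            weights = [graph[current][n] for n in neighbors]
        else:
            previous = walk[-2]
            weights = []
            for n in neighbors:
                if n == previous:
                    bias = 1.0 / p  # return to the previous node
                elif n in graph[previous]:
                    bias = 1.0      # stay close to the previous node
                else:
                    bias = 1.0 / q  # move further away
                weights.append(graph[current][n] * bias)
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk


# Hypothetical resolution results (documentation-only names and addresses).
domain_ips = {
    "shop.example": {"192.0.2.10", "192.0.2.11"},
    "login.example": {"192.0.2.10", "192.0.2.11"},
    "mail.example": {"192.0.2.11"},
    "blog.example": {"198.51.100.7"},
}
graph = build_domain_graph(domain_ips)
walk = biased_walk(graph, "shop.example", length=5, p=1.0, q=2.0,
                   rng=random.Random(0))
```

Here `shop.example` and `login.example` share two addresses and so get an edge of weight 2, while `blog.example` shares none and stays isolated. Other weightings, such as a Jaccard ratio over the IP sets, would fit the same structure.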

Volume 9, Issue 3 - Serial Number 35, Autumn Quarterly
December 2021
Pages 83-97
  • Receive Date: 22 November 2020
  • Revise Date: 13 January 2021
  • Accept Date: 09 January 2021
  • Publish Date: 22 November 2021