Cyber Threat Information Extraction using Deep Learning and Knowledge Representation

Document Type: Original Article

Authors

1 Assistant Professor, Department of Computer Engineering, Faculty of Engineering, Kosar University of Bojnord, Bojnord, Iran

2 Assistant Professor, Department of Electrical Engineering, Faculty of Electrical and Computer Engineering, Esfarayen University of Technology, Esfarayen, Iran

Abstract

Cybersecurity information on the internet is growing rapidly, and cyber attacks are increasing daily. Attackers mostly target military, government, and corporate organizations, because these hold sensitive and classified information that requires appropriate defense strategies. Cyber threat information extraction, i.e., extracting entities, the relationships between them, and events from cyber texts, is an important step in detecting cyber attacks and harmful events and in mitigating them in real time when they occur. Extracting valuable information from cyber threat texts helps security professionals make informed decisions and develop strong defense strategies. It is also fundamental to improving the performance of systems such as text summarization, machine translation, and question answering. Although information extraction has been an active research topic for the past four decades, its accuracy remains unsatisfactory and no accurate computational model for it exists. In this paper, the entities in the text are first extracted with high accuracy using the latest word embedding method, a bidirectional GRU (Bi-GRU) recurrent network, the attention mechanism, and knowledge representation. Then, expressions related to the entities are recognized by calculating the importance and weight of each feature and considering all the necessary criteria in decision-making. Entity relationships are extracted by a graph-based neural network with a heuristic loss function. The attention-based KVP deep network, which can identify the correlation between two elements at different positions in the input sequence, is used for accurate detection and prediction of security events. Extensive simulations were carried out to evaluate the performance of the proposed method. According to the simulation results, the proposed method achieves F1 scores of 89.8% and 93.4% on the CoNLL-2012 and OSINT datasets, respectively.
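To illustrate how an attention mechanism can relate two elements at different positions in an input sequence, the following is a minimal sketch of generic scaled dot-product attention with separate key and value vectors. It is not the KVP network described above: the function names `softmax` and `attend`, the toy vectors, and the dimensions are all assumptions chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for a single query.

    Returns the attention weights over all positions and the
    weighted sum of the value vectors (the context vector).
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    context = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, context


# Hypothetical toy sequence of four positions (illustrative numbers only).
# Positions 0 and 2 have similar keys although they are not adjacent.
keys = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.0, 0.8]]
values = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.5, 0.5, 0.0]]

# Use position 0 as the query and see which positions it attends to.
weights, context = attend(keys[0], keys, values)
print([round(w, 3) for w in weights])
```

In this toy input, the query at position 0 gives more weight to the distant position 2 than to the adjacent position 1, because the weights depend on key similarity, not on distance in the sequence.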



