Detection of Unknown Malicious Network Streams using Ensemble Learning

Document Type: Original Article

Authors

-

Abstract

Security is a significant issue whose dimensions shift with circumstances. Among the different areas of
security, cyber security now holds one of the most important places. In this study, two virtual honeynets
were designed in two different laboratories to help study unknown attacks; other scientific datasets were
also used for this purpose. Imbalanced data are a persistent problem in network datasets and reduce the
accuracy with which minority classes are predicted. To cope with this problem, ensemble learning methods
were applied to detect network attacks, and most specifically unknown attacks, while taking advantage of
different techniques and action model learning. Ensemble learning was found to be suitable for describing
security problems because activities on computer systems can be viewed at multiple levels of abstraction
and information can be collected from multiple data sources. Statistical analysis was used as the research
method to measure the reliability and validity of the findings. Statistical techniques and tests were
applied to show that the proposed algorithm, which combines classifiers by weighted voting with weights
based on a genetic algorithm, performs better than twelve other classifiers.
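
The abstract does not give the details of the proposed algorithm, so the following is only a minimal sketch of the general idea of weighted voting with weights tuned by a genetic algorithm. The function names (`weighted_vote`, `ensemble_accuracy`, `evolve_weights`), the GA operators (truncation selection, uniform crossover, Gaussian mutation), and all parameter values are assumptions made for illustration and are not taken from the paper.

```python
import random


def weighted_vote(predictions, weights):
    """Combine one predicted label per classifier into a single label."""
    scores = {}
    for label, weight in zip(predictions, weights):
        scores[label] = scores.get(label, 0.0) + weight
    # The label with the largest accumulated weight wins.
    return max(scores, key=scores.get)


def ensemble_accuracy(weights, votes, labels):
    """Fraction of samples that the weighted ensemble labels correctly."""
    hits = sum(weighted_vote(v, weights) == y for v, y in zip(votes, labels))
    return hits / len(labels)


def evolve_weights(votes, labels, n_classifiers,
                   pop_size=20, generations=30, seed=0):
    """Toy genetic algorithm that searches for good voting weights.

    votes[i] holds the labels predicted by each classifier for sample i;
    labels[i] is the true label of sample i.
    """
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(n_classifiers)]
                  for _ in range(pop_size)]

    def fitness(weights):
        return ensemble_accuracy(weights, votes, labels)

    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]  # keep the fitter half unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover: each weight comes from one of the two parents.
            child = [rng.choice(pair) for pair in zip(a, b)]
            # Gaussian mutation, clipped so weights stay non-negative.
            child = [max(0.0, w + rng.gauss(0, 0.1)) for w in child]
            children.append(child)
        population = parents + children
    return max(population, key=fitness)
```

In this sketch the fitness is plain accuracy on a validation set. On imbalanced traffic, where attacks are the minority class, a measure such as per-class recall or F1 may be a better fitness, since accuracy can stay high while the minority class is missed.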

  • Receive Date: 15 July 2017
  • Revise Date: 20 February 2019
  • Accept Date: 19 September 2018
  • Publish Date: 23 July 2018