Enhancing Digital Password Security Using Machine Learning: A Comparative Analysis of Classification Algorithms

Article type: Research article

Authors

1 M.Sc., Computer Department, Faculty of Technology and Engineering, Meybod University, Meybod, Iran

2 Assistant Professor, Computer Department, Faculty of Technology and Engineering, Meybod University, Meybod, Iran

3 Assistant Professor, Computer Department, Faculty of Technology and Engineering, Meybod University, Meybod, Iran

4 Associate Professor, Computer Department, Faculty of Technology and Engineering, Meybod University, Meybod, Iran

Abstract

In the digital age, passwords remain one of the most common and critical authentication methods in information systems. However, choosing and using weak, insecure passwords, such as simple, short, or repeated ones, poses a serious threat to cybersecurity. This study presents a machine learning-based framework aimed at improving the accuracy of password security assessment. The proposed method consists of three main stages: preprocessing, extraction of new features, and password classification. A dataset of 669,880 passwords is analyzed. In the preprocessing stage, missing data are removed, text is encoded as numbers, and the classes are balanced using the SMOTE algorithm. After the data are cleaned, ten new structural features are extracted from each password. Finally, the data are split into training and test sets, and password strength is classified into three classes (weak, medium, and strong) using machine learning algorithms including decision tree, random forest, XGBoost, and logistic regression, among others. Experimental results show that random forest and XGBoost outperform the other models, with F1 scores of 99.665% and 99.642%, respectively. The contribution of this research is an efficient and scalable model for identifying password weaknesses and strengthening security policies in digital systems, which can play an effective role in designing resilient passwords, reducing the risk of intrusion, and raising the level of cybersecurity.
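The abstract does not list the ten structural features, so the following Python sketch only illustrates what such a feature extractor could look like. The function name `extract_features` and all ten feature names are hypothetical choices made for this example, not the authors' feature set.

```python
import math
import string
from collections import Counter


def extract_features(password: str) -> dict:
    """Return ten illustrative structural features of a password.

    Hypothetical feature set: the source abstract does not name its features.
    """
    n = len(password)
    counts = Counter(password)

    # Counts per character class.
    lower = sum(c.islower() for c in password)
    upper = sum(c.isupper() for c in password)
    digits = sum(c.isdigit() for c in password)
    symbols = sum(c in string.punctuation for c in password)

    # Shannon entropy of the character distribution, in bits per character.
    entropy = -sum((k / n) * math.log2(k / n) for k in counts.values()) if n else 0.0

    # Longest run of one repeated character, e.g. "aaab" -> 3.
    longest_run, run, prev = 0, 0, None
    for c in password:
        run = run + 1 if c == prev else 1
        longest_run = max(longest_run, run)
        prev = c

    return {
        "length": n,
        "lowercase": lower,
        "uppercase": upper,
        "digits": digits,
        "symbols": symbols,
        "unique_chars": len(counts),
        "char_classes": sum(x > 0 for x in (lower, upper, digits, symbols)),
        "longest_run": longest_run,
        "entropy_bits": entropy,
        "unique_ratio": len(counts) / n if n else 0.0,
    }
```

Calling `extract_features("Passw0rd!")` yields a dictionary of ten numeric values that could serve as one row of the feature matrix passed to a classifier.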


Article Title [English]

Enhancing Digital Password Security Using Machine Learning: A Comparative Analysis of Classification Algorithms

Authors [English]

  • Mahnaz Dorodi 1
  • Seyed Hasan Mortazavi Zarch 2
  • Fatemeh Zare Mehrjardi 3
  • Mohsen Sardari Zarchi 4
1 Master of Science, Department of Computer Science, Faculty of Engineering, Meybod University, Meybod, Iran
2 Assistant Professor of Computer Department, Faculty of Technology and Engineering, Meybod University, Meybod, Iran
3 Assistant Professor of Computer Department, Faculty of Technology and Engineering, Meybod University, Meybod, Iran
4 Associate Professor of Computer Department, Faculty of Technology and Engineering, Meybod University, Meybod, Iran
Abstract [English]

In the digital age, passwords remain one of the primary methods of authentication in information systems. Despite their critical role in protecting personal and organizational data, the use of weak passwords, such as those that are overly simple, short, or repetitive, poses a significant threat to cybersecurity. This study proposes a machine learning-based approach to improve the accuracy of password security assessment. The proposed method consists of three main steps: preprocessing, extraction of new features, and password classification. A dataset of 669,880 passwords is analyzed. In the preprocessing step, missing data are removed, text is encoded as numbers, and class imbalance is corrected using SMOTE. After data cleaning, 10 new features are extracted from each password. Finally, the processed dataset is partitioned into training and test sets, and password strength is predicted using several machine learning classifiers, including decision tree, random forest, XGBoost, and logistic regression, among others. Experimental results show that the random forest and XGBoost algorithms achieved the highest F1-scores among the evaluated models, at 99.665% and 99.642% respectively. The outcome of this research is a scalable and efficient framework for identifying password vulnerabilities and reinforcing security policies in digital systems. This framework can play a key role in designing robust passwords and mitigating the risk of unauthorized access.
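The abstract reports F1 scores for a three-class problem (weak, medium, strong) but does not state how the per-class scores are averaged. As a minimal sketch, assuming macro averaging, the metric could be computed as below. The function name `macro_f1` and the label strings are illustrative, not taken from the paper.

```python
def macro_f1(y_true, y_pred, labels=("weak", "medium", "strong")):
    """Macro-averaged F1: the unweighted mean of per-class F1 scores.

    Assumption: macro averaging; the source does not specify the averaging mode.
    """
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)  # true positives
        fp = sum(t != label and p == label for t, p in pairs)  # false positives
        fn = sum(t == label and p != label for t, p in pairs)  # false negatives
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


y_true = ["weak", "weak", "medium", "medium", "strong", "strong"]
y_pred = ["weak", "medium", "medium", "medium", "strong", "strong"]
print(round(macro_f1(y_true, y_pred), 4))  # 0.8222
```

In the toy example, one weak password is misclassified as medium, giving per-class F1 values of 2/3, 0.8, and 1.0, whose mean is about 0.8222. A weighted average would give a different figure when class sizes differ, which is why the averaging mode matters when comparing reported scores.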

Keywords [English]

  • Machine Learning
  • Cybersecurity
  • Password Strength
  • Random Forest