Object Detection and Recognition in Images Using an Optimized YOLO Algorithm

Article type: Research article

Authors

1 Assistant Professor, Shahid Sattari University of Aeronautical Sciences and Technology, Tehran, Iran

2 Assistant Professor, Shahid Sattari University of Aeronautical Sciences and Technology, Tehran, Iran

Abstract

Video surveillance systems are now widely used to monitor and control environments. To prevent unwanted incidents and to protect military and security sites, people, and their property, investment in video surveillance systems has grown substantially, with the aim of making maximum use of the technological advances available in this field. The use of such systems in organizations, offices, factories, and workplaces has led to close monitoring and control of the environment, fewer violations, faster detection of incidents, and better order in the workplace. The goal of object detection and recognition in video surveillance systems is to classify and label objects and to determine their exact position in an image or video.

Deep neural networks are now studied and applied to a wide range of problems. In this paper, we use an optimized YOLO algorithm to detect and recognize objects in images.

The baseline YOLO architecture uses the ReLU activation function, which, first, is not differentiable at zero and, second, sets all negative values to zero. We therefore change the activation function in an attempt to improve the accuracy of the baseline neural network. By modifying the baseline architecture of the YOLO algorithm, we increased the accuracy of the baseline neural networks for object detection and recognition in images by 0.8% in terms of mAP.


Article title [English]

Identification and recognition of objects in the image using the optimized YOLO algorithm

Authors [English]

  • Amirhosein Zanganeh 1
  • Ehsan Sharifi 2
  • Pezhman Gholamnezhad 1
1 Assistant Professor, Shahid Sattari University of Aeronautical Sciences and Technology, Tehran, Iran
2 Assistant Professor, Shahid Sattari University of Aeronautical Sciences and Technology, Tehran, Iran
Abstract [English]

Video surveillance systems are now widely used to monitor and control environments. To prevent unwanted incidents and to protect military and security sites, people, and their property, investment in video surveillance systems has grown substantially, with the aim of making the most of the technological advances available in this field. The use of video surveillance systems in organizations, offices, factories, and workplaces has led to close monitoring and control of the environment, fewer violations, faster detection of incidents, and better order in the workplace. The purpose of object detection and recognition in video surveillance systems is to classify and label objects and to determine their exact position in an image or video.

Deep neural networks are now applied to a wide range of problems. In this article, we use an optimized YOLO algorithm to detect and recognize objects in images. The baseline YOLO architecture uses the ReLU activation function, which, first, is not differentiable at zero and, second, sets all negative values to zero. We therefore change the activation function in an attempt to increase the accuracy of the baseline neural network. By modifying the baseline architecture of the YOLO algorithm, we increased the accuracy of the baseline neural networks for object detection and recognition in images by 0.8% in terms of mAP.
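
The two properties of ReLU mentioned above can be illustrated with a minimal sketch. The abstract does not name the replacement activation function, so SiLU (x times the logistic sigmoid of x) is used here purely as an assumed example of a smooth alternative that keeps small negative outputs; the helper `one_sided_slopes` and the sample inputs are likewise hypothetical and not taken from the paper.

```python
import math


def relu(x: float) -> float:
    """ReLU: keeps positive inputs and maps every negative input to zero."""
    return x if x > 0.0 else 0.0


def silu(x: float) -> float:
    """SiLU, x * sigmoid(x): an assumed stand-in, not the paper's stated choice."""
    return x / (1.0 + math.exp(-x))


def one_sided_slopes(f, h: float = 1e-6):
    """Finite-difference slopes of f just to the left and right of zero."""
    left = (f(0.0) - f(-h)) / h
    right = (f(h) - f(0.0)) / h
    return left, right


inputs = [-2.0, -0.5, 0.0, 0.5, 2.0]

# ReLU discards all negative inputs; SiLU keeps a small negative response.
relu_out = [relu(x) for x in inputs]
silu_out = [silu(x) for x in inputs]
print("ReLU outputs:", relu_out)
print("SiLU outputs:", [round(y, 4) for y in silu_out])

# ReLU has different left and right slopes at zero (a kink),
# while the SiLU slopes agree closely on both sides.
print("ReLU slopes at 0 (left, right):", one_sided_slopes(relu))
print("SiLU slopes at 0 (left, right):", one_sided_slopes(silu))
```

In this sketch the mismatch between the left and right slopes of ReLU at zero is the non-differentiability the abstract refers to, and the zeros produced for negative inputs are the lost values that a modified activation function may retain.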

Keywords [English]

  • video surveillance system
  • object recognition
  • YOLO algorithm
  • deep neural network
  • activation function

