Integrating YOLOv7 with FixMatch for Enhancing Vehicle Detection Performance in Mixed Traffic Environments

Yandri Zaita
Khairun Saddami
Nasaruddin Nasaruddin

Keywords

Object Detection, YOLOv7, FixMatch, Semi-supervised, Mixed Traffic

Abstract

A major challenge in developing object detection technology is its heavy reliance on large labeled datasets, whose manual annotation demands substantial time and effort, especially in complex mixed traffic environments with varied vehicle types, congestion levels, and unpredictable motion patterns. This study addresses the issue by integrating the semi-supervised learning technique FixMatch into the YOLOv7 object detection model, using a dataset of 4000 transportation-related images. FixMatch enables the model to learn effectively from unlabeled data through a combination of weak and strong augmentations: confident predictions on weakly augmented images serve as pseudo-labels that the model is trained to reproduce on strongly augmented versions. The objects detected in the mixed traffic environment include public transportation, pedicabs, cars, motorcycles, and trucks. By leveraging unlabeled data, the model achieved 97.5% detection accuracy, demonstrating its efficiency and effectiveness in identifying vehicles under diverse traffic conditions. Integrating FixMatch into YOLOv7 therefore offers a practical and efficient solution for object detection in settings where collecting labeled data is difficult, such as dynamic and highly variable traffic environments.
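The pseudo-labeling rule at the heart of FixMatch can be sketched as follows. This is a minimal illustration, not the paper's implementation: predictions on weakly augmented inputs become pseudo-labels only when their confidence clears a threshold, and the unsupervised loss pushes predictions on strongly augmented copies toward those labels. The threshold value and function names here are hypothetical.

```python
import numpy as np

CONF_THRESHOLD = 0.95  # hypothetical threshold; the study's actual value may differ


def fixmatch_pseudo_labels(weak_probs: np.ndarray, threshold: float = CONF_THRESHOLD):
    """Select pseudo-labels from class probabilities on weakly augmented images.

    weak_probs: (N, C) array of predicted probabilities for N unlabeled samples.
    Returns (indices, labels): the samples whose maximum confidence meets the
    threshold, and their argmax class as the pseudo-label.
    """
    confidences = weak_probs.max(axis=1)
    mask = confidences >= threshold
    indices = np.nonzero(mask)[0]
    labels = weak_probs[mask].argmax(axis=1)
    return indices, labels


def fixmatch_unsupervised_loss(strong_probs: np.ndarray, indices, labels):
    """Cross-entropy between strong-augmentation predictions and pseudo-labels.

    Averaged over ALL unlabeled samples, so unconfident samples (which receive
    no pseudo-label) contribute zero, as in the FixMatch objective.
    """
    if len(indices) == 0:
        return 0.0
    picked = strong_probs[indices, labels]  # probability assigned to each pseudo-label
    return float(-np.log(np.clip(picked, 1e-12, None)).sum() / len(strong_probs))
```

For example, a batch of two unlabeled samples where only the first prediction is confident yields one pseudo-label, and the second sample is simply ignored for that training step:

```python
weak = np.array([[0.98, 0.01, 0.01],   # confident -> pseudo-labeled as class 0
                 [0.50, 0.30, 0.20]])  # unconfident -> dropped
idx, lab = fixmatch_pseudo_labels(weak)
```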
