Recognizing Traffic Signs using Fine-tuning Based Few-Shot Object Detection

dc.contributor.author Rahman, Md. Atiqur
dc.contributor.author Asad, Nahian Ibn
dc.contributor.author Omi, Md. Mushfiqul Haque
dc.date.accessioned 2024-09-02T06:06:02Z
dc.date.available 2024-09-02T06:06:02Z
dc.date.issued 2023-05-30
dc.identifier.uri http://hdl.handle.net/123456789/2149
dc.description Supervised by Dr. Md. Hasanul Kabir, Professor, and co-supervised by Mr. Sabbir Ahmed, Assistant Professor, Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), Board Bazar, Gazipur-1704, Bangladesh en_US
dc.description.abstract The most critical task for an Advanced Driver Assistance System (ADAS), which is generally used in autonomous vehicles, is to provide a reliable and fast Traffic Sign Recognition (TSR) system. TSR identifies a traffic sign in an image and then determines its category. Most widely used TSR techniques rely on deep convolutional neural networks (DCNNs) and emphasize learning features that discriminate between the visual differences of traffic signs. However, these techniques perform poorly when only a limited number of training samples is available for each category. To overcome this problem, few-shot learning can be used; it focuses on learning common but distinctive qualities of class-specific objects from few training samples, as opposed to depending heavily on supervision to learn discriminating features. In this work, we use a fine-tuning approach to few-shot learning to recognize traffic signs with only a limited number of samples per category. We introduce Domain Adaptation, a Warm Model, a Pseudo-Support Set, and Instance-Level Feature Normalization into our base architecture. Our model outperformed all state-of-the-art (SOTA) architectures for few-shot learning across different shot settings, including 2, 3, 5, and 10 shots. In particular, our model achieved remarkable results in the 3-shot and 5-shot scenarios, with additional mAP improvements of 3.53 and 3.73, respectively. en_US
dc.language.iso en en_US
dc.publisher Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), Board Bazar, Gazipur-1704, Bangladesh en_US
dc.title Recognizing Traffic Signs using Fine-tuning Based Few-Shot Object Detection en_US
dc.type Thesis en_US

