3D Object Detection with Stereo Vision and Transformer

dc.contributor.author Samin, Abid Ahsan
dc.contributor.author Hassan, Abdullah
dc.contributor.author Khan, Md. Rakib Hossain
dc.date.accessioned 2023-04-28T03:38:40Z
dc.date.available 2023-04-28T03:38:40Z
dc.date.issued 2022-05-30
dc.identifier.uri http://hdl.handle.net/123456789/1857
dc.description Supervised by Dr. Md. Kamrul Hasan, Professor, Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), Board Bazar, Gazipur-1704, Bangladesh. This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2022. en_US
dc.description.abstract Precisely identifying 3D objects with computer vision has been a challenging task in autonomous driving, partly because it requires accurate depth estimation. Until now, LiDAR has been used for this task; it is precise but expensive. Pseudo-LiDAR promises a cheaper alternative with fairly good precision. However, pseudo-LiDAR can be replaced with a 2D image representation of depth at similar precision. The transformer is another technology, widely used to process sequential data, and recent studies show that it can also be used for object detection. In this thesis, we look into the concepts of pseudo-LiDAR, image representation of depth, and the detection transformer (DETR). We then introduce a new approach that uses image-based depth output with DETR to achieve accurate object detection. Finally, we compare our results with other available object detection methods in order to establish a benchmark. en_US
dc.language.iso en en_US
dc.publisher Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), Board Bazar, Gazipur, Bangladesh en_US
dc.subject 3D Object Detection, KITTI, Autonomous Driving en_US
dc.title 3D Object Detection with Stereo Vision and Transformer en_US
dc.type Thesis en_US

