An End to End System for Online Handwritten Bangla Character Recognition

dc.contributor.author Nahin, Shahriar Nur
dc.contributor.author Imam, Kazi Fahim
dc.contributor.author Rahman, Nabil
dc.contributor.author Tasnim, Anika
dc.date.accessioned 2023-04-28T04:46:11Z
dc.date.available 2023-04-28T04:46:11Z
dc.date.issued 2022-05-30
dc.identifier.uri http://hdl.handle.net/123456789/1859
dc.description Supervised by Mr. A.B.M. Ashiqur Rahman, Assistant Professor; Co-Supervisor: Shahriar Ivan, Lecturer, Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), Board Bazar, Gazipur-1704, Bangladesh. This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2022. en_US
dc.description.abstract This report summarizes our attempt at building an Optical Character Recognition (OCR) system for handwritten Bangla characters. The complex structure of the Bangla script, combined with the variability of handwriting, makes it difficult to build a system that converts scanned handwritten Bangla documents into a machine-editable digital format: both segmenting the whole image into characters and classifying the segmented characters are challenging tasks. In our work, we propose segmenting the image directly into words using the Distance Transform, followed by morphological operations for error correction. We then take a two-zone approach, splitting each word into an upper and a lower zone on either side of the matra, and apply connected component analysis to both zones. We handled the cases that failed or were not directly successful by experimenting with the characteristics of handwritten characters. For classification, we propose classifying the segmented characters with neural networks trained on relatively recent datasets. Multi-column documents, mixed-script text (Bangla with other languages), and scene text recognition are outside the scope of this study. We could not include a post-processing stage because of the lack of work on it in the existing literature; it could be a valuable addition toward a complete OCR system. en_US
dc.language.iso en en_US
dc.publisher Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), Board Bazar, Gazipur, Bangladesh en_US
dc.subject Bangla Handwritten Character Recognition, Handwritten Document Recognition, Optical Character Recognition en_US
dc.title An End to End System for Online Handwritten Bangla Character Recognition en_US
dc.type Thesis en_US
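
The two-zone step described in the abstract (splitting a word at the matra and running connected component analysis on each zone) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the toy `WORD` image, and the rule of taking the row with the most ink pixels as the matra are all assumptions made for illustration, and the image is assumed to be already binarized (1 = ink, 0 = background).

```python
from collections import deque

def find_matra_row(img):
    """Return the index of the row with the most ink pixels.

    Assumption: the matra (headline) is the densest horizontal row.
    """
    sums = [sum(row) for row in img]
    return max(range(len(img)), key=lambda r: sums[r])

def split_zones(img):
    """Split a binary word image into rows above and below the matra row."""
    m = find_matra_row(img)
    return img[:m], img[m + 1:]

def count_components(img):
    """Count 4-connected components of ink pixels using breadth-first search."""
    if not img:
        return 0
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    count = 0
    for r in range(h):
        for c in range(w):
            if img[r][c] and not seen[r][c]:
                count += 1
                seen[r][c] = True
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and img[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return count

# Hypothetical toy word: one mark above the matra, two strokes below it.
WORD = [
    [0, 0, 1, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 0],
]

upper, lower = split_zones(WORD)
print(count_components(upper), count_components(lower))
```

Removing the matra row before labeling is what separates the strokes: with the headline left in place, every stroke touching it would merge into a single component.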

