Bangla Dataset Generation for Natural Language Inference

dc.contributor.author Islam, Md. Shohidul
dc.contributor.author Khan, Abdun Nayeem
dc.contributor.author Nizami, Md Shaidur Rahman
dc.date.accessioned 2024-09-02T05:46:02Z
dc.date.available 2024-09-02T05:46:02Z
dc.date.issued 2023-05-30
dc.identifier.uri http://hdl.handle.net/123456789/2147
dc.description Supervised by Dr. Hasan Mahmud, Associate Professor, and Prof. Dr. Kamrul Hasan, Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), Board Bazar, Gazipur-1704, Bangladesh en_US
dc.description.abstract Understanding entailment and contradiction is fundamental to understanding natural language, and inference about entailment and contradiction is a valuable testing ground for the development of semantic representations. However, machine learning research in this area has been severely limited by the lack of resources in Bangla. To address this, we introduce our own corpus curated for natural language inference: pairs of sentences, each carrying a label that describes the entailment relation between them. Our goal is a dataset of over 30K instances, and to that end we have created a Bangla dataset by machine translating the SNLI corpus into Bangla. We then show that benchmark models can be evaluated on, and can perform, the task of inference in Bangla. We hope that our dataset will catalyze research in Bangla sentence understanding by providing an informative standard evaluation task. For this we provide two baseline models, both of which are considered integral to the task of inference in any language. en_US
dc.language.iso en en_US
dc.publisher Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), Board Bazar, Gazipur-1704, Bangladesh en_US
dc.subject entailment, contradiction, neutral, natural language, inference, semantic representations, machine learning, Bangla, corpus, labeled pairs of sentences, inner entailment, dataset, instances, SNLI corpus, machine translation, benchmark models, evaluation task, baseline models, sentence understanding en_US
dc.title Bangla Dataset Generation for Natural Language Inference en_US
dc.type Thesis en_US

