Abstract:
High-resource languages, such as English, have access to a plethora of datasets with various
question-answer types resembling real-world reading comprehension. However, there is a
severe lack of diverse and comprehensive question-answering datasets in under-resourced
languages like Bangla. Those available are either translated versions of English datasets with a niche answer format or created through human annotation focusing on a specific domain,
question type, or answer type. To address these limitations, we introduce BanglaRQA, a
reading-comprehension-based Bangla question-answering dataset with various question-answer types. BanglaRQA consists of 3,000 context passages and 14,889 question-answer pairs
created from those passages. The dataset comprises answerable and unanswerable questions
covering four unique categories of questions and three types of answers. In addition, we
implemented four different Transformer models for question answering on the proposed
dataset. The best-performing model achieved an overall 62.42% EM and 78.11% F1 score.
However, detailed analyses showed that performance varies across question-answer
types, leaving substantial room for improvement. Furthermore,
we demonstrated the effectiveness of BanglaRQA as a training resource by showing strong
results on the bn_squad dataset.
We next focus on Bangla question-answer pair generation.
Bangla, being a less explored language in NLP, lacks comprehensive research in the domain of question-answer pair generation. We address this gap
by fine-tuning BanglaT5, a generative model, on the BanglaRQA dataset. The quality of the
generated questions is first evaluated using standard automatic metrics. The best-performing model,
BanglaT5, achieved a BLEU score of 21.56 and a BERTScore of 85.04, indicating that the
generated questions exhibit decent quality. Subsequently, the research progresses toward
the main task of generating question-answer pairs. The quality of the generated pairs is
evaluated through human assessment and baseline comparison, demonstrating that the
generated QA pairs possess comparable quality to human-annotated QA pairs. Therefore, this
work proposes an end-to-end Question-Answer Generation (QAG) pipeline and presents a
reading-comprehension-based dataset that has the potential to contribute to future research.
Description:
Supervised by
Dr. Md. Azam Hossain,
Assistant Professor,
Department of Computer Science and Engineering (CSE),
Islamic University of Technology (IUT),
Board Bazar, Gazipur-1704, Bangladesh