Answer-agnostic Bangla Question-answer Pair Generation Using Transformer-based Approaches

dc.contributor.author Altaf, Md Sajid
dc.contributor.author Ekram, Syed Mohammed Sartaj
dc.contributor.author Rahman, Adham Arik
dc.date.accessioned 2024-09-02T08:05:58Z
dc.date.available 2024-09-02T08:05:58Z
dc.date.issued 2023-05-30
dc.identifier.uri http://hdl.handle.net/123456789/2153
dc.description Supervised by Dr. Md. Azam Hossain, Assistant Professor, Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), Board Bazar, Gazipur-1704, Bangladesh en_US
dc.description.abstract High-resource languages, such as English, have access to a plethora of datasets with various question-answer types resembling real-world reading comprehension. However, there is a severe lack of diverse and comprehensive question-answering datasets in under-resourced languages like Bangla. The ones available are either translated versions of English datasets with a niche answer format or created by human annotation focusing on a specific domain, question type, or answer type. To address these limitations, we introduce BanglaRQA, a reading comprehension-based Bangla question-answering dataset with various question-answer types. BanglaRQA consists of 3,000 context passages and 14,889 question-answer pairs created from those passages. The dataset comprises answerable and unanswerable questions covering four unique categories of questions and three types of answers. In addition, we implemented four different Transformer models for question-answering on the proposed dataset. The best-performing model achieved an overall 62.42% EM and 78.11% F1 score. However, detailed analyses showed that performance varies across question-answer types, leaving substantial room for improvement. Furthermore, we demonstrated the effectiveness of BanglaRQA as a training resource by showing strong results on the bn_squad dataset. The next part of our work focuses on Bangla question-answer pair generation. Bangla, being a less explored language in NLP, lacks comprehensive research in the domain of question-answer pair generation. We address this untapped area by fine-tuning BanglaT5, a generative model, on the BanglaRQA dataset. The quality of the generated questions is first evaluated using various metrics. The best-performing model, BanglaT5, achieved a BLEU score of 21.56 and a BERTScore of 85.04, indicating that the generated questions exhibit decent quality.
Subsequently, the research progresses toward the main task of generating question-answer pairs. The quality of the generated pairs is evaluated through human assessment and baseline comparison, demonstrating that the generated QA pairs are of comparable quality to human-annotated QA pairs. This work therefore proposes an end-to-end question-answer generation (QAG) pipeline and presents a reading-comprehension-based dataset that has the potential to contribute to future research. en_US
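The abstract reports Exact Match (EM) and F1 scores for the question-answering models. A minimal sketch of how these SQuAD-style metrics are conventionally computed is shown below; this is an illustrative simplification (whitespace tokenization, no Bangla-specific answer normalization), not the thesis's actual evaluation script.

```python
from collections import Counter

def exact_match(prediction: str, reference: str) -> float:
    # EM: 1.0 iff the (lightly normalized) strings are identical.
    return float(prediction.strip() == reference.strip())

def token_f1(prediction: str, reference: str) -> float:
    # SQuAD-style token-level F1 over whitespace tokens.
    pred_tokens = prediction.strip().split()
    ref_tokens = reference.strip().split()
    common = Counter(pred_tokens) & Counter(ref_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

Corpus-level scores such as the reported 62.42% EM and 78.11% F1 would then be averages of these per-example values over the test set.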
dc.language.iso en en_US
dc.publisher Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), Board Bazar, Gazipur-1704, Bangladesh en_US
dc.title Answer-agnostic Bangla Question-answer Pair Generation Using Transformer-based Approaches en_US
dc.type Thesis en_US

