Abstract:
Online health consultations are becoming increasingly popular as a way for patients to
discuss their medical concerns. In Bangladesh, patients are likewise turning to online healthcare
services and submitting their medical queries in Bangla. The COVID-19
pandemic has accelerated the use of these platforms, leading to a significant influx of
questions and placing a heavy burden on the limited number of healthcare professionals available to respond. Text summarization offers a promising solution by condensing
Bangla medical queries to highlight only the essential information needed for answers.
This not only reduces the time healthcare professionals spend parsing unnecessary details
but also serves as a crucial step toward developing automated medical question-answering
systems. This research presents a comprehensive zero-shot evaluation of several state-of-the-art Bangla and multilingual text generation models on the task of summarizing Bangla Consumer Health Questions (CHQs). The models we evaluated include BanglaT5, mT5, GPT-3.5, and GPT-4. The evaluation was conducted using ‘BanglaCHQ-Summ,’ which is
currently the only available dataset specifically designed for summarizing Bangla CHQs,
comprising 2,350 pairs of questions and their corresponding summaries. The study aimed
to determine which model summarizes Bangla medical queries most accurately and concisely. Among the models tested, GPT-4 performed best,
achieving a BERTScore of 90.25%.
Description:
Supervised by
Dr. Md Moniruzzaman,
Assistant Professor,
Department of Computer Science and Engineering (CSE)
Islamic University of Technology (IUT)
Board Bazar, Gazipur, Bangladesh
This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Software Engineering, 2024.