Abstract:
Question answering (QA) is a field within natural language processing (NLP) that
focuses on developing systems capable of automatically answering questions posed
in human language. QA systems aim to understand the meaning and intent behind
questions and provide accurate and relevant answers by leveraging large corpora of
text data. Short Answer Questioning (SQA) is a specific type of QA task that
focuses on generating concise and
precise answers to fact-based questions. Unlike traditional QA systems that generate
longer, descriptive answers, SQA systems aim to extract short snippets of information
directly related to the question. These systems employ techniques such as text
comprehension, named entity recognition, and information retrieval to identify the most
relevant information and produce brief and accurate responses. SQA finds applications
in areas such as search engines, voice assistants, and chatbots, where quick and concise
answers are desired.
In this thesis, we propose BTSQA, an architecture for spoken question
answering. The architecture combines one general QA model with one ASR
model, followed by a word correction step to improve performance. Initially, the
general QA model, a T5 transformer model, was used and achieved an F1 score of 73.37%. We
ran the audio dataset through the Whisper ASR model, with a WER of 31.58%, and the Wav2Vec2
model, with a WER of 29.64%. When we combined the general QA model with the ASR
model using word correction, the F1 score was 53.65%. These models were
run on the text dataset we built, which was converted to audio using GoogleTTS.
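
The two metrics reported above can be illustrated with a minimal sketch. It assumes WER is the word-level edit distance divided by the number of reference words, and that F1 is the token-overlap F1 commonly used for extractive QA; the thesis may use a different normalization or library, and the function names `word_error_rate` and `token_f1` are hypothetical.

```python
from collections import Counter


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count.

    Assumption: plain whitespace tokenization, no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def token_f1(prediction: str, truth: str) -> float:
    """Token-overlap F1 between a predicted and a gold answer.

    Assumption: lowercased whitespace tokens, no punctuation stripping.
    """
    pred = prediction.lower().split()
    gold = truth.lower().split()
    if not pred or not gold:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred == gold)

    overlap = sum((Counter(pred) & Counter(gold)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
    print(token_f1("in paris", "paris"))
```

Both functions compare a system output against a reference string, so the same pair of helpers could score the ASR transcript (WER) and the final answer (F1) of a combined pipeline.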
Description:
Supervised by
Mr. Md. Mezbaur Rahman,
Lecturer,
Department of Computer Science and Engineering (CSE),
Islamic University of Technology (IUT),
Board Bazar, Gazipur-1704, Bangladesh