Abstract:
Bangla speech recognition is a relatively young area of research and has not
seen much success so far. A pattern recognition approach is generally used for
speech recognition. CMUSphinx is a framework that uses Hidden Markov Models
(HMMs) for pattern training and an n-gram technique to build a language model
from the speech corpus, which makes it handy for building speech recognition systems.
The success of speech recognition mostly depends on the speech corpus and a
well-trained acoustic model. Such a speech recognizer, implemented on mobile
devices, can have tremendous implications for our day-to-day life. However, to build
an efficient acoustic model we need an extensive amount of training data. In this
thesis work, we have shown how CMUSphinx can be used to build an acoustic
model for Bangla. We built several acoustic models and tried to improve the
accuracy rate. One of our trained models achieved a good accuracy rate. In the
latter part of the thesis, we implemented the speech recognizer on the Android
platform. In this process, we investigated some problems that have to be solved
to get a comparable accuracy rate on Android. We have also proposed a model for
the future continuation of the research.
Description:
Supervised by
Md. Mohiuddin Khan,
Assistant Professor,
Co-Supervisor,
Moin Mahmud Tanvee,
Lecturer,
Computer Science and Engineering (CSE),
Islamic University of Technology (IUT),
Board Bazar, Gazipur-1704, Bangladesh.