Abstract:
Technical question answering sites such as Stack Overflow are gaining enormous attention from learners and practitioners in specialized fields as a place to exchange programming knowledge. Question answering on different topics has engaged programmers
of all levels. Not all developers have the same level of expertise, and their
questions differ in complexity and context. However, the existing approach on Stack Overflow primarily filters questions based
on tags, which is ineffective for predicting the difficulty level. Because of this limitation,
a large share of these posts fail to attract the attention of appropriate
users, leaving valid questions with no answer or a significant delay in response
time. To address these limitations, we propose three supervised
models using TF-IDF, topic modeling (LDA), and Doc2Vec that capture more complex relationships between the user and the
question by extracting context-dependent features. Each model builds an informative relationship that helps classify the
difficulty of a question. Extensive experiments on different variations of the datasets
demonstrate the improved efficacy of our proposed models over contemporary models.
The experiments show that even with limited information, the models' performance
scores are satisfactory, and the Doc2Vec model outperforms the other models under
consideration.
Description:
Supervised by
Mr. Md. Jubair Ibna Mostafa, Lecturer
Mr. Md. Nazmul Haque, Lecturer
Department of Computer Science and Engineering (CSE),
Islamic University of Technology (IUT)
Board Bazar, Gazipur-1704, Bangladesh.
This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2022.