Classification of Stack Overflow Questions Based on Difficulty

Raida, Maliha Noushin; Sristy, Zannatun Naim; Monisha, Sheikh Moonwara Anjum; Ulfat, Nawshin

URI: http://hdl.handle.net/123456789/1780

Date: 2022-05-30

Abstract:

Technical question-answering sites such as Stack Overflow attract enormous attention from learners and practitioners in specialized fields who want to exchange programming knowledge. Question answering on different topics has engaged programmers of all levels. Not all developers have the same level of expertise, and their questions differ in complexity and context. However, existing approaches on Stack Overflow primarily filter questions by tags, which is ineffective for predicting difficulty level. Because of this limitation, a large share of posts fail to attract the attention of appropriate users, leaving valid questions unanswered or answered only after a significant delay. To address these limitations, we propose three supervised models, based on TF-IDF, topic modeling (LDA), and Doc2Vec, that capture richer relationships by extracting context-dependent features linking the user and the question. Each model builds an informative relationship that helps classify the difficulty of a question. Extensive experiments on different variations of the datasets demonstrate the improved efficacy of our proposed models over contemporary models. The experiments show that, even with limited information, the models' performance scores are satisfactory and that the Doc2Vec model outperforms the other models under consideration.

Description:

Supervised by Mr. Md. Jubair Ibna Mostafa, Lecturer, and Mr. Md. Nazmul Haque, Lecturer, Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), Board Bazar, Gazipur-1704, Bangladesh. This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2022.
