A Transformer Based Approach for Classifying Software Requirements

Hoque, Maheen Mashrur; Habib, Ahsan; Khan, Arman Hossain Dipu

dc.contributor.author	Hoque, Maheen Mashrur
dc.contributor.author	Habib, Ahsan
dc.contributor.author	Khan, Arman Hossain Dipu
dc.date.accessioned	2025-05-29T04:29:55Z
dc.date.available	2025-05-29T04:29:55Z
dc.date.issued	2024-07-30
dc.identifier.citation	[1] Daniel Siahaan and Brian Rizqi Paradisiaca Darnoto. A novel framework to detect irrelevant software requirements based on multiphilda as the topic model. In Informatics, volume 9, page 87. MDPI, 2022. [2] Vladimir Ivanov, Andrey Sadovykh, Alexandr Naumchev, Alessandra Bagnato, and Kirill Yakovlev. Extracting software requirements from unstructured documents. In International Conference on Analysis of Images, Social Networks and Texts, pages 17–29. Springer, 2021. [3] Victor Sanh, Lysandre Debut, Julien Chaumond, and Thomas Wolf. Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108, 2019. [4] Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692, 2019. [5] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Lukasz Kaiser, and Illia Polosukhin. Attention is all you need. Advances in neural information processing systems, 30, 2017. [6] Hadeer Adel, Abdelghani Dahou, Alhassan Mabrouk, Mohamed Abd Elaziz, Mohammed Kayed, Ibrahim Mahmoud El-Henawy, Samah Alshathri, and Abdelmgeid Amin Ali. Improving crisis events detection using distilbert with hunger games search algorithm. Mathematics, 10(3):447, 2022. [7] Klaus Pohl. Requirements engineering fundamentals: a study guide for the certified professional for requirements engineering exam-foundation level-IREB compliant. Rocky Nook, Inc., 2016. [8] Ian Sommerville. Software Engineering, 9/E. Pearson Education India, 2011. [9] Karl E Wiegers and Joy Beatty. Software requirements. Pearson Education, 2013. [10] Bashar Nuseibeh and Steve Easterbrook. Requirements engineering: a roadmap. In Proceedings of the Conference on the Future of Software Engineering, pages 35–46, 2000. [11] Idza Aisara Norabid and Fariza Fauzi. Rule-based text extraction for multimodal knowledge graph. International Journal of Advanced Computer Science and Applications, 13(5), 2022. [12] Sadeen Alharbi. Ambiguity detection in requirements classification task using fine-tuned transformation technique. In CS & IT Conference Proceedings, volume 12. CS & IT Conference Proceedings, 2022. [13] Dewi Mairiza, Didar Zowghi, and Nurie Nurmuliani. Managing conflicts among non-functional requirements. In Australian Workshop on Requirements Engineering. University of Technology, Sydney, 2009. [14] Garima Malik, Mucahit Cevik, Devang Parikh, and Ayse Basar. Identifying the requirement conflicts in srs documents using transformer-based sentence embeddings. arXiv preprint arXiv:2206.13690, 2022. [15] Minseong Kim, Sooyong Park, Vijayan Sugumaran, and Hwasil Yang. Managing requirements conflicts in software product lines: A goal and scenario based approach. Data & Knowledge Engineering, 61(3):417–432, 2007. [16] Alessio Ferrari, Giorgio Oronzo Spagnolo, and Stefania Gnesi. Pure: A dataset of public requirements documents. In 2017 IEEE 25th International Requirements Engineering Conference (RE), pages 502–505. IEEE, 2017. [17] Michael Schmidt. Implementing the IEEE software engineering standards. Sams, 2000. [18] Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805, 2018. [19] Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. Language models are few-shot learners. Advances in neural information processing systems, 33:1877–1901, 2020. [20] Justyna Sarzynska-Wawer, Aleksander Wawer, Aleksandra Pawlak, Julia Szymanowska, Izabela Stefaniak, Michal Jarkiewicz, and Lukasz Okruszek. Detecting formal thought disorder by deep contextualized word representations. Psychiatry Research, 304:114135, 2021. [21] Leo Breiman. Random forests. Machine learning, 45:5–32, 2001. [22] Corinna Cortes and Vladimir Vapnik. Support-vector networks. Machine learning, 20:273–297, 1995. [23] Harris Drucker, Donghui Wu, and Vladimir N Vapnik. Support vector machines for spam categorization. IEEE Transactions on Neural networks, 10(5):1048–1054, 1999. [24] Kentaro Torisawa et al. A new perceptron algorithm for sequence labeling with non-local features. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), pages 315–324, 2007. [25] Marina Sokolova and Guy Lapalme. A systematic analysis of performance measures for classification tasks. Information processing & management, 45(4):427–437, 2009.	en_US
dc.identifier.uri	http://hdl.handle.net/123456789/2408
dc.description	Supervised by Dr. Hasan Mahmud, Associate Professor, Department of Computer Science and Engineering, Islamic University of Technology (IUT), Board Bazar, Gazipur-1704, Bangladesh. This thesis is submitted in partial fulfilment of the requirements for the degree of B.Sc. in Software Engineering (SWE).	en_US
dc.description.abstract	Optimizing and automating the software development lifecycle (SDLC) has long been a challenge in software development. One of the main obstacles to automating the SDLC lies in collecting, analyzing, and establishing requirements. Requirements collected from clients are unstructured and often noisy, so it takes a substantial amount of effort to sort and clarify them into a well-organized Software Requirements Specification (SRS) document. To address this challenge, we propose a transformer-based ensemble approach to classifying requirements, so that the work of establishing requirements from client descriptions can be significantly accelerated. Our approach adopts the DistilBERT and RoBERTa models, which are known for being lightweight, and combines them with ensemble models for classification. One of our considerations was the cost of computing resources, which is why we chose lightweight transformer models that are not as resource-intensive as some other state-of-the-art language models. Our experiments compare our approach with the out-of-the-box classifiers available with DistilBERT and RoBERTa, as well as with a state-of-the-art GPT model. The current experiments show that our proposed approach outperforms the out-of-the-box classification capabilities offered by DistilBERT and RoBERTa in terms of accuracy, a widely adopted metric for classification tasks. Our work aims to contribute to the development of lightweight classifiers that aid in classifying software requirements. Furthermore, our study provides insight into how state-of-the-art GPT models perform on this task. We hope our work will contribute to the future automation of the SDLC.	en_US
dc.language.iso	en	en_US
dc.publisher	Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), Board Bazar, Gazipur-1704, Bangladesh	en_US
dc.subject	natural language processing; software requirements; sequence classification; ensemble classification models; transformers;	en_US
dc.title	A Transformer Based Approach for Classifying Software Requirements	en_US
dc.type	Thesis	en_US