Sentiment Analysis for Software Engineering: A Study on the Effectiveness of Data Augmentation and Ensembling using Transformer-based Models


dc.contributor.author Abid, Muhtasim
dc.contributor.author Tusar, Zubair Rahman
dc.contributor.author Sharfuddin, Sadat Bin
dc.date.accessioned 2023-03-16T09:36:44Z
dc.date.available 2023-03-16T09:36:44Z
dc.date.issued 2022-05-30
dc.identifier.citation [1] W. Medhat, A. Hassan, and H. Korashy, “Sentiment analysis algorithms and applications: A survey,” Ain Shams Engineering Journal, vol. 5, no. 4, pp. 1093–1113, 2014.
[2] M. R. Wrobel, “Towards the participant observation of emotions in software development teams,” in 2016 Federated Conference on Computer Science and Information Systems (FedCSIS). IEEE, 2016, pp. 1545–1548.
[3] M. R. Islam and M. F. Zibran, “Towards understanding and exploiting developers’ emotional variations in software engineering,” in 2016 IEEE 14th International Conference on Software Engineering Research, Management and Applications (SERA). IEEE, 2016, pp. 185–192.
[4] S. F. Huq, A. Z. Sadiq, and K. Sakib, “Is developer sentiment related to software bugs: An exploratory study on GitHub commits,” in 2020 IEEE 27th International Conference on Software Analysis, Evolution and Reengineering (SANER). IEEE, 2020, pp. 527–531.
[5] M. Ortu, B. Adams, G. Destefanis, P. Tourani, M. Marchesi, and R. Tonelli, “Are bullies more productive? Empirical study of affectiveness vs. issue fixing time,” in 2015 IEEE/ACM 12th Working Conference on Mining Software Repositories. IEEE, 2015, pp. 303–313.
[6] G. Fucci, N. Cassee, F. Zampetti, N. Novielli, A. Serebrenik, and M. Di Penta, “Waiting around or job half-done? Sentiment in self-admitted technical debt,” in 2021 IEEE/ACM 18th International Conference on Mining Software Repositories (MSR). IEEE, 2021, pp. 403–414.
[7] M. R. Islam and M. F. Zibran, “SentiStrength-SE: Exploiting domain specificity for improved sentiment analysis in software engineering text,” Journal of Systems and Software, vol. 145, pp. 125–146, 2018.
[8] M. R. Islam and M. F. Zibran, “DEVA: Sensing emotions in the valence arousal space in software engineering text,” in Proceedings of the 33rd Annual ACM Symposium on Applied Computing, 2018, pp. 1536–1543.
[9] T. Ahmed, A. Bosu, A. Iqbal, and S. Rahimi, “SentiCR: A customized sentiment analysis tool for code review interactions,” in 2017 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE). IEEE, 2017, pp. 106–111.
[10] F. Calefato, F. Lanubile, F. Maiorano, and N. Novielli, “Sentiment polarity detection for software development,” Empirical Software Engineering, vol. 23, no. 3, pp. 1352–1382, 2018.
[11] T. Zhang, B. Xu, F. Thung, S. A. Haryono, D. Lo, and L. Jiang, “Sentiment analysis for software engineering: How far can pre-trained transformer models go?” in 2020 IEEE International Conference on Software Maintenance and Evolution (ICSME). IEEE, 2020, pp. 70–80.
[12] S. Mishra and A. Sharma, “Crawling Wikipedia pages to train word embeddings model for software engineering domain,” in 14th Innovations in Software Engineering Conference (formerly known as India Software Engineering Conference), 2021, pp. 1–5.
[13] J. Wei and K. Zou, “EDA: Easy data augmentation techniques for boosting performance on text classification tasks,” arXiv preprint arXiv:1901.11196, 2019.
[14] V. Carofiglio, F. de Rosis, and N. Novielli, “Cognitive emotion modeling in natural language communication,” in Affective Information Processing. Springer, 2009, pp. 23–44.
[15] J. A. Russell, “A circumplex model of affect,” Journal of Personality and Social Psychology, vol. 39, no. 6, p. 1161, 1980.
[16] M. R. Wrobel, “Emotions in the software development process,” in 2013 6th International Conference on Human System Interactions (HSI). IEEE, 2013, pp. 518–523.
[17] M. R. Islam and M. F. Zibran, “Leveraging automated sentiment analysis in software engineering,” in 2017 IEEE/ACM 14th International Conference on Mining Software Repositories (MSR). IEEE, 2017, pp. 203–214.
[18] B. Lin, F. Zampetti, G. Bavota, M. Di Penta, M. Lanza, and R. Oliveto, “Sentiment analysis for software engineering: How far can we go?” in Proceedings of the 40th International Conference on Software Engineering, 2018, pp. 94–104.
[19] H. Batra, N. S. Punn, S. K. Sonbhadra, and S. Agarwal, “BERT-based sentiment analysis: A software engineering perspective,” in International Conference on Database and Expert Systems Applications. Springer, 2021, pp. 138–148.
[20] M. Thelwall, K. Buckley, G. Paltoglou, D. Cai, and A. Kappas, “Sentiment strength detection in short informal text,” Journal of the American Society for Information Science and Technology, vol. 61, no. 12, pp. 2544–2558, 2010.
[21] R. Socher, A. Perelygin, J. Wu, J. Chuang, C. D. Manning, A. Y. Ng, and C. Potts, “Recursive deep models for semantic compositionality over a sentiment treebank,” in Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, 2013, pp. 1631–1642.
[22] M. Pennacchiotti and A.-M. Popescu, “Democrats, republicans and starbucks afficionados: User classification in Twitter,” in Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2011, pp. 430–438.
[23] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of deep bidirectional transformers for language understanding,” arXiv preprint arXiv:1810.04805, 2018.
[24] Y. Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer, and V. Stoyanov, “RoBERTa: A robustly optimized BERT pretraining approach,” arXiv preprint arXiv:1907.11692, 2019.
[25] Z. Yang, Z. Dai, Y. Yang, J. Carbonell, R. R. Salakhutdinov, and Q. V. Le, “XLNet: Generalized autoregressive pretraining for language understanding,” Advances in Neural Information Processing Systems, vol. 32, 2019.
[26] L. Villarroel, G. Bavota, B. Russo, R. Oliveto, and M. Di Penta, “Release planning of mobile apps based on user reviews,” in 2016 IEEE/ACM 38th International Conference on Software Engineering (ICSE). IEEE, 2016, pp. 14–24.
[27] D. Pletea, B. Vasilescu, and A. Serebrenik, “Security and emotion: Sentiment analysis of security discussions on GitHub,” in Proceedings of the 11th Working Conference on Mining Software Repositories, 2014, pp. 348–351.
[28] N. Novielli, F. Calefato, D. Dongiovanni, D. Girardi, and F. Lanubile, “Can we use SE-specific sentiment analysis tools in a cross-platform setting?” in Proceedings of the 17th International Conference on Mining Software Repositories, 2020, pp. 158–168. en_US
dc.identifier.uri http://hdl.handle.net/123456789/1779
dc.description Supervised by Mr. Md. Jubair Ibna Mostafa and Md. Nazmul Haque. This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2022. en_US
dc.description.abstract Sentiment analysis for software engineering has been studied extensively to develop tools and approaches that classify the sentiment polarity of software engineering content. Early work produced customized lexicon-based and supervised tools such as SentiStrength-SE, SentiCR, and Senti4SD. Pre-trained transformer-based models such as BERT, RoBERTa, and XLNet later outperformed these tools, giving improved classification of sentiment polarity for software engineering content when fine-tuned on SE-specific datasets. Although these models perform much better than the earlier tools, there is still considerable room for improvement, which is what we demonstrate in this work. We fine-tune three pre-trained transformer-based models on four gold-standard SE-specific datasets and ensemble them to show the improvement of the ensemble approach over the individual models. We assess performance with two key metrics: weighted-average and macro-average F1 scores. We also apply text augmentation to the datasets that suffer from small size and class imbalance, and evaluate our approaches on the augmented datasets as well. Our results show that the ensemble models outperform the individual pre-trained transformer-based models on the original datasets, and that data augmentation further improves the performance of all the approaches used in this work. (An illustrative sketch of the ensembling, evaluation, and augmentation steps is given after this record.) en_US
dc.language.iso en en_US
dc.publisher Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), Board Bazar, Gazipur, Bangladesh en_US
dc.subject Sentiment Analysis, Pre-Trained Transformer-based Models, Ensembling, Data Augmentation en_US
dc.title Sentiment Analysis for Software Engineering: A Study on the Effectiveness of Data Augmentation and Ensembling using Transformer-based Models en_US
dc.type Thesis en_US
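
A minimal sketch, not the thesis's actual code, of the three ingredients named in the abstract: an EDA-style random-swap augmentation, a majority-vote ensemble over per-model predictions, and the weighted- and macro-average F1 metrics. All model names, prediction lists, and label strings below are hypothetical placeholders; scikit-learn is assumed to be available.

```python
import random
from collections import Counter
from sklearn.metrics import f1_score

def random_swap(sentence, n_swaps=1):
    """EDA-style random swap: exchange the positions of two random words n_swaps times."""
    words = sentence.split()
    for _ in range(n_swaps):
        if len(words) < 2:
            break
        i, j = random.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return " ".join(words)

def majority_vote(*model_preds):
    """Combine per-sentence predictions of several models by majority vote
    (ties fall back to the first model's label)."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*model_preds)]

# Hypothetical polarity predictions from three fine-tuned models on the same test set.
bert_preds    = ["positive", "negative", "neutral", "negative"]
roberta_preds = ["positive", "positive", "neutral", "negative"]
xlnet_preds   = ["negative", "positive", "neutral", "negative"]
gold          = ["positive", "negative", "neutral", "negative"]

ensemble_preds = majority_vote(bert_preds, roberta_preds, xlnet_preds)

# The two metrics reported in the thesis: weighted-average and macro-average F1.
print("weighted F1:", f1_score(gold, ensemble_preds, average="weighted"))
print("macro F1:", f1_score(gold, ensemble_preds, average="macro"))

# Example of augmenting a minority-class sample before fine-tuning.
print(random_swap("the build fails again after the latest merge", n_swaps=2))
```

Majority voting is only one way of combining the three models' outputs; the exact ensembling strategy used in the thesis may differ.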

