Abstract:
This thesis presents a tool that automatically annotates and classifies
privacy-related user reviews drawn from the Google Play Store and a novel dataset.
As privacy concerns grow increasingly significant in the digital age, the tool aims
to streamline the identification and categorization of privacy-related issues and
suggestions in user feedback. Through extensive experimentation, we found that
ensemble models, particularly those incorporating Random Forest
classifiers, outperformed transformer-based models in accurately identifying and
categorizing these privacy issues. The tool demonstrated robust performance even with
smaller datasets, indicating its potential applicability in real-world scenarios.
Additionally, the study highlights the importance of tailored data augmentation techniques
for different machine learning algorithms. Our findings suggest that integrating this
tool can provide developers with actionable insights to enhance the privacy aspects of
their applications. Future research can explore the use of larger datasets and further
optimization of data augmentation strategies to improve model performance.
Description:
Supervised by
Mr. Shohel Ahmed,
Assistant Professor,
Department of Computer Science and Engineering (CSE),
Islamic University of Technology (IUT),
Board Bazar, Gazipur, Bangladesh
This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Software Engineering, 2024.