Abstract:
The liver is one of the most important organs in the body. It is responsible for controlling the
chemical balance of the bloodstream as well as the removal of waste products among other vital
functions. Early diagnosis of liver disease is important because symptoms often do not appear
until most of the liver is already damaged. Machine learning could be a crucial tool for
predicting liver disease in patients, enabling earlier diagnosis and earlier treatment.
In this study, a dataset with 583 instances was pre-processed and its class imbalance was
handled in five separate ways: Synthetic Minority Oversampling Technique (SMOTE),
Adaptive Synthetic sampling (ADASYN), SMOTE with Conformal Clustering (CC), SMOTE with
Tomek links, and SMOTE with Edited Nearest Neighbor (SMOTE+ENN). Several machine learning
algorithms were then applied, including Decision Tree Classifier, Logistic Regression,
Gaussian Naïve Bayes, Random Forest Classifier, K-Nearest Neighbors, and Support Vector
Machine (SVM). The best result, an accuracy of 98.37%, was obtained with the SVM when
SMOTE+ENN was used as the imbalance handling technique. This study therefore presents a
comparative analysis of the different imbalance handling techniques and identifies
SMOTE+ENN as the best performer for this specific dataset.
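
As an illustration of the oversampling idea shared by the SMOTE-based techniques named above, the following is a minimal sketch and not the study's implementation. It assumes numeric feature vectors and uses toy data rather than the 583-instance liver dataset. The function name `smote_oversample` and its parameters are hypothetical. The ENN cleaning step and the classifiers are omitted; in practice, a library such as imbalanced-learn would likely be used instead.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority points by interpolating between a
    minority sample and one of its k nearest minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of base (excluding base).
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # random position on the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Toy two-feature minority class (illustrative values only).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_oversample(minority, n_new=6, k=3)
```

Each synthetic point lies on a line segment between two existing minority samples, which is why the minority class grows without simply duplicating records.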
Description:
Supervised by
Mr. Mirza Muntasir Nishat,
Assistant Professor,
Department of Electrical and Electronics Engineering (EEE)
Islamic University of Technology (IUT)
Board Bazar, Gazipur-1704, Bangladesh