Abstract:
In geotechnical and foundation engineering, soil properties are crucial to every process involved in
determining whether the ground is suitable for supporting a given structure. Hence it is essential to
establish the significance of these parameters in advance and to see which have the greatest effect.
The information era is increasingly influencing every industry. Machine learning is one of the newer
approaches to forecasting soil parameters, and the integration of data and digital technologies
opens up many options in the geotechnical field.
The use of artificial intelligence as an inexpensive yet accurate modeling tool has become a promising
prospect, since machine learning approaches have proved efficient at modeling various geotechnical
parameters, including soil shear strength (SSS). Despite this, conventional techniques for estimating soil
properties, which are both costly and time-intensive, are still used because of uncertainty about the
accuracy of prediction models. Thus, a soil shear strength predictive formula is presented as an
alternative to the difficult traditional methods.
The focus of this research is to carry out a comparative, machine learning based study of soil shear
strength prediction through correlation analysis of geotechnical parameters, to observe where previous
models fell short, and to explore how to improve the prediction of a parameter as important as soil shear
strength.
A total of 5 different machine learning models were applied in this study, namely Support
Vector Machine (SVM), K Nearest Neighbor, Decision Tree, and Ridge Regression, the last of which
comprised Linear Regression and Lasso Regression.
We used data from a total of 164 boreholes at two different locations; the soil type was clay.
We used 67% of the data as the training set and 33% as the testing set.
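The 67%/33% split can be illustrated with a minimal sketch. It assumes a simple shuffled split with a fixed seed. The abstract does not say how the split was drawn, how fractional counts were rounded, or whether the data were stratified by location. The function name `split_records`, the seed, and the integer borehole identifiers are hypothetical placeholders, not taken from the thesis.

```python
import random


def split_records(records, train_fraction=0.67, seed=0):
    """Shuffle a copy of the records and split it into (train, test) lists."""
    shuffled = list(records)
    # A seeded generator keeps the split reproducible (the seed is an assumption).
    random.Random(seed).shuffle(shuffled)
    # Rounding to the nearest whole record is an assumption of this sketch.
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder identifiers standing in for the 164 borehole records.
borehole_ids = list(range(164))
train_ids, test_ids = split_records(borehole_ids)
print(len(train_ids), len(test_ids))
```

With 164 records and this rounding rule, the sketch assigns 110 records to training and 54 to testing. A library routine may round differently and give 109 and 55.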
The two evaluation metrics used in this paper were Root Mean Square Error (RMSE) and Mean Absolute
Error (MAE). Pearson Correlation Analysis was also carried out as it measures linear correlation between
two sets of data.
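For reference, the two error metrics and the Pearson coefficient follow their standard definitions and can be sketched in a few lines. The function names and the sample values below are illustrative placeholders, not results or data from the thesis.

```python
import math


def rmse(y_true, y_pred):
    """Root Mean Square Error: square root of the mean squared residual."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean Absolute Error: mean of the absolute residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson_r(x, y):
    """Pearson coefficient: covariance scaled by both standard deviations."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Made-up numbers purely to exercise the functions.
measured = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
print(rmse(measured, predicted), mae(measured, predicted))
print(pearson_r([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]))
```

RMSE penalizes large residuals more heavily than MAE, so reporting both gives a sense of whether a model's errors are dominated by a few large misses.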
We also calculated our results in two ways, once before feature selection and once after feature selection.
The two processes gave different results. The accompanying charts show that the models perform better
after feature selection than before it.
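The abstract does not state how features were selected. One plausible approach, given the use of Pearson correlation analysis, is to keep only the features whose absolute correlation with shear strength exceeds a threshold. The sketch below assumes that approach. The feature names, the toy values, and the threshold of 0.5 are all hypothetical and do not come from the thesis.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def select_features(features, target, threshold=0.5):
    """Return names of features with |r| against the target at or above threshold."""
    return [
        name
        for name, values in features.items()
        if abs(pearson_r(values, target)) >= threshold
    ]


# Toy data: feature_a tracks the target closely, feature_b only weakly.
target = [1.0, 2.0, 3.0, 4.0, 5.0]
features = {
    "feature_a": [2.0, 4.0, 6.0, 8.0, 10.0],
    "feature_b": [5.0, 3.0, 4.0, 5.0, 3.0],
}
selected = select_features(features, target)
print(selected)
```

Each model would then be trained once on all features and once on the selected subset, and the two sets of RMSE and MAE values compared.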
Keywords: Simple linear regression model, Support Vector Regression, Random Forest, Decision Tree,
RMSE, MAE, K Nearest Neighbor, Ridge Regression, Linear Regression and Lasso Regression.
Description:
Supervised by
Prof. Dr. Hossain Md. Shahin,
Department of Civil and Environmental Engineering (CEE),
Islamic University of Technology (IUT),
Board Bazar, Gazipur, Bangladesh.
This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Civil and Environmental Engineering, 2022.