Anomaly Detection System in Industrial Control System using Machine Learning


dc.contributor.author Nabil, Ahammed Sakir
dc.contributor.author Rahman, Ahnaf Akif
dc.contributor.author Ahmed, Imtihan
dc.date.accessioned 2023-04-13T08:16:59Z
dc.date.available 2023-04-13T08:16:59Z
dc.date.issued 2022-05-30
dc.identifier.uri http://hdl.handle.net/123456789/1842
dc.description Supervised by Mr. Safayat Bin Hakim, Assistant Professor, Department of Electrical and Electronic Engineering (EEE), Islamic University of Technology (IUT), Board Bazar, Gazipur-1704, Bangladesh. This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Electrical and Electronic Engineering, 2022. en_US
dc.description.abstract An industrial plant comprises many machines and instruments interconnected through a network and operating together according to instructions assigned to specific nodes or equipment. An industrial control system (ICS) is the overall environment that keeps every part of the plant operating in order. Like any other system, an ICS is vulnerable to attacks, which can cause massive losses. In this work, six machine learning algorithms are applied to detect anomalies in an industrial control system using the HIL-based Augmented ICS (HAI 21.03) Security Dataset. Analysis of variance (ANOVA) is used to extract 50 of the most important features from each sample in the dataset. The performance of every model is recorded, and a full comparative analysis is presented for three settings: without hyperparameter tuning, with hyperparameter optimization, and with downsampling-upsampling combined with hyperparameter tuning. Random search cross-validation is employed for hyperparameter optimization, and the synthetic minority oversampling technique (SMOTE) is used for upsampling. Satisfactory performance is observed across several evaluation metrics: accuracy, recall, precision, F1-score, Receiver Operating Characteristic (ROC) Area Under the Curve (AUC), and specificity. In addition to these metrics, which have also been used by other researchers in previous studies, we evaluate our models using the Geometric Mean (G-Mean) and the Matthews Correlation Coefficient (MCC), which are considered two of the most important evaluation metrics for imbalanced datasets. Our proposed approach achieves a maximum recall of 99.77% and an F1-score of 99.50%, which are significantly higher than those reported in previous studies. The K-Nearest Neighbors (KNN) model obtains the maximum G-Mean of 99.89% and MCC of 0.9950. Therefore, our proposed approach has the potential to be an efficient method for detecting anomalies in industrial control systems and taking appropriate actions. en_US
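The two imbalance-aware metrics named in the abstract can be illustrated with a minimal Python sketch. It assumes the standard definitions (G-Mean as the square root of recall times specificity, MCC computed from the binary confusion matrix); the record does not state the exact formulas the thesis used, and the function names and toy labels below are hypothetical, not data from the HAI dataset.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, TN, FN for binary labels (hypothetical helper)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def g_mean(tp, fp, tn, fn):
    """Geometric mean of recall (sensitivity) and specificity."""
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(recall * specificity)


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; 0.0 when the denominator vanishes."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy, made-up labels: 8 normal samples (0) and 2 anomalous samples (1).
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 7 + [1] + [1, 0]

counts = confusion_counts(y_true, y_pred)
accuracy = (counts[0] + counts[2]) / len(y_true)
print("accuracy:", accuracy)
print("G-Mean:", g_mean(*counts))
print("MCC:", mcc(*counts))
```

On this toy input the accuracy is 0.8 even though only one of the two anomalies is detected, while G-Mean and MCC are noticeably lower. That gap is the usual argument for reporting both metrics on imbalanced data.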
dc.language.iso en en_US
dc.publisher Department of Electrical and Electronic Engineering, Islamic University of Technology (IUT), The Organization of Islamic Cooperation (OIC), Board Bazar, Gazipur-1704, Bangladesh en_US
dc.subject Machine Learning, Industrial Control System, Anomaly Detection System, Threat Detection System, KNN, MLP, ANOVA en_US
dc.title Anomaly Detection System in Industrial Control System using Machine Learning en_US
dc.type Thesis en_US

