Fake News Detection in Bengali Language using Transfer Learning Approach

dc.contributor.author Sadat, S M Abu Rafi
dc.contributor.author Srijon, Al Muhtasim
dc.contributor.author Dihan, Hasnaine Ahmed
dc.date.accessioned 2024-01-18T05:39:50Z
dc.date.available 2024-01-18T05:39:50Z
dc.date.issued 2023-05-30
dc.identifier.citation [1] M. Z. Hossain, M. A. Rahman, M. S. Islam, and S. Kar, “BanFakeNews: A dataset for detecting fake news in Bangla,” in Proceedings of the Twelfth Language Resources and Evaluation Conference. Marseille, France: European Language Resources Association, May 2020, pp. 2862–2871. [Online]. Available: https://aclanthology.org/2020.lrec-1.349
[2] A. Conneau, K. Khandelwal, N. Goyal, V. Chaudhary, G. Wenzek, F. Guzmán, E. Grave, M. Ott, L. Zettlemoyer, and V. Stoyanov, “Unsupervised cross-lingual representation learning at scale,” CoRR, vol. abs/1911.02116, 2019. [Online]. Available: http://arxiv.org/abs/1911.02116
[3] C.-F. Chen, Q. Fan, and R. Panda, “Crossvit: Cross-attention multi-scale vision transformer for image classification,” 2021.
[4] N. R. Bethan Staton, “Donald trump catchphrase ’fake news’ wins word of the year,” (Nov. 2, 2017), [Feb. 21, 2023]. [Online]. Available: https://news.sky.com/story/donald-trump-catchphrase-fake-news-wins-word-of-the-year-11109333
[5] M. Martínez, “Burned to death because of a rumour on whatsapp,” (Nov. 12, 2018), [Accessed Nov. 12, 2022]. [Online]. Available: https://www.bbc.com/news/world-latin-america-46145986
[6] P. Mozur, “A genocide incited on facebook, with posts from myanmar’s military,” (Oct. 15, 2018), [Oct. 15, 2022]. [Online]. Available: https://www.nytimes.com/2018/10/15/technology/myanmar-facebook-genocide.html
[7] Management and R. D. Initiative, “News literacy in bangladesh, national survey,” (2020), [Jan. 11, 2023]. [Online]. Available: https://mrdibd.org/wp-content/uploads/2021/07/News-Literacy-in-Bangladesh-National-Survey.pdf
[8] R. Team, “Rumor: Thankuni will prevent coronavirus,” (18 March, 2020), [Jan. 11, 2023]. [Online]. Available: https://rumorscanner.com/fact-check/archives/750
[9] R. Rafe, “Bangladesh: Fake news on facebook fuels violence,” (11 Jan. 2019), [Jan. 11, 2023]. [Online]. Available: https://www.dw.com/en/bangladesh-fake-news-on-facebook-fuels-communal-violence/a-51083787
[10] B. R. COX’S BAZAR, “Muslim protesters torch buddhist temples, homes in bangladesh,” (30 Sep. 2012), [Jan. 11, 2023]. [Online]. Available: https://www.reuters.com/article/uk-bangladesh-temples-idUKBRE88T03Q20120930
[11] B. F. Check, “Bd fact check,” [Feb. 21, 2023]. [Online]. Available: https://bdfactcheck.com/
[12] Jachai, “Jachai,” [Feb. 21, 2023]. [Online]. Available: https://www.jachai.org/
[13] rumor scanner, “Rumor scanner bd,” [Feb. 21, 2023]. [Online]. Available: https://rumorscanner.com/
[14] K. Shu, A. L. Sliva, S. Wang, J. Tang, and H. Liu, “Fake news detection on social media: A data mining perspective,” ArXiv, vol. abs/1708.01967, 2017.
[15] H. Rashkin, E. Choi, J. Y. Jang, S. Volkova, and Y. Choi, “Truth of varying shades: Analyzing language in fake news and political fact-checking,” in Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. Copenhagen, Denmark: Association for Computational Linguistics, Sep. 2017, pp. 2931–2937. [Online]. Available: https://aclanthology.org/D17-1317
[16] O. Sen, M. Fuad, M. N. Islam, J. Rabbi, M. Masud, M. K. Hasan, M. Awal, A. Fime, M. T. Fuad, D. Sikder, and A. R. Iftee, “Bangla natural language processing: A comprehensive analysis of classical, machine learning, and deep learning based methods,” IEEE Access, vol. 10, pp. 1–1, 01 2022.
[17] S. Sazzed, “Cross-lingual sentiment classification in low-resource bengali language,” 01 2020, pp. 50–60.
[18] S. Ruíz, E. Providel, and M. Mendoza, “Fake news detection via english-to-spanish translation: Is it really useful?” in Social Computing and Social Media: Experience Design and Social Network Analysis, G. Meiselwitz, Ed. Cham: Springer International Publishing, 2021, pp. 136–148.
[19] A. Anjum, M. Keya, A. K. Mohammad Masum, and S. R. Haider Noori, “Fake and authentic news detection using social data strivings,” in 2021 12th International Conference on Computing Communication and Networking Technologies (ICCCNT), 2021, pp. 1–5.
[20] S. B. S. Mugdha, S. M. Ferdous, and A. Fahmin, “Evaluating machine learning algorithms for bengali fake news detection,” in 2020 23rd International Conference on Computer and Information Technology (ICCIT), 2020, pp. 1–6.
[21] F. Islam, M. M. Alam, S. M. Shahadat Hossain, A. Motaleb, S. Yeasmin, M. Hasan, and R. M. Rahman, “Bengali fake news detection,” in 2020 IEEE 10th International Conference on Intelligent Systems (IS), 2020, pp. 281–287.
[22] “Detection of fake news using deep learning cnn–rnn based methods,” ICT Express, vol. 8, no. 3, pp. 396–408, 2022.
[23] R. K. Kaliyar, A. Goswami, and P. Narang, “Fakebert: Fake news detection in social media with a bert-based deep learning approach,” Multimedia Tools Appl., vol. 80, no. 8, pp. 11765–11788, Mar. 2021. [Online]. Available: https://doi.org/10.1007/s11042-020-10183-2
[24] R. I. Rasel, A. H. Zihad, N. Sultana, and M. M. Hoque, “Bangla fake news detection using machine learning, deep learning and transformer models,” in 2022 25th International Conference on Computer and Information Technology (ICCIT), 2022, pp. 959–964.
[25] N. M. Jakilim, S. Mahamudul Hasan, and E. Hassan, “A benchmark of machine learning and deep learning algorithms for detecting fake news in bangla language,” in 2022 4th International Conference on Sustainable Technologies for Industry 4.0 (STI), 2022, pp. 1–6.
[26] N. Khan, M. S. Islam, F. Chowdhury, A. S. Siham, and N. Sakib, “Bengali crime news classification based on newspaper headlines using nlp,” in 2022 25th International Conference on Computer and Information Technology (ICCIT), 2022, pp. 194–199.
[27] C. S. Office, “Household internet security and information integrity 2021,” (2021), [Dec. 25, 2022]. [Online]. Available: https://www.cso.ie/en/releasesandpublications/ep/p-isshisi/householdinternetsecurityandinformationintegrity2021/informationintegrity/
[28] S. Sarker, “Bnlp toolkit,” (Latest Release: Apr 29, 2023). [Online]. Available: https://pypi.org/project/bnlp-toolkit/
[29] A. Hossain, “Bnltk,” (Latest Release: Jun 29, 2019). [Online]. Available: https://pypi.org/project/bnltk/
[30] H. F. Team, “Hugging face,” [Feb. 21, 2023]. [Online]. Available: https://huggingface.co/docs/transformers/index
[31] J. Devlin, M. Chang, K. Lee, and K. Toutanova, “BERT: pre-training of deep bidirectional transformers for language understanding,” CoRR, vol. abs/1810.04805, 2018. [Online]. Available: http://arxiv.org/abs/1810.04805
[32] ——, “BERT: pre-training of deep bidirectional transformers for language understanding,” CoRR, vol. abs/1810.04805, 2018. [Online]. Available: http://arxiv.org/abs/1810.04805
[33] S. Sarker, “Banglabert: Bengali mask language model for bengali language understading,” 2020. [Online]. Available: https://github.com/sagorbrur/bangla-bert
[34] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “Bert: Pre-training of deep bidirectional transformers for language understanding,” 2019.
[35] Y. Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer, and V. Stoyanov, “Roberta: A robustly optimized bert pretraining approach,” 2019.
[36] X. Song, A. Salcianu, Y. Song, D. Dopson, and D. Zhou, “Fast wordpiece tokenization,” 2021. en_US
dc.identifier.uri http://hdl.handle.net/123456789/2052
dc.description Supervised by Mr. Md. Hamjajul Ashmafee, Assistant Professor, Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), Board Bazar, Gazipur-1704, Bangladesh en_US
dc.description.abstract The rapid proliferation of fake news poses grave consequences for civil discourse, political environments, and social cohesion. From public elections to mob violence, fake news has been leveraged for personal and political gain. The influence of fake news in this era of information is undeniable. Misinformation can cause mass disruption, and we need a way to stop that from happening. This study experiments with the widely used multilingual pre-trained transformers XLM-Roberta and Multilingual-BERT, along with Bangla-BERT. It also explores the impact of stemming in Bengali and demonstrates the effectiveness of combining deep neural network (DNN) layers with pre-trained transformers. A major setback faced in this research was the lack of a well-balanced dataset, which led to inconsistent performance from the models. We derived two undersampled datasets from the original one [1]: one with the ratio fake:authentic = 1:1 and another with fake:authentic = 1:3. We achieved 0.95 precision and a 0.92 F1 score on a heavily undersampled but well-balanced dataset derived from the original one. XLM-Roberta and Bangla-BERT based models achieved recall scores of 0.94 and 0.93, respectively, on the dataset where the ratio of fake:authentic is 1:1. Overall, the models trained on the 1:1 dataset delivered consistent scores across all the metrics, which emphasizes the importance of collecting more fake news data for future research. The best model, based on Bangla-BERT, achieved an accuracy of 96.2%, which sets a new benchmark accuracy for transformer-based models in fake news detection in Bengali. en_US
dc.language.iso en en_US
dc.publisher Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), Board Bazar, Gazipur-1704, Bangladesh en_US
dc.subject Transformers, BERT, Roberta, DNN, Multilingual, Bangla, Fake News en_US
dc.title Fake News Detection in Bengali Language using Transfer Learning Approach en_US
dc.type Thesis en_US
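
The abstract describes deriving two undersampled datasets with fake:authentic ratios of 1:1 and 1:3. The record does not say how the sampling was done, so the following is only a minimal sketch of one plausible approach: keep every fake item and randomly sample the authentic class down to the target ratio. The function name `undersample`, the (text, label) record layout, the label encoding (0 = fake, 1 = authentic), and the fixed seed are all assumptions made for illustration, not details taken from the thesis.

```python
import random


def undersample(records, ratio, seed=0):
    """Keep all fake items and sample `ratio` authentic items per fake item.

    `records` is a list of (text, label) pairs. The encoding 0 = fake,
    1 = authentic is an assumption of this sketch.
    """
    fake = [r for r in records if r[1] == 0]
    authentic = [r for r in records if r[1] == 1]
    # Never ask for more authentic items than are available.
    n_authentic = min(len(authentic), ratio * len(fake))
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    subset = fake + rng.sample(authentic, n_authentic)
    rng.shuffle(subset)  # avoid a block of fake items followed by authentic ones
    return subset


# Toy stand-in for an imbalanced corpus: 10 fake and 100 authentic items.
toy = [(f"fake {i}", 0) for i in range(10)]
toy += [(f"authentic {i}", 1) for i in range(100)]

balanced = undersample(toy, ratio=1)  # fake:authentic = 1:1
skewed = undersample(toy, ratio=3)    # fake:authentic = 1:3
```

Sampling only the majority class preserves every fake example, which is the scarce resource the abstract points to; the cost is that most authentic items are discarded.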

