Fake News Detection with Credibility Signals

Tarannum, Prerana; Roza, Sabrina Sajneen; Lamia, Rifa Sanjita

dc.contributor.author	Tarannum, Prerana
dc.contributor.author	Roza, Sabrina Sajneen
dc.contributor.author	Lamia, Rifa Sanjita
dc.date.accessioned	2025-03-06T08:02:40Z
dc.date.available	2025-03-06T08:02:40Z
dc.date.issued	2024-06-25
dc.identifier.uri	http://hdl.handle.net/123456789/2363
dc.description	Supervised by Dr. Md. Azam Hossain, Associate Professor, Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), Board Bazar, Gazipur, Bangladesh. This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2024.	en_US
dc.description.abstract	In an era of information overload, it is necessary to verify whether online content is true or false. Detecting false information is a major challenge because online content often contains misinformation. The proposed model detects misinformation by generating credibility signals that indicate the truthfulness and authenticity of the content. It combines machine learning techniques with the extraction of credibility signals from online content to assess how reliable that content is. The approach leverages eight distinct credibility signals: Emotional Violence, Incorrect Spelling, Evidence, Source Credibility, Polarized Language, Bias, Writing Quality, and Contradiction of Established Facts. The proposed methodology serves as a high-quality, efficient fact checker on the PolitiFact dataset, which contains more than 21,000 unique, verified statements. Because extracting the credibility signals makes robust classification of a statement more complex, the proposed model introduces a comprehensive pipeline to improve the adaptability of fake news detection across different online content. We also conducted an experimental study comparing the model's performance with several pre-trained models: BERT, RoBERTa, XLNet, ALBERT, and ChatGPT. The proposed model shows a significant improvement in accuracy and F1 score, making it better suited to detecting fake news. Despite this progress, the study highlights challenges in extracting the credibility signals. Future research aims to integrate more credibility signals to enhance the model's performance and explainability, and to extend the work to multilingual settings. The model also introduces an effective prompting strategy for assessing the trustworthiness of online content, leading to more accurate predictions.	en_US
dc.language.iso	en	en_US
dc.publisher	Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), Board Bazar, Gazipur-1704, Bangladesh	en_US
dc.subject	Fake news detection, credibility signals, random forest, soft voting, gradient boosting	en_US
dc.title	Fake News Detection with Credibility Signals	en_US
dc.type	Thesis	en_US