Multihop Factual Claim Verification Using Natural Language Prompts

Show simple item record

dc.contributor.author Rahman, Md. Mezbaur
dc.date.accessioned 2024-01-18T08:51:12Z
dc.date.available 2024-01-18T08:51:12Z
dc.date.issued 2023-06-30
dc.identifier.uri http://hdl.handle.net/123456789/2066
dc.description Supervised by Dr. Md. Azam Hossain, Assistant Professor, Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), Board Bazar, Gazipur-1704, Bangladesh en_US
dc.description.abstract Verifying a claim or statement against factual evidence can be challenging, especially when the evidence spans multiple sentences, making it difficult for NLP models to capture long-range dependencies. Most existing datasets provide claims that can be verified by single-hop reasoning, i.e., the evidence needed to support or refute the claim can be found in a single source. The task becomes substantially harder when evidence must be mined from multiple sources to reach a correct verdict, and methods that succeed at single-hop verification struggle when a claim requires multi-hop evidence. In light of the success of prompt learning in various NLP applications, this thesis introduces prompt learning for the multi-hop claim verification task. Through extensive experimentation, our proposed prompt-based method, which employs manually constructed prompts, has yielded promising results. By fine-tuning language models with prompts, we achieve an accuracy of 83.9%, along with improved cross-domain generalization. Additionally, experiments in few-shot and zero-shot settings demonstrate that prompt-based methods outperform traditional supervised techniques that rely on the fine-tuning paradigm. These results underscore the effectiveness of prompt learning for claim verification. en_US
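The abstract describes verification via manually constructed prompts: a claim and its multi-hop evidence are wrapped in a cloze-style template, and the language model's prediction at the mask position is mapped to a verdict by a verbalizer. The minimal sketch below illustrates that general setup only; the template wording, mask token, and label words are hypothetical and not taken from the thesis itself.

```python
# Illustrative sketch of prompt-based claim verification (all names assumed,
# not the thesis's actual templates): a claim plus its evidence sentences are
# embedded in a cloze template, and a verbalizer maps the word the language
# model predicts at the mask back to a verification label.

MASK = "<mask>"  # placeholder; the real token depends on the chosen LM

def build_prompt(claim: str, evidence: list) -> str:
    """Wrap the claim and concatenated multi-hop evidence in a cloze template."""
    context = " ".join(evidence)
    return f"{context} Question: {claim}? Answer: {MASK}."

# Verbalizer: label words the LM might emit at the mask, mapped to verdicts.
VERBALIZER = {
    "yes": "SUPPORTED",
    "no": "REFUTED",
    "maybe": "NOT ENOUGH INFO",
}

def verbalize(predicted_word: str) -> str:
    """Map the LM's predicted word at the mask position to a verdict."""
    return VERBALIZER.get(predicted_word.lower().strip(), "NOT ENOUGH INFO")
```

In a full pipeline, `build_prompt` would feed a masked language model (e.g. a RoBERTa-style model) and the highest-probability label word at the mask would be passed to `verbalize`; fine-tuning then adjusts the LM so the correct label word becomes most likely.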
dc.language.iso en en_US
dc.publisher Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), Board Bazar, Gazipur-1704, Bangladesh en_US
dc.subject Natural Language Processing, Claim Verification, Multihop Claim Verification, Prompt Learning en_US
dc.title Multihop Factual Claim Verification Using Natural Language Prompts en_US
dc.type Thesis en_US

