Improving Faithfulness in Medical Text Summarization: An LLM-Based Approach


dc.contributor.author Oeshy, Nafisa Tabassum
dc.contributor.author Mostofa, Ajwad Abrar
dc.contributor.author Maheru, Prianka
dc.date.accessioned 2025-03-10T08:31:31Z
dc.date.available 2025-03-10T08:31:31Z
dc.date.issued 2024-07-04
dc.identifier.uri http://hdl.handle.net/123456789/2376
dc.description Supervised by Mr. Tareque Mohmud Chowdhury, Assistant Professor; Ms. Farzana Tabassum, Lecturer; and Ms. Sabrina Islam, Lecturer, Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), Board Bazar, Gazipur, Bangladesh. This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2024. en_US
dc.description.abstract Medical text summarization, particularly for consumer health questions, is a critical area of research because of its potential to improve healthcare delivery. For online consumer health questions, efficient communication is paramount: condensing long, complex medical queries while retaining their essential details enables timely and accurate responses from healthcare professionals, which improves patient outcomes and streamlines workflows within healthcare services. A persistent concern, however, is the faithfulness of the summarized queries. Preserving the accuracy of the information, especially complex medical terminology, is of utmost importance. While text summarization research has traditionally focused on accuracy, faithfulness, that is, ensuring the summary correctly represents the source material, has often been overlooked. Our research addresses this gap by improving the faithfulness of summarized consumer health questions in addition to their accuracy. To address these challenges, we leverage large language models (LLMs), which have recently shown significant promise in text summarization tasks; our objective is to improve both the faithfulness and the accuracy of LLMs in summarizing medical texts. To achieve this, we propose a framework that fine-tunes LLMs using domain-specific medical knowledge, balancing concise summarization with precise representation of medical information so that essential details are conveyed faithfully. en_US
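
The abstract's central idea, fine-tuning a pre-trained model on domain-specific data so that condensed consumer health questions stay faithful to the source, can be illustrated with a minimal sketch. This is not the thesis's actual pipeline: the base model (facebook/bart-base as a small stand-in for an LLM), the Hugging Face dataset id (sumedh/MeQSum), its column names (CHQ, Summary), and all hyperparameters are illustrative assumptions.

```python
# Minimal sketch of the fine-tuning idea described in the abstract: adapt a
# pre-trained sequence-to-sequence model to summarize consumer health
# questions. Model id, dataset id, column names, and hyperparameters are
# illustrative assumptions, not the thesis's actual configuration.
from datasets import load_dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

model_name = "facebook/bart-base"  # assumed stand-in for the thesis's LLM
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# MeQSum pairs long consumer health questions with expert-written condensed
# questions; the hub id and column names ("CHQ", "Summary") are assumptions.
dataset = load_dataset("sumedh/MeQSum")

def preprocess(batch):
    # Tokenize the verbose patient question as input and the short,
    # faithful reformulation as the generation target.
    model_inputs = tokenizer(batch["CHQ"], max_length=512, truncation=True)
    labels = tokenizer(text_target=batch["Summary"], max_length=64, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = dataset.map(
    preprocess, batched=True, remove_columns=dataset["train"].column_names
)

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(
        output_dir="chq-summarizer",
        learning_rate=3e-5,
        per_device_train_batch_size=8,
        num_train_epochs=3,
        predict_with_generate=True,
    ),
    train_dataset=tokenized["train"],
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```

At inference time the fine-tuned model's generated summaries would then be screened for faithfulness, for example with an entailment- or QA-based consistency metric, before the condensed question reaches a healthcare professional.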
dc.language.iso en en_US
dc.publisher Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), Board Bazar, Gazipur-1704, Bangladesh en_US
dc.subject Natural Language Processing; Large Language Models; Text Summarization; Consumer Health Questions; Faithfulness en_US
dc.title Improving Faithfulness in Medical Text Summarization: An LLM-Based Approach en_US
dc.type Thesis en_US

