Text Simplification Aided with Text Summarization

Show simple item record

dc.contributor.author Dewa, Hamdja Bia
dc.contributor.author Gandega, Abubakar
dc.contributor.author Zouleiha, Mbouwap Njoya
dc.date.accessioned 2025-03-10T06:21:52Z
dc.date.available 2025-03-10T06:21:52Z
dc.date.issued 2024-07-08
dc.identifier.uri http://hdl.handle.net/123456789/2371
dc.description Supervised by Dr. Md. Azam Hossain, Associate Professor, and Mr. Md. Shihab Shariar, Lecturer, Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), Board Bazar, Gazipur, Bangladesh. This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2024. en_US
dc.description.abstract Effective text simplification strategies are increasingly necessary in the constantly evolving field of Natural Language Processing (NLP), especially for helping audiences from diverse backgrounds understand complex material. This paper investigates the integration of text summarization as a critical tool for improving context awareness in text simplification. By leveraging sophisticated algorithms for both summarization and simplification, our goal is to enhance the readability and accessibility of intricate textual material. The proposed procedure consists of several steps. First, advanced text summarization techniques extract the most important information from the original text, capturing the essence of its content. The resulting summary then serves as the basis for the simplification process. Through a thorough treatment of vocabulary, linguistic structures, and context, the simplified text is designed to be easier to read while preserving the main ideas. Our findings highlight the mutually beneficial relationship between text summarization and simplification, demonstrating how the former guides the latter. By using context-aware summarization, our approach preserves important textual elements and relationships, producing a condensed version that retains the context and intended meaning. This paper introduces a new document-level text simplification task in which we first summarize a document, then concatenate the summary with the original document, and finally simplify the concatenated document, improving simplification quality by conserving the context of the original document. The combination of these two methods therefore improves the overall clarity of textual content and creates new opportunities for future study at the nexus of accessibility and natural language processing. en_US
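The summarize-then-concatenate-then-simplify pipeline described in the abstract can be sketched as follows. This is a minimal illustrative sketch only: the `summarize` and `simplify` stubs stand in for the actual neural models used in the thesis (e.g. fine-tuned summarization and simplification transformers), and the function names, separator token, and stub behavior are assumptions, not the authors' implementation.

```python
def summarize(document: str, max_sentences: int = 2) -> str:
    """Placeholder summarizer: keeps the first few sentences as a crude
    extractive summary. A real system would use a neural abstractive model."""
    sentences = document.split(". ")
    return ". ".join(sentences[:max_sentences]).rstrip(".") + "."


def simplify(text: str) -> str:
    """Placeholder simplifier: returns the text unchanged. A real system
    would rewrite it with simpler vocabulary and syntax."""
    return text


def simplify_with_summary(document: str, sep: str = " [SEP] ") -> str:
    """The proposed pipeline: summarize the document, prepend the summary
    as context, then simplify the concatenated input as a whole."""
    summary = summarize(document)
    concatenated = summary + sep + document
    return simplify(concatenated)


doc = ("The committee deliberated at length. A consensus eventually emerged. "
       "The proposal was adopted.")
print(simplify_with_summary(doc))
```

The key design point is that the simplifier sees both the summary and the full document, so the summary acts as a compact context signal that guides the simplification toward the document's main ideas.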
dc.language.iso en en_US
dc.publisher Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), Board Bazar, Gazipur-1704, Bangladesh en_US
dc.subject text simplification; summarization; concatenated dataset en_US
dc.title Text Simplification Aided with Text Summarization en_US
dc.type Thesis en_US

