Dialog Generation with a Conversational Agent in a Task-Oriented Context Using a Transformer Architecture

Show simple item record

dc.contributor.author Petouo, Faysal Mounir
dc.contributor.author Arafat, Yaya Issa
dc.date.accessioned 2024-01-18T06:02:01Z
dc.date.available 2024-01-18T06:02:01Z
dc.date.issued 2023-05-30
dc.identifier.uri http://hdl.handle.net/123456789/2055
dc.description Supervised by Prof. Dr. Md. Kamrul Hasan; Co-Supervisor: Dr. Hasan Mahmud, Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), Board Bazar, Gazipur-1704, Bangladesh en_US
dc.description.abstract The use of conversational agents has become increasingly popular in recent years due to their ability to mimic human-like interactions in Human-Computer Interaction (HCI) and to provide personalized assistance to users. However, creating effective dialogues between humans and conversational agents remains challenging, particularly in task-oriented applications, because such applications require agents to understand complex user requests and to generate appropriate responses that take into account the user's goals, preferences, and constraints. To address this challenge, we propose to adapt the LongT5 (Long Text-To-Text Transfer Transformer) architecture, a transformer-based language model well known for its performance across many Natural Language Processing (NLP) tasks, and to explore the use of the resulting model, named MegaT, for generating task-oriented dialogues between a conversational agent and a human user. This involves designing and implementing a task-oriented conversational agent trained on annotated dialogues related to specific tasks. The agent's performance is evaluated using metrics such as belief accuracy, belief loss, response accuracy, and response loss, and the results are analyzed to identify the strengths and weaknesses of the T5 transformer, the current state-of-the-art model in task-oriented dialogue generation. Experimental results demonstrate that MegaT outperforms the T5-based agent in generating accurate, fluent, and coherent responses to user queries, in handling longer sequences of text, and in producing more informative and engaging responses. We also find that our proposed transient global attention for task-oriented dialogue systems produces better results on the MultiWOZ 2.2 dataset than the local attention mechanism used in LongT5. This thesis aims to contribute to the development of more effective conversational agents by leveraging the LongT5 model to generate high-quality task-oriented dialogues. The study provides insights into the use of this recent transformer model, paves the way for further advancements in dialogue generation with conversational agents, and opens new avenues for future research in the field. en_US
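As an illustrative aside: the thesis code and the fine-tuned MegaT weights are not distributed with this record, but the LongT5 transient-global (TGlobal) variant the abstract refers to is available through the Hugging Face transformers library. The sketch below shows how one might run a MinTL-style two-pass generation (first a belief state, then a response) with the pretrained google/long-t5-tglobal-base checkpoint; the prompt strings and the two-pass format are assumptions for illustration, not the thesis's actual setup.

    import torch
    from transformers import AutoTokenizer, LongT5ForConditionalGeneration

    # Pretrained LongT5 with transient-global (TGlobal) attention; a stand-in
    # for the fine-tuned MegaT weights, which are not publicly released.
    CKPT = "google/long-t5-tglobal-base"
    tokenizer = AutoTokenizer.from_pretrained(CKPT)
    model = LongT5ForConditionalGeneration.from_pretrained(CKPT)

    def generate(prompt: str, max_new_tokens: int = 64) -> str:
        """Greedy decoding of a single prompt."""
        inputs = tokenizer(prompt, return_tensors="pt", truncation=True)
        with torch.no_grad():
            out = model.generate(**inputs, max_new_tokens=max_new_tokens)
        return tokenizer.decode(out[0], skip_special_tokens=True)

    # Hypothetical two-pass format: decode a belief state from the dialogue
    # history, then condition the response on both history and belief.
    history = "user: I need a cheap hotel in the centre of town."
    belief = generate("track belief: " + history)
    response = generate(f"respond: {history} belief: {belief}")
    print(belief)
    print(response)

A long-input encoder is attractive here because task-oriented dialogue histories grow with every turn; TGlobal attention keeps encoding cost near-linear in the input length while still letting every token attend to coarse global summaries.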
dc.language.iso en en_US
dc.publisher Department of Computer Science and Engineering(CSE), Islamic University of Technology(IUT), Board Bazar, Gazipur-1704, Bangladesh en_US
dc.subject natural language processing; conversational agents; task-oriented dialogue; long sequences; language models; transformers; HCI en_US
dc.title Dialog Generation with a Conversational Agent in a Task-Oriented Context Using a Transformer Architecture en_US
dc.type Thesis en_US

