Development Of An Explainable Natural Language Query Driven Data Visualization System

Ashmafee, Md. Hamjajul

dc.contributor.author	Ashmafee, Md. Hamjajul
dc.date.accessioned	2023-04-27T08:45:17Z
dc.date.available	2023-04-27T08:45:17Z
dc.date.issued	2022-05-30
dc.identifier.uri	http://hdl.handle.net/123456789/1850
dc.description	Supervised by Dr. Abu Raihan Mostafa Kamal, Professor, Department of Computer Science and Engineering, Islamic University of Technology, Board Bazar, Gazipur-1704, Bangladesh. This thesis is submitted in partial fulfilment of the requirements for the degree of M.Sc. in Computer Science and Engineering	en_US
dc.description.abstract	Visual interactive systems (Vis) are attracting growing attention in research and industry because of their effectiveness in conveying information. Vis is also critical for identifying and comprehending trends, outliers, and patterns in data, which supports rational, data-driven decisions. Existing research has employed a broad range of methodologies to yield visualization insights for particular decision-making systems, allowing participants to view a specific problem from a wide range of viewpoints. However, there is still considerable scope to design a new Vis, especially one built on a visualization-oriented natural language interface (V-NLI), where state-of-the-art NLP techniques are used to visualize data according to the user's natural language (NL) queries. Furthermore, in several real-life decision-making scenarios, such data visualization (DV) tools must provide proper explanations to build trust in the model's predictions. In this regard, we propose a framework for an explainable V-NLI-based data visualization system. (i) First, we developed a deep learning-based NLP framework that extracts key information from a given user query to generate the proper visualization type (viz-type). (ii) Next, we extended this model into an explainable visualization model that not only accurately visualizes the desired data but also explains why that visualization appears for the given natural language query (NLQ).	en_US
dc.language.iso	en	en_US
dc.publisher	Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), Board Bazar, Gazipur, Bangladesh	en_US
dc.subject	Data visualization; V-NLI; LSTM; XAI; LIME	en_US
dc.title	Development Of An Explainable Natural Language Query Driven Data Visualization System	en_US
dc.type	Thesis	en_US