A Deep Neural Network-Based Information Retrieval Method for Complex Question-Answering Systems

Document Type: Original Article

Authors

1 PhD student, University of Science and Technology, Tehran, Iran

2 Associate Professor, University of Science and Technology, Tehran, Iran

3 Professor, Imam Hossein University, Tehran, Iran

Abstract

Question-answering systems, as the next generation of search engines, can retrieve relevant answers to queries posed in natural language. These systems generally consist of three main components: question processing, information retrieval, and answer extraction, and various methods have been introduced for each. Among these, information retrieval, in particular the selection of relevant paragraphs, is one of the most important. Today, most user queries submitted to question-answering systems are complex. Answering such a query requires first understanding the question and then retrieving multiple documents that are lexically and semantically related to it. In recent years, advances in deep neural network-based learning, together with the introduction of large-scale, high-quality datasets, have drawn researchers' attention to this field. This research introduces a method for selecting relevant paragraphs for complex question-answering systems on the HotpotQA dataset. The type of question is first recognized using a deep neural network. Then, using BERT language models and keywords extracted from the question, relevant paragraphs that contain evidence for the answer are selected in several steps. The results indicate that the proposed method outperforms the baseline method.
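
The selection steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the neural question-type classifier is replaced by a hypothetical cue-word rule, the BERT relevance model by a simple keyword-overlap score, and the way the question type influences selection is not specified in the abstract, so the sketch only reports it. All function names, the stopword list, and the example paragraphs are assumptions made for illustration. The labels "bridge" and "comparison" are the two question types defined in HotpotQA.

```python
import re
from typing import Callable, List, Set, Tuple

# Small illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {"the", "a", "an", "of", "in", "on", "is", "was", "are", "were",
             "who", "what", "which", "when", "where", "that", "by", "to",
             "and", "or", "did", "does", "for", "with", "than"}

def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def extract_keywords(text: str) -> Set[str]:
    """Keep every token that is not a stopword."""
    return {t for t in tokenize(text) if t not in STOPWORDS}

def classify_question(question: str) -> str:
    """Stand-in for the neural question-type classifier (hypothetical rule)."""
    comparison_cues = {"both", "either", "older", "younger", "same"}
    return "comparison" if comparison_cues & set(tokenize(question)) else "bridge"

def overlap_score(query_terms: Set[str], paragraph: str) -> float:
    """Stand-in relevance score; a BERT-based scorer would replace this."""
    if not query_terms:
        return 0.0
    return len(query_terms & extract_keywords(paragraph)) / len(query_terms)

def select_paragraphs(question: str,
                      paragraphs: List[Tuple[str, str]],
                      hops: int = 2,
                      score: Callable[[Set[str], str], float] = overlap_score
                      ) -> List[str]:
    """Pick one paragraph per hop, expanding the query after each pick."""
    query = extract_keywords(question)
    remaining = list(paragraphs)
    selected = []
    for _ in range(hops):
        if not remaining:
            break
        # Score title and body together and keep the best candidate.
        best = max(remaining, key=lambda p: score(query, p[0] + " " + p[1]))
        selected.append(best[0])
        remaining.remove(best)
        # Add the chosen paragraph's keywords so the next hop can follow
        # entities that the question itself does not mention.
        query = query | extract_keywords(best[1])
    return selected

# Hypothetical example data: (title, text) pairs.
example_paragraphs = [
    ("Alpha Film", "Alpha Film is a 1999 drama directed by Jane Roe."),
    ("John Doe", "John Doe is a novelist who starred in Alpha Film."),
    ("Paris", "Paris is the capital of France."),
]
example_question = "Who directed the film starring novelist John Doe?"

print(classify_question(example_question))
print(select_paragraphs(example_question, example_paragraphs))
```

In this toy example the first step picks the paragraph that shares the most keywords with the question, and the second step reaches the paragraph naming the director only because the query was expanded with terms from the first pick. Swapping `overlap_score` for a learned scorer changes the ranking quality but not the control flow.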

Keywords



https://creativecommons.org/licenses/by/4.0/

Volume 12, Issue 1 - Serial Number 45
No. 45, Spring 2024
June 2024
Pages 1-11
  • Received Date: 01 March 2024
  • Revised Date: 14 April 2024
  • Accepted Date: 09 May 2024
  • Publish Date: 02 June 2024