http://dx.doi.org/10.22937/IJCSNS.2022.22.6.44

KAB: Knowledge Augmented BERT2BERT Automated Questions-Answering system for Jurisprudential Legal Opinions  

Alotaibi, Saud S. (Department of Information Systems, Umm Al-Qura University)
Munshi, Amr A. (Department of Information Systems, Umm Al-Qura University)
Farag, Abdullah Tarek (Capiter)
Rakha, Omar Essam (Faculty of Engineering, Ain Shams University)
Al Sallab, Ahmad A. (Faculty of Engineering, Cairo University)
Alotaibi, Majid (Department of Computer Engineering, Umm Al-Qura University)
Publication Information
International Journal of Computer Science & Network Security, vol. 22, no. 6, 2022, pp. 346-356
Abstract
Jurisprudential legal rules govern how Muslims conduct and interact in their daily lives. This generates a large stream of questions that must be answered by highly qualified, well-educated scholars called Muftis. With Muslims representing almost 25% of the world's population, and with qualified Muftis being scarce, there is a supply-demand gap that calls for automated solutions. This motivates applying Artificial Intelligence (AI) to the problem, in the form of a well-designed Question-Answering (QA) system. In this work, we propose a QA system for jurisprudential legal questions based on a retrieval-augmented generative transformer model. The main idea of the proposed architecture is to leverage both state-of-the-art transformer models and the existing knowledge base of legal sources and question-answer pairs. Mindful of the sensitivity of the domain and its importance in Muslims' daily lives, our design balances exploitation of the knowledge base against the exploration afforded by generative transformer models. We collect a custom dataset of 850,000 entries, each comprising a question, its answer, and the question's category. Our evaluation methodology combines quantitative and qualitative methods: we use metrics such as BERTScore and METEOR to evaluate the precision and recall of the system, and we present numerous qualitative results showing the quality of the generated answers and their relevance to the questions asked.
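The exploitation/exploration balance described in the abstract can be sketched as follows. This is an illustrative assumption, not the paper's actual design: all names here (`answer_question`, `SIM_THRESHOLD`, the bag-of-words cosine similarity, the stub generator) are hypothetical stand-ins; the paper uses a retrieval-augmented BERT2BERT transformer rather than this toy matcher.

```python
# Hypothetical sketch: answer from a knowledge base of vetted question-answer
# pairs when a sufficiently similar past question exists (exploitation), else
# fall back to a generative model (exploration). The similarity function and
# threshold are illustrative assumptions, not the paper's method.
import math
from collections import Counter

def cosine_sim(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words vectors of two questions."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    na = math.sqrt(sum(c * c for c in va.values()))
    nb = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

SIM_THRESHOLD = 0.8  # above this, trust the stored answer (exploitation)

def answer_question(question, knowledge_base, generate):
    """Return (answer, source): a stored answer if a close match exists in
    knowledge_base (a list of (question, answer) pairs), otherwise the
    output of the generative model."""
    best_answer, best_score = None, 0.0
    for q, a in knowledge_base:
        s = cosine_sim(question, q)
        if s > best_score:
            best_answer, best_score = a, s
    if best_score >= SIM_THRESHOLD:
        return best_answer, "knowledge-base"
    return generate(question), "generative-model"

# Toy usage with a stub generator standing in for the generative model.
kb = [("is fasting required while travelling", "A traveller may shorten ...")]
stub_generate = lambda q: "[generated answer]"
print(answer_question("is fasting required while travelling", kb, stub_generate))
```

A production system would replace the bag-of-words match with dense embeddings and route low-confidence retrievals to the generative model, but the routing logic is the same.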
Keywords
Islamic Fatwa; Natural Language Processing; Question Answering; Transformers;
Citations & Related Records
References
1 B. Hamoud and E. Atwell, "Quran question and answer corpus for data mining with WEKA," in 2016 Conference of Basic Sciences and Engineering Studies (SGCAC), 2016, pp. 211-216.
2 M. T. Sihotang, I. Jaya, A. Hizriadi, and S. M. Hardi, "Answering Islamic Questions with a Chatbot using Fuzzy String-Matching Algorithm," in Journal of Physics: Conference Series, 2020, vol. 1566, no. 1, p. 12007.
3 P. Lewis et al., "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks," May 2020, Accessed: Jan. 31, 2022. [Online]. Available: http://arxiv.org/abs/2005.11401.
4 AiIftaSA, "alifta," https://www.alifta.gov.sa.
5 DarAlIftaEG, "Dar-al-ifta," https://www.daralifta.org/ar/Default.aspx?sec=fatwa&1&Home=1.
6 Islamway, "islamway," https://ar.islamway.net/fatawa/source/.
7 Islamweb, "islamweb," https://www.islamweb.net/ar/.
8 Islamonline, "islamonline," https://islamonline.net/.
9 A. Al-sallab, R. Baly, H. Hajj, K. B. Shaban, W. El-hajj, and G. Badaro, "AROMA: A Recursive Deep Learning Model for Opinion Mining in Arabic as a Low Resource Language," vol. 16, no. 4, 2017.
10 A. M. Abu Nada, E. Alajrami, A. A. Al-Saqqa, and S. S. Abu-Naser, "Arabic Text Summarization Using AraBERT Model Using Extractive Text Summarization Approach," 2020.
11 T. Naous, W. Antoun, R. A. Mahmoud, and H. Hajj, "Empathetic BERT2BERT Conversational Model: Learning Arabic Language Generation with Little Data," Mar. 2021, Accessed: Jan. 29, 2022. [Online]. Available: https://arxiv.org/abs/2103.04353.
12 W. Antoun, F. Baly, and H. Hajj, "AraBERT: Transformer-based model for Arabic language understanding," arXiv preprint arXiv:2003.00104, 2020.
13 AlIftaJO, "alifta-jo," https://aliftaa.jo/.
14 AskFM98k, "askfm98k," https://omarito.me/arabic-askfmdataset/.
15 B. Athiwaratkun, A. G. Wilson, and A. Anandkumar, "Probabilistic FastText for multi-sense word embeddings," arXiv preprint arXiv:1806.02901, 2018.
16 Binbaz, "binbaz," https://binbaz.org.sa/fatwas/kind/1.
17 Binothaimeen, "binothaimeen," https://binothaimeen.net/site.
18 Islamqa, "islamqa," https://islamqa.info/.
19 S. Banerjee and A. Lavie, "METEOR: An automatic metric for MT evaluation with improved correlation with human judgments," in Proceedings of the acl workshop on intrinsic and extrinsic evaluation measures for machine translation and/or summarization, 2005, pp. 65-72.
20 A. B. Soliman, K. Eissa, and S. R. El-Beltagy, "AraVec: A set of Arabic word embedding models for use in Arabic NLP," Procedia Comput. Sci., vol. 117, pp. 256-265, 2017.
21 A. Vaswani et al., "Attention is all you need," arXiv preprint arXiv:1706.03762, 2017.
22 J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, "BERT: Pre-training of deep bidirectional transformers for language understanding," arXiv preprint arXiv:1810.04805, 2018.
23 A. Abdi, S. Hasan, M. Arshi, S. M. Shamsuddin, and N. Idris, "A question answering system in hadith using linguistic knowledge," Comput. Speech & Lang., vol. 60, p. 101023, 2020.
24 M. E. Peters et al., "Deep contextualized word representations," arXiv preprint arXiv:1802.05365, 2018.
25 W. Antoun, F. Baly, and H. Hajj, "AraBERT: Transformer-based Model for Arabic Language Understanding," Feb. 2020, Accessed: Jul. 05, 2021. [Online]. Available: http://arxiv.org/abs/2003.00104.
26 M. Djandji, F. Baly, H. Hajj, and others, "Multi-Task Learning using AraBert for Offensive Language Detection," in Proceedings of the 4th Workshop on Open-Source Arabic Corpora and Processing Tools, with a Shared Task on Offensive Language Detection, 2020, pp. 97-101.
27 J. Howard and S. Ruder, "Universal language model fine-tuning for text classification," arXiv preprint arXiv:1801.06146, 2018.
28 C. Chen et al., "bert2BERT: Towards Reusable Pretrained Language Models," Oct. 2021, Accessed: Jan. 31, 2022. [Online]. Available: http://arxiv.org/abs/2110.07143.
29 T. Zhang, V. Kishore, F. Wu, K. Q. Weinberger, and Y. Artzi, "BERTScore: Evaluating Text Generation with BERT," Apr. 2019, Accessed: Feb. 02, 2022. [Online]. Available: https://arxiv.org/abs/1904.09675.
30 D. Bahdanau, K. Cho, and Y. Bengio, "Neural machine translation by jointly learning to align and translate," arXiv preprint arXiv:1409.0473, 2014.
31 M.-T. Luong, H. Pham, and C. D. Manning, "Effective Approaches to Attention-based Neural Machine Translation," Aug. 2015, Accessed: Aug. 09, 2018. [Online]. Available: http://arxiv.org/abs/1508.04025.