http://dx.doi.org/10.22937/IJCSNS.2021.21.9.37

Automatic Categorization of Islamic Jurisprudential Legal Questions using Hierarchical Deep Learning Text Classifier  

AlSabban, Wesam H. (Department of Information Systems, Umm Al-Qura University)
Alotaibi, Saud S. (Department of Information Systems, Umm Al-Qura University)
Farag, Abdullah Tarek (Speakol)
Rakha, Omar Essam (Faculty of Engineering, Ain Shams University)
Al Sallab, Ahmad A. (Faculty of Engineering, Cairo University)
Alotaibi, Majid (Department of Computer Engineering, Umm Al-Qura University)
Publication Information
International Journal of Computer Science & Network Security, vol. 21, no. 9, 2021, pp. 281-291
Abstract
The Islamic jurisprudential legal system is an essential component of the Islamic religion that governs many aspects of Muslims' daily lives. This gives rise to many questions that require interpretation by qualified specialists, or Muftis, according to the main sources of legislation in Islam. Islamic jurisprudence is usually divided into branches, according to which questions can be categorized and classified. Such categorization has many applications in automated question-answering systems, and in manual systems for routing questions to a Mufti specialized in a particular topic. In this work we tackle the problem of automatic categorization of Islamic jurisprudential legal questions using deep learning techniques. We build a hierarchical deep learning model that first extracts features from the question text at two levels, word and sentence representation, followed by a text classifier that acts upon the question representation. To evaluate our model, we build and release the largest publicly available dataset of Islamic questions and answers, along with their topics, covering 52 topic categories. We evaluate different state-of-the-art deep learning models for both word and sentence embeddings, comparing recurrent and transformer-based techniques, and perform extensive ablation studies to show the effect of each model choice. Our hierarchical model is built on pre-trained models, taking advantage of recent advances in transfer learning techniques focused on the Arabic language.
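The two-level architecture described in the abstract (word representations pooled into sentence vectors, sentence vectors pooled into a question representation, then a classifier over the 52 topic categories) can be sketched as follows. This is a minimal illustrative sketch, not the authors' released code: the random embedding table and linear head stand in for the pre-trained Arabic word/sentence encoders (e.g. AraVec or AraBERT) that the paper actually evaluates, and mean-pooling stands in for the learned recurrent/transformer aggregation.

```python
import numpy as np

rng = np.random.default_rng(0)

VOCAB, EMB_DIM, NUM_TOPICS = 1000, 64, 52

# Hypothetical stand-ins for pre-trained components; in the paper's
# pipeline these would come from a pre-trained Arabic language model.
word_embeddings = rng.normal(size=(VOCAB, EMB_DIM))
W_cls = rng.normal(size=(EMB_DIM, NUM_TOPICS))  # linear classifier head

def sentence_vector(token_ids):
    """Level 1: pool word representations into one sentence vector."""
    return word_embeddings[token_ids].mean(axis=0)

def question_vector(sentences):
    """Level 2: pool sentence vectors into one question representation."""
    return np.mean([sentence_vector(s) for s in sentences], axis=0)

def classify(sentences):
    """Softmax distribution over the 52 jurisprudential topic categories."""
    logits = question_vector(sentences) @ W_cls
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

# A toy "question" of two tokenized sentences (integer word ids).
probs = classify([[1, 5, 9], [12, 7]])
print(probs.shape)  # (52,) -- one probability per topic category
```

In the actual model, each pooling stage would be a trainable encoder and the whole stack would be fine-tuned on the labeled question dataset; the sketch only shows how the hierarchy composes.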
Keywords
Islamic Fatwa; Natural Language Processing; Text Classification; Question Answering; Recurrent Neural Networks; Transformers;
Citations & Related Records
References
1 M. Djandji, F. Baly, H. Hajj, et al., "Multi-Task Learning using AraBert for Offensive Language Detection," in Proceedings of the 4th Workshop on Open-Source Arabic Corpora and Processing Tools, with a Shared Task on Offensive Language Detection, 2020, pp. 97-101.
2 A. M. Abu Nada, E. Alajrami, A. A. Al-Saqqa, and S. S. Abu-Naser, "Arabic Text Summarization Using AraBERT Model Using Extractive Text Summarization Approach," 2020.
3 A. Al Sallab, M. Rashwan, H. Raafat, and A. Rafea, "Automatic Arabic diacritics restoration based on deep nets," in Proceedings of the EMNLP 2014 Workshop on Arabic Natural Language Processing (ANLP), 2014, pp. 65-72.
4 A. Magooda et al., "RDI-Team at SemEval-2016 task 3: RDI unsupervised framework for text ranking," 2016.
5 N. A. P. Rostam and N. H. A. H. Malim, "Text categorisation in Quran and Hadith: Overcoming the interrelation challenges using machine learning and term weighting," J. King Saud Univ. - Comput. Inf. Sci., vol. 33, no. 6, pp. 658-667, Jul. 2019, doi: 10.1016/j.jksuci.2019.03.007.
6 M. E. Peters et al., "Deep contextualized word representations," arXiv preprint arXiv:1802.05365, 2018.
7 D. Bahdanau, K. Cho, and Y. Bengio, "Neural machine translation by jointly learning to align and translate," arXiv preprint arXiv:1409.0473, 2014.
8 J. Chung, C. Gulcehre, K. Cho, and Y. Bengio, "Gated feedback recurrent neural networks," arXiv preprint arXiv:1502.02367, 2015.
9 T. B. Brown et al., "Language models are few-shot learners," arXiv preprint arXiv:2005.14165, 2020.
10 S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Comput., vol. 9, no. 8, pp. 1735-1780, 1997.
11 J. Howard and S. Ruder, "Universal language model fine-tuning for text classification," arXiv preprint arXiv:1801.06146, 2018.
12 A. Al-Sallab, R. Baly, H. Hajj, K. B. Shaban, W. El-Hajj, and G. Badaro, "AROMA: A Recursive Deep Learning Model for Opinion Mining in Arabic as a Low Resource Language," ACM Trans. Asian Low-Resour. Lang. Inf. Process., vol. 16, no. 4, 2017.
13 W. Antoun, F. Baly, and H. Hajj, "AraBERT: Transformer-based model for Arabic language understanding," arXiv preprint arXiv:2003.00104, 2020.
14 J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, "BERT: Pre-training of deep bidirectional transformers for language understanding," arXiv preprint arXiv:1810.04805, 2018.
15 A. Vaswani et al., "Attention is all you need," in Advances in Neural Information Processing Systems (NIPS), 2017.
16 A. B. Soliman, K. Eissa, and S. R. El-Beltagy, "AraVec: A set of Arabic word embedding models for use in Arabic NLP," Procedia Comput. Sci., vol. 117, pp. 256-265, 2017.
17 B. Athiwaratkun, A. G. Wilson, and A. Anandkumar, "Probabilistic FastText for multi-sense word embeddings," arXiv preprint arXiv:1806.02901, 2018.
18 W. Antoun, F. Baly, and H. Hajj, "AraGPT2: Pre-Trained Transformer for Arabic Language Generation," arXiv preprint arXiv:2012.15520, Dec. 2020. [Online]. Available: http://arxiv.org/abs/2012.15520.