• Title/Summary/Keyword: Arabic

Application of Topic Modeling Techniques in Arabic Content: A Systematic Review

  • Maram Alhmiyani;Huda Alhazmi
    • International Journal of Computer Science & Network Security
    • /
    • v.23 no.6
    • /
    • pp.1-12
    • /
    • 2023
  • With the rapid increase of user-generated data on digital platforms, categorizing and classifying this huge volume of data has become difficult. Topic modeling is an unsupervised machine learning technique that can be used to summarize a large collection of documents. Topic modeling has been widely used on English content, yet its application to the Arabic language is limited. Therefore, the aim of this paper is to provide a systematic review of the application of topic modeling algorithms to Arabic content. Searching well-known and trusted databases, including ScienceDirect, IEEE Xplore, Springer Link, and Google Scholar, for publications dated from 2012 to 2022, we found 60 papers. After refining the papers based on predefined criteria, 32 papers remained. Our results show that the application of topic modeling techniques to Arabic content is limited.

Recurrent Neural Network with Backpropagation Through Time Learning Algorithm for Arabic Phoneme Recognition

  • Ismail, Saliza;Ahmad, Abdul Manan
    • Institute of Control, Robotics and Systems: Conference Proceedings
    • /
    • 2004.08a
    • /
    • pp.1033-1036
    • /
    • 2004
  • Speech recognition and understanding have been studied for many years. In this paper, we propose a new type of recurrent neural network architecture for speech recognition, in which each output unit is connected to itself and is also fully connected to the other output units and all hidden units [1]. We also propose a learning algorithm for this recurrent neural network, Backpropagation Through Time (BPTT), which is well suited to it. The aim of the study was to observe the differences between the letters of the Arabic alphabet, from "alif" to "ya". The purpose of this research is to improve people's knowledge and understanding of the Arabic alphabet and Arabic words by using a Recurrent Neural Network (RNN) with the Backpropagation Through Time (BPTT) learning algorithm. Speech from four speakers (a mixture of male and female), recorded in a quiet environment, is used for training. Neural networks are well known for their ability to classify nonlinear problems. Much research has applied neural networks to speech recognition [2], including for Arabic. The Arabic language offers a number of challenges for speech recognition [3]. Even though positive results have been obtained from continued study, research on minimizing the error rate is still attracting a lot of attention. This research uses a Recurrent Neural Network, one type of neural network technique, to observe the differences between the letters "alif" to "ya".

Arabic Text Clustering Methods and Suggested Solutions for Theme-Based Quran Clustering: Analysis of Literature

  • Bsoul, Qusay;Abdul Salam, Rosalina;Atwan, Jaffar;Jawarneh, Malik
    • Journal of Information Science Theory and Practice
    • /
    • v.9 no.4
    • /
    • pp.15-34
    • /
    • 2021
  • Text clustering is one of the most commonly used methods for detecting themes or types of documents. Text clustering is used in many fields, but it is still not effective enough for understanding Arabic text, especially with respect to term extraction, unsupervised feature selection, and clustering algorithms. In most cases, term extraction focuses on nouns. Clustering simplifies the understanding of an Arabic text like the text of the Quran; it is important not only for Muslims but for all people who want to know more about Islam. This paper discusses the complexity and limitations of clustering the Arabic text of the Quran by theme. Unsupervised feature selection does not consider the relationships between the selected features. One weakness of clustering algorithms is that the selection of the optimal initial centroid still depends on chance and manual settings. Consequently, this paper reviews literature about the three major stages of Arabic clustering: term extraction, unsupervised feature selection, and clustering. Six experiments were conducted to demonstrate previously undiscussed problems related to the metrics used for feature selection and clustering. Suggestions to improve clustering of the Quran based on themes are presented and discussed.

Building Hybrid Stop-Words Technique with Normalization for Pre-Processing Arabic Text

  • Atwan, Jaffar
    • International Journal of Computer Science & Network Security
    • /
    • v.22 no.7
    • /
    • pp.65-74
    • /
    • 2022
  • In natural language processing, commonly used words such as prepositions are referred to as stop-words; they have no inherent meaning and are therefore ignored in indexing and retrieval tasks. The removal of stop-words from Arabic text has a significant impact in terms of reducing the size of a corpus, which leads to an improvement in the effectiveness and performance of Arabic-language processing systems. This study investigated the effectiveness of applying stop-word list elimination with normalization as a preprocessing step. The idea was to merge a statistical method with a linguistic method to attain the best efficacy, and to compare the effects of this two-pronged approach in reducing corpus size for Arabic natural language processing systems. Three stop-word lists were considered: an Arabic Text Lookup Stop-list, a Frequency-based Stop-list using Zipf's law, and a Combined Stop-list. An experiment was conducted using a selected file from the Arabic Newswire data set. In the experiment, the size of the corpus was compared after removing the words contained in each list. The results showed that the best reduction in size was achieved by using the Combined Stop-list with normalization, with a word count reduction of 452930 and a compression rate of 30%.
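
The combined approach described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the author's implementation: the normalization rules (stripping diacritics and tatweel, unifying alef variants), the `top_k` cut-off used for the frequency-based list, and all function names are assumptions, since the abstract does not specify them.

```python
import re
from collections import Counter

# Assumed normalization rules (not given in the abstract):
# remove diacritics (U+064B..U+0652) and tatweel (U+0640),
# and map alef variants (U+0622, U+0623, U+0625) to bare alef (U+0627).
_DIACRITICS = re.compile("[\u064B-\u0652\u0640]")
_ALEF_VARIANTS = re.compile("[\u0622\u0623\u0625]")


def normalize(token):
    """Apply the assumed normalization to a single token."""
    token = _DIACRITICS.sub("", token)
    return _ALEF_VARIANTS.sub("\u0627", token)


def frequency_stoplist(tokens, top_k):
    """Frequency-based list: the top_k most frequent tokens (a Zipf-style cut-off)."""
    return {word for word, _ in Counter(tokens).most_common(top_k)}


def remove_stopwords(tokens, lookup_list, top_k):
    """Remove the union of a lookup list and a frequency list after normalization.

    Returns the kept tokens and the fraction of tokens removed.
    """
    normalized = [normalize(t) for t in tokens]
    combined = {normalize(w) for w in lookup_list}
    combined |= frequency_stoplist(normalized, top_k)
    kept = [t for t in normalized if t not in combined]
    reduction = 1 - len(kept) / len(normalized) if normalized else 0.0
    return kept, reduction
```

Normalizing before matching means that spelling variants of the same stop-word are caught by a single list entry, which is one plausible reason a combined list with normalization would shrink a corpus more than either list alone.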

The Impact of Language on Customer Intentions to Use Localized E-Commerce Websites in Arabic Countries: The Mediating Role of Perceived Risk and Trust

  • HERZALLAH, Fadi;AYYASH, Mohannad Moufeed;AHMAD, Kamsuriah
    • The Journal of Asian Finance, Economics and Business
    • /
    • v.9 no.1
    • /
    • pp.273-290
    • /
    • 2022
  • Localization of e-commerce websites is a useful tool for connecting business organizations and money-making enterprises with the world. However, studies on e-Commerce website localization within the language domain are still quite limited. Thus, the study aims to investigate the relationship between the Arabic language and intentions to use e-Commerce websites, to clarify the indirect effects of the Arabic language on these intentions, and to determine whether trust and perceived risk act as mediating variables between the Arabic language and e-Commerce website intentions. Survey data collected from 264 participants were used to test the research framework. The participants were selected based on their experience of using e-Commerce websites. Structural equation modeling (SEM) through partial least squares (PLS) software was used for the data analysis. The results show that the Arabic language, trust, and perceived risk play effective roles in e-Commerce website adoption. More importantly, trust and perceived risk positively mediate the relationship between the Arabic language and intentions to use e-Commerce websites. Implications of the study's findings and suggestions for further research are discussed.

Characteristics of Spray Dried Polysaccharides for Microencapsulation (미세캡슐화를 위한 분무건조 다당류의 특성)

  • Lee, Seung-Cheol;Rhim, Chae-Hwan;Lee, Sang-Chun
    • Korean Journal of Food Science and Technology
    • /
    • v.29 no.6
    • /
    • pp.1322-1326
    • /
    • 1997
  • The viscosity and spray-dried particle characteristics of several polysaccharides were studied to investigate their potential as wall materials for microencapsulation. Viscosities of 10% maltodextrin, 10% gum arabic, 10% dextran, 1% gum locust bean, and 1% gum karaya were 2.2 mPa·s, 9.2 mPa·s, 13.0 mPa·s, 4660.0 mPa·s, and 77.0 mPa·s, respectively. In scanning electron micrographs of the spray-dried polysaccharides, gum arabic had spherical shapes at 20% and 30% emulsion concentration, but trailed shapes at 40%. Maltodextrin had uniform spherical shapes at 30%, but an aggregated form with various capsule sizes at 40%. Dextran had spherical shapes at 20%, but trailed fibrous shapes at over 30%. Mixed polysaccharides with gum arabic:maltodextrin (1:3, w/w) had uniform spherical shapes at 20%, 30%, and 40%, with diameter increasing as concentration increased.

Validity and Reliability of the Fagerstrom Test for Cigarette Dependence in a Sample of Arabic Speaking UK-Resident Yemeni Khat Chewers

  • Kassim, Saba;Salam, Mohamed;Croucher, Ray
    • Asian Pacific Journal of Cancer Prevention
    • /
    • v.13 no.4
    • /
    • pp.1285-1288
    • /
    • 2012
  • Background: The Fagerstrom Test for Cigarette Dependence (FTCD) (formerly FTND) is widely used for measuring physical dependence on nicotine. Objective: To explore the cross-cultural validity and reliability of the FTCD amongst Arabic-speaking cigarette consumers who chew khat leaf, a stimulant green leaf. Methods: The psychometric properties of the FTCD were assessed in a subsample (91 regular cigarette smokers) of 204 purposively selected UK-resident Yemeni khat chewers recruited during random visits to khat sale outlets. Data were collected via a structured face-to-face interview. Data analyses included descriptive tests and factor analysis. Results: Two factors were obtained by a principal axis factor analysis; these were termed urgency of restoring the level of nicotine after abstinence during sleeping, and maintaining the level of nicotine during waking. The internal reliability (Cronbach's alpha coefficient) was low for the whole FTCD (.68) as well as for the two subscales (.60 and .62, respectively). Conclusion: The psychometric properties of the Arabic version of the FTCD scale in this sample of Yemeni khat chewers who smoked regularly confirmed what has been established in other cultural settings. The findings of this study have yet to be cross-validated amongst other appropriately representative samples of Arabic speakers.
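
For readers unfamiliar with the reliability statistic reported here, Cronbach's alpha for a scale with k items is k/(k-1) × (1 − sum of item variances / variance of total scores). The following is a small sketch of that standard formula; the function name and the input layout (one list of item scores per respondent) are illustrative assumptions, and it does not reproduce the study's data or analysis.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Uses sample variances (n - 1 denominator) for both items and totals.
    """
    k = len(scores[0])
    # Variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)
```

Values closer to 1 indicate that the items vary together; the function needs at least two items and two respondents, and the total scores must not all be identical.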

The Effect of Arabic Gum on the Copper Electrodeposition using Titanium Substrate (티타늄 기지을 이용한 구리 전해도금 시 Arabic Gum 첨가제의 영향)

  • Woo, Tae-Gyu;Park, Il-Song;Lee, Hyun-Woo;Seol, Kyeong-Won
    • Korean Journal of Materials Research
    • /
    • v.16 no.12
    • /
    • pp.725-730
    • /
    • 2006
  • The purpose of this study is to identify the effect of additives during copper electrodeposition. Additives such as arabic gum, chloride ions and glue were used in this study. Electrochemical experiments allied to SEM and roughness examination were performed to characterize the copper foil in the presence of additives. In the production of electrodeposited copper foil, the surface roughness and grain size of the copper foil can be controlled by adding additives. In this study, the more uniform and hemispherical the copper crystals are during the initial stages, the smaller the crystal size and surface roughness of the copper foil. The surface roughness of copper foil electrodeposited at a current density of 500 mA/cm² under galvanostatic mode for 60 seconds reaches a minimum value of 0.136 µm when 2 ppm of arabic gum is added.

Survey of Automatic Query Expansion for Arabic Text Retrieval

  • Farhan, Yasir Hadi;Noah, Shahrul Azman Mohd;Mohd, Masnizah
    • Journal of Information Science Theory and Practice
    • /
    • v.8 no.4
    • /
    • pp.67-86
    • /
    • 2020
  • Information need has been one of the main motivations for a person using a search engine. Queries can represent very different information needs. Ironically, a query can be a poor representation of the information need because the user can find it difficult to express that need. Query Expansion (QE) is widely used to address this limitation. While QE can be considered a language-independent technique, recent findings have shown that in certain cases, language plays an important role. Arabic is a language with a particularly large vocabulary, rich in words with synonymous shades of meaning, and has high morphological complexity. This paper, therefore, provides a review of QE for Arabic information retrieval, with the intention of identifying the recent state of the art in this burgeoning area. In this review, we primarily discuss statistical QE approaches, which include document analysis, search and browse log analyses, and web knowledge analyses, in addition to semantic QE approaches, which use semantic knowledge structures to extract meaningful word relationships. Finally, our conclusion is that QE for the Arabic language requires additional investigation and research due to the intricate nature of this language.

Writing Korean Numerals in Technical Writing (기술문에서 우리말 숫자 쓰기)

  • Kwon, Sung-Gyu
    • Journal of Engineering Education Research
    • /
    • v.14 no.2
    • /
    • pp.30-39
    • /
    • 2011
  • When Arabic numerals are read in the Korean language, some spoken words are not consistent with the written words. Since some rules for reading Arabic numerals are not clear, the numerals should be read carefully by recognizing the position of the numbers in the sentence, the relationship of the numerals with measurement nouns and other writing elements, and the context. From the viewpoint of technical writing, this work examines some rules for writing Korean numerals in place of Arabic numerals by studying works on numerals, classifiers and measurement nouns.