• Title/Summary/Keyword: Arabic


Identifying Mobile Owner based on Authorship Attribution using WhatsApp Conversation

  • Almezaini, Badr Mohammd;Khan, Muhammad Asif
    • International Journal of Computer Science & Network Security / v.21 no.7 / pp.317-323 / 2021
  • Social media is increasingly part of daily life as a means of communicating with each other. Many tools and applications are available for communication, and identity theft is therefore a common problem among their users. A new form of identity theft occurs when cybercriminals break into a WhatsApp account, pose as real friends, and demand money or resort to emotional blackmail. To prevent such incidents, data mining can be used for text classification (TC) in authorship attribution (AA) analysis to recognize the original sender of a message. Arabic is one of the most widely spoken languages in the world and has many variants. In this research, we built a machine learning model that mines and analyzes Arabic messages to identify the author of messages written in the Saudi dialect. Several aspects of authorship attribution mining and analysis are addressed, including the collection of Arabic messages in the Saudi dialect and the filtering of message tokens. Classification uses cross-validation and different machine-learning algorithms (Naïve Bayes, Support Vector Machine). The average accuracy of Naïve Bayes and Support Vector Machine is reported, along with suggestions for future work.

Sentiment Analysis of COVID-19 Tweets: Impact of Pre-processing Step

  • Ayadi, Rami;Shahin, Osama R.;Ghorbel, Osama;Alanazi, Rayan;Saidi, Anouar
    • International Journal of Computer Science & Network Security / v.21 no.3 / pp.206-211 / 2021
  • Internet users are increasingly invited to express their opinions on various subjects in social networks, e-commerce sites, news sites, forums, etc. Much of this information, which describes feelings, has become the subject of study in several areas of research, such as "Sensing opinions and analyzing feelings". This is the process of identifying the polarity of the feelings expressed in the opinions found in Internet users' interactions on the web and classifying them as positive, negative, or neutral. In this article, we propose a sentiment analysis tool that detects the polarity of people's opinions about COVID-19, extracted from social media (Twitter) in the Arabic language, and we examine the impact of the pre-processing phase on opinion classification. The results reveal gaps in this area of research: first, the lack of resources for data collection; second, the greater complexity of pre-processing Arabic, especially its dialects. Ultimately, however, the results obtained are promising.

Deep Learning Based Rumor Detection for Arabic Micro-Text

  • Alharbi, Shada;Alyoubi, Khaled;Alotaibi, Fahd
    • International Journal of Computer Science & Network Security / v.21 no.11 / pp.73-80 / 2021
  • Microblogs have become the most popular platforms for obtaining and spreading information, and Twitter is one of the most widely used platforms for sharing everyday life events. However, rumors and misinformation have become pervasive on Arabic social media platforms and can cause inestimable harm to society. It is therefore imperative to study this issue in order to distinguish verified information from unverified information. Interest in rumor detection on microblogs has grown recently, but it has mostly been applied to English, while work on Arabic remains an ongoing research topic that needs more effort. In this paper, we propose a combined Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) model to detect rumors in a Twitter dataset. Various experiments were conducted to tune the hyper-parameters and achieve the best results. Moreover, different neural network models were used to evaluate performance and compare results. Experiments show that the CNN-LSTM model achieved the best accuracy, 0.95, and an F1-score of 0.94, outperforming the state-of-the-art methods.

Global History: Understanding Islamic Astronomy

  • LOHLKER, RUDIGER
    • Acta Via Serica / v.4 no.2 / pp.97-118 / 2019
  • This study presents a new conceptualization of the history of Islamic astronomy. Islamic history is an embedded global cultural phenomenon and will be analyzed at different levels: a) the history of institutional aspects (observatories, including buildings), b) instruments, c) manuscripts, and d) scholars. This phenomenon will be analyzed as a multi-lingual phenomenon with Arabic as the language of sciences as a starting point. Although this is not a study of a geographical region in a narrow sense, it is a historical note on the entanglement of research written in Arabic, Persian and other languages and contextualized in a framework reaching geographically far beyond the confines of the Islamic world and being part of global history.

The Use of MSVM and HMM for Sentence Alignment

  • Fattah, Mohamed Abdel
    • Journal of Information Processing Systems / v.8 no.2 / pp.301-314 / 2012
  • In this paper, two new approaches to aligning English-Arabic sentences in bilingual parallel corpora are presented, based on the Multi-Class Support Vector Machine (MSVM) and the Hidden Markov Model (HMM) classifiers. A feature vector is extracted from the text pair under consideration. This vector contains text features such as length, punctuation score, and cognate score values. One set of manually prepared data was used to train the Multi-Class Support Vector Machine and Hidden Markov Model, and another set was used for testing. The MSVM and HMM outperform the length-based approach. Moreover, these new approaches are valid for any language pair and are quite flexible, since the feature vector may contain fewer, more, or different features than the ones used in the current research, such as a lexical matching feature or Hanzi characters in Japanese-Chinese texts.

Dissolution Characteristics of Indomethacin Microcapsules Prepared Using Gelatin-Gum Arabic Complex Coacervation (젤라틴-아리비아고무를 써서 製造한 인도메타신 마이크로캅셀의 용출 특성)

  • Ku, Young-Soon;Kim, Hwa-Yeon
    • YAKHAK HOEJI / v.28 no.4 / pp.223-229 / 1984
  • Microcapsules of indomethacin were prepared by the complex coacervation technique using gelatin-gum arabic as the wall-forming material. The effects of the drug-to-matrix ratio, the formalization time, and added hydroxypropyl cellulose (HPC) on the release of drug from the microcapsules were studied. As the amount of wall-forming material increased, the drug content in the microcapsules decreased and the release of drug from the microcapsules was retarded. The drug content was lower in the HPC-added microcapsules than in the microcapsules without HPC, and the microcapsules with a 1:4 drug-to-matrix ratio showed the slowest release. The release of the drug from microcapsules with a 1:2 drug-to-matrix ratio was increasingly delayed as the formalization time increased, and the microcapsules formalized for 24 hr showed the greatest retardation.


Grapheme-to-Phoneme Conversion of Arabic Numeral Expressions for Embedded TTS Systems (임베디드 TTS 시스템을 위한 아라비안 숫자의 문자 변환)

  • Jung, Young-Im;Yoon, Ae-Sun;Kwon, Hyuk-Chul
    • Proceedings of the Korean Information Science Society Conference / 2005.07b / pp.442-444 / 2005
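  • This paper proposes a lightweight Arabic numeral reading system for embedded systems that effectively resolves the ambiguity of Arabic numerals and accurately transcribes the pronunciation of numeral expressions. To this end, we classified seven ways of reading numerals (Readings of Arabic Numerals, RAN) and, to establish transcription rules, analyzed (1) contextual features, (2) pattern features, and (3) heuristic information according to the meaning of the numeral expressions. To optimize the numeral transcription system for deployment on embedded systems, we (1) separated the morphological analysis module, (2) compressed the dictionary, and (3) removed personal and place names, which greatly reduced memory usage and processing time without a serious loss of accuracy. The lightweight mini-TAN achieves an accuracy of 96.9% to 98.3% and reads numerals more accurately than existing commercial TTS systems.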


A Style-based Approach to Translating Literary Texts from Arabic into English

  • Almanna, Ali
    • Cross-Cultural Studies / v.32 / pp.5-28 / 2013
  • In this paper, a style-based approach to translating literary texts is introduced and used. The aim of the study is to work out a stylistic approach to translating literary texts from Arabic into English. The approach proposed in the current study is a combination of four major stylistic approaches, namely linguistic stylistics, literary stylistics, affective stylistics and cognitive stylistics. It has been shown from data analysis that by adopting a style-based approach that can draw from the four stylistic approaches, translators, as special text readers, can easily derive a better understanding and appreciation of texts, in particular literary texts. Further, it has been shown that stylistics as an approach is objective in terms of drawing evidence from the text to support the argument for the important stylistic features and their functions. However, it loses some of its objectivity and becomes dependent and subjective.

Information Retrieval Systems: Between Morphological Analyzers and Stemming Algorithms

  • Mohamed, Afaf Abdel Rhman;Ouni, Chafika;Eljack, Sarah Mustafa;Alfayez, Fayez
    • International Journal of Computer Science & Network Security / v.22 no.3 / pp.375-381 / 2022
  • The main objective of an Information Retrieval System (IRS) is to obtain suitable information within a reasonable time to satisfy a user's need. To achieve this, an IRS should have a good indexing system based on natural language processing. In this context, we focus on the Arabic language processing techniques available for an IRS, with the goal of improving its performance. Our contribution consists of integrating morphological analysis into an IRS in order to compare the impact of morphological analysis with that of stemming algorithms.