• Title/Summary/Keyword: sequence matcher

Search Result 2, Processing Time 0.017 seconds

Development of the Recommender System of Arabic Books Based on the Content Similarity

  • Alotaibi, Shaykhah Hajed;Khan, Muhammad Badruddin
    • International Journal of Computer Science & Network Security
    • /
    • v.22 no.8
    • /
    • pp.175-186
    • /
    • 2022
  • This research article develops an Arabic books' recommendation system, which is based on the content similarity that assists users to search for the right book and predict the appropriate and suitable books pertaining to their literary style. In fact, the system directs its users toward books, which can meet their needs from a large dataset of Information. Further, this system makes its predictions based on a set of data that is gathered from different books and converts it to vectors by using the TF-IDF system. After that, the recommendation algorithms such as the cosine similarity, the sequence matcher similarity, and the semantic similarity aggregate data to produce an efficient and effective recommendation. This approach is advantageous in recommending previously unrated books to users with unique interests. It is found to be proven from the obtained results that the results of the cosine similarity of the full content of books, the results of the sequence matcher similarity of Arabic titles of the books, and the results of the semantic similarity of English titles of the books are the best obtained results, and extremely close to the average of the result related to the human assigned/annotated similarity. Flask web application is developed with a simple interface to show the recommended Arabic books by using cosine similarity, sequence matcher similarity, and semantic similarity algorithms with all experiments that are conducted.

A Memory-Efficient Two-Stage String Matching Engine Using both Content-Addressable Memory and Bit-split String Matchers for Deep Packet Inspection (CAM과 비트 분리 문자열 매처를 이용한 DPI를 위한 2단의 문자열 매칭 엔진의 개발)

  • Kim, HyunJin;Choi, Kang-Il
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.39B no.7
    • /
    • pp.433-439
    • /
    • 2014
  • This paper proposes an architecture of two-stage string matching engine with content-addressable memory(CAM) and parallel bit-split string matchers for deep packet inspection(DPI). Each long signature is divided into subpatterns with the same length, where subpatterns are mapped onto the CAM in the first stage. The long pattern is matched in the second stage using the sequence of the matching indexes from the CAM. By adopting CAM and bit-split string matchers, the memory requirements can be greatly reduced in the heterogeneous string matching environments.