• Title/Summary/Keyword: content similarity


A METHOD OF IMAGE DATA RETRIEVAL BASED ON SELF-ORGANIZING MAPS

  • Lee, Mal-Rey;Oh, Jong-Chul
    • Journal of applied mathematics & informatics
    • /
    • v.9 no.2
    • /
    • pp.793-806
    • /
    • 2002
  • Feature-based similarity retrieval has become an important research issue in image database systems. The features of image data are useful for discriminating between images. In this paper, we propose a high-speed k-Nearest Neighbor search algorithm based on Self-Organizing Maps. A Self-Organizing Map (SOM) provides a mapping from high-dimensional feature vectors onto a two-dimensional space. The mapping preserves the topology of the feature vectors, and the resulting map is called a topological feature map. A topological feature map preserves the mutual relations (similarity) in the feature space of the input data and clusters mutually similar feature vectors in neighboring nodes. Each node of the topological feature map holds a node vector and the similar images that are closest to that node vector. In a topological feature map, there are empty nodes in which no image is classified. We evaluate the performance of our algorithm using color feature vectors extracted from images. Promising results have been obtained in experiments.
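The search idea described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: it assumes the SOM has already been trained, and the function names `assign_to_nodes` and `knn_search`, the toy node vectors, and the rule of visiting nodes from nearest to farthest are all hypothetical.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def assign_to_nodes(node_vectors, images):
    """Put each image id in the bucket of its closest node vector.

    Nodes that attract no image keep an empty bucket.
    """
    buckets = {i: [] for i in range(len(node_vectors))}
    for image_id, feature in images.items():
        best = min(range(len(node_vectors)),
                   key=lambda i: dist(node_vectors[i], feature))
        buckets[best].append(image_id)
    return buckets

def knn_search(query, k, node_vectors, buckets, images):
    """Approximate k-NN: visit nodes from nearest to farthest node vector
    and stop once at least k candidate images have been collected."""
    order = sorted(range(len(node_vectors)),
                   key=lambda i: dist(node_vectors[i], query))
    candidates = []
    for node in order:
        candidates.extend(buckets[node])  # empty nodes contribute nothing
        if len(candidates) >= k:
            break
    candidates.sort(key=lambda image_id: dist(images[image_id], query))
    return candidates[:k]

# Toy data (assumed): four node vectors and five 2-D "color" features.
node_vectors = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
images = {
    "a": (0.1, 0.1), "b": (0.2, 0.0), "c": (0.9, 0.1),
    "d": (1.0, 0.2), "e": (0.1, 0.2),
}
buckets = assign_to_nodes(node_vectors, images)
result = knn_search((0.0, 0.1), 2, node_vectors, buckets, images)
print(result)
```

Because only the buckets of the closest nodes are scanned, the search avoids comparing the query against every image; the trade-off is that the result is approximate when a true neighbor sits in a node that was not visited.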

A Method for Measuring Similarity Measure of Thesaurus Transformation Documents using DBSCAN (DBSCAN을 활용한 유의어 변환 문서 유사도 측정 방법)

  • Kim, Byeongsik;Shin, Juhyun
    • Journal of Korea Multimedia Society
    • /
    • v.21 no.9
    • /
    • pp.1035-1043
    • /
    • 2018
  • There are cases where the core content of another person's work is presented as one's own thinking by rewording it without citing the source. The free CopyKiller service used for plagiarism checking tests for plagiarism by looking for matches of six or more words. However, a six-word match is not enough to identify plagiarism when words have been replaced with similar words. Therefore, in this paper, we construct word clusters using the DBSCAN algorithm, find synonyms, convert the words in each cluster into representative synonyms, and construct L-R tables through L-R parsing. We then propose a method for determining the similarity of documents by applying weights to the thesaurus and weights for each paragraph of the thesis.

A New Collaborative Filtering Method for Movie Recommendation Using Genre Interest (영화 추천을 위한 장르 흥미도를 이용한 새로운 협력 필터링 방식)

  • Lee, Soojung
    • Journal of Digital Convergence
    • /
    • v.12 no.8
    • /
    • pp.329-335
    • /
    • 2014
  • Collaborative filtering has been popular in commercial recommender systems, as it successfully models the social behavior of customers by suggesting items that might fit the interests of a user. So far, the most common method of finding suitable items for recommendation is to search for similar users and consult their ratings. This paper suggests a new similarity measure for movie recommendation that is based on genre interest, instead of the differences between ratings made by two users as in previous similarity measures. In extensive experiments, the proposed measure is shown to perform significantly better than classic similarity measures in terms of both prediction and recommendation quality.

A Re-Ranking Retrieval Model based on Two-Level Similarity Relation Matrices (2단계 유사관계 행렬을 기반으로 한 순위 재조정 검색 모델)

  • 이기영;은희주;김용성
    • Journal of KIISE:Software and Applications
    • /
    • v.31 no.11
    • /
    • pp.1519-1533
    • /
    • 2004
  • When Web-based specialized retrieval systems for scientific fields severely restrict how users can express their information requests, the process of information content analysis and that of information acquisition become inconsistent. In this paper, we apply the fuzzy retrieval model and address the high time complexity of the retrieval system by constructing a reduced term set based on the relative importance of terms. Furthermore, we perform cluster retrieval to reflect the user's query accurately through a similarity relation matrix that satisfies the characteristics of the fuzzy compatibility relation. We demonstrate the performance of the proposed re-ranking model, which is based on the similarity union of the fuzzy retrieval model and the document cluster retrieval model.

Similarity Measure and Clustering Technique for XML Documents by a Parent-Child Matrix (부모-자식 행렬을 사용한 XML 문서 유사도 측정과 군집 기법)

  • Lee, Yun-Gu;Kim, Woosaeng
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.19 no.7
    • /
    • pp.1599-1607
    • /
    • 2015
  • Recently, researchers have been developing efficient techniques for accessing, querying, and managing XML documents, which are frequently used on the Internet. In this paper, we propose a parent-child matrix to cluster XML documents efficiently. A parent-child matrix captures both the content and the structural features of an XML document. Each cell of a parent-child matrix holds either the value of a node in the XML tree or the value of a child node where a parent-child relationship exists in the XML tree. The similarity between two XML documents can then be measured by the similarity between their corresponding parent-child matrices. The experiment shows that the proposed method performs well.
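A much simplified sketch of comparing XML documents by their parent-child relationships is given below. It is not the paper's matrix construction: it looks only at structure (tag names, not node values), it represents the relationships as a set of (parent tag, child tag) pairs rather than a matrix, and the Jaccard overlap used as the similarity is an assumption made for illustration. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET

def parent_child_pairs(xml_text):
    """Collect the set of (parent tag, child tag) pairs in an XML document."""
    root = ET.fromstring(xml_text)
    pairs = set()
    for parent in root.iter():      # every element, including the root
        for child in parent:        # direct children only
            pairs.add((parent.tag, child.tag))
    return pairs

def structural_similarity(xml_a, xml_b):
    """Jaccard overlap of the two documents' parent-child pair sets."""
    a, b = parent_child_pairs(xml_a), parent_child_pairs(xml_b)
    if not a and not b:
        return 1.0  # two childless documents are treated as identical
    return len(a & b) / len(a | b)

doc1 = "<book><title>A</title><author><name>X</name></author></book>"
doc2 = "<book><title>B</title><author><email>y</email></author></book>"
score = structural_similarity(doc1, doc2)
print(score)
```

The two toy documents share the `book/title` and `book/author` relationships and differ in what sits under `author`, so two of the four distinct pairs overlap.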

A Comparative Analysis of Music Similarity Measures in Music Information Retrieval Systems

  • Gurjar, Kuldeep;Moon, Yang-Sae
    • Journal of Information Processing Systems
    • /
    • v.14 no.1
    • /
    • pp.32-55
    • /
    • 2018
  • The digitization of music has brought a considerable increase in audience size, from a few localized listeners to a wide range of global listeners. At the same time, digitization brings the challenge of smoothly retrieving music from large databases. To deal with this challenge, many systems that support the smooth retrieval of musical data have been developed. At the computational level, a query music piece is compared with the rest of the music pieces in the database. These systems, called music information retrieval (MIR) systems, serve various applications such as general music retrieval, plagiarism detection, music recommendation, and musicology. This paper mainly addresses two parts of the MIR research area. First, it presents a general overview of MIR, covering the history, functionality, application areas, and components of MIR. Second, it investigates music similarity measurement methods and provides a comparative analysis of state-of-the-art methods. The paper focuses on a comparative analysis of the accuracy and efficiency of a few key MIR systems. These analyses help in understanding the current and future challenges associated with MIR systems and music similarity measures.

Transitive Similarity Evaluation Model for Improving Sparsity in Collaborative Filtering (협업필터링의 희박 행렬 문제를 위한 이행적 유사도 평가 모델)

  • Bae, Eun-Young;Yu, Seok-Jong
    • The Journal of Korean Institute of Information Technology
    • /
    • v.16 no.12
    • /
    • pp.109-114
    • /
    • 2018
  • Collaborative filtering has been widely used in recommender systems as a typical algorithm with outstanding performance. Because it structurally depends on item rating history, the sparser the rating matrix is, the lower its recommendation accuracy becomes, and sometimes it is totally useless. A variety of hybrid approaches have tried to combine collaborative filtering with content-based methods to mitigate the sparsity issue in the rating matrix. In this study, a new method is suggested for the same purpose but from a different perspective: it deals with the no-match situation in person-to-person similarity evaluation. The method is called the transitive similarity model because it is based on a relation graph of people, and its recommendation accuracy is compared by applying it to the MovieLens open dataset.
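One way to picture the no-match situation and a transitive fallback is sketched below. The abstract does not give the model's formula, so everything here is an assumption for illustration: cosine similarity over co-rated items as the direct measure, the product of the two hop similarities as the path score, taking the best two-hop path, and the toy ratings and names.

```python
import math

def direct_similarity(ratings, u, v):
    """Cosine similarity over co-rated items; None when nothing is co-rated."""
    common = set(ratings[u]) & set(ratings[v])
    if not common:
        return None  # the "no-match" situation
    dot = sum(ratings[u][i] * ratings[v][i] for i in common)
    norm_u = math.sqrt(sum(ratings[u][i] ** 2 for i in common))
    norm_v = math.sqrt(sum(ratings[v][i] ** 2 for i in common))
    return dot / (norm_u * norm_v)

def transitive_similarity(ratings, u, v):
    """Use the direct similarity when it exists; otherwise fall back to the
    best two-hop path u -> w -> v (assumed rule: product of hop similarities)."""
    direct = direct_similarity(ratings, u, v)
    if direct is not None:
        return direct
    best = None
    for w in ratings:
        if w in (u, v):
            continue
        s_uw = direct_similarity(ratings, u, w)
        s_wv = direct_similarity(ratings, w, v)
        if s_uw is None or s_wv is None:
            continue
        path = s_uw * s_wv
        if best is None or path > best:
            best = path
    return best  # still None if no intermediate user connects u and v

# Hypothetical ratings: ann and cat share no movie, but both overlap with bob.
ratings = {
    "ann": {"m1": 5, "m2": 3},
    "bob": {"m2": 3, "m3": 4},
    "cat": {"m3": 4, "m4": 2},
}
print(direct_similarity(ratings, "ann", "cat"))
print(transitive_similarity(ratings, "ann", "cat"))
```

In a sparse rating matrix many user pairs return `None` from the direct measure; the fallback gives those pairs a usable similarity whenever some intermediate user links them.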

Genetic Relationships of the Two Morphological Types of Myzus persicae (Homoptera: Aphididae) Collected from Tobacco Plants Based on Random Amplified Polymorphic DNA (RAPD) (연초에서 발생하는 복숭아혹진딧물(Myzus persicae)형태형 2종의 Random Amplified Polymorphic DNA(RAPD)을 이용한 유전적 유연관계 분석)

  • 채순용;이기원;김상석;장영덕
    • Korean journal of applied entomology
    • /
    • v.37 no.1
    • /
    • pp.31-37
    • /
    • 1998
  • Random amplified polymorphic DNA (RAPD) markers were used to analyze genetic similarity among 8 clones of the apterous green peach aphid collected from tobacco plants, comprising two types (M. persicae Sulzer and M. nicotianae Blackman) classified by their morphological characters and host preference (Blackman, 1987). The genetic variation among these clones was evaluated by polymerase chain reaction amplification with 20 random primers. In general, the higher the GC content of a primer, the better its PCR amplification efficiency. The genetic similarities among the eight aphid clones were analyzed by UPGMA (unweighted pair group method with arithmetic mean) cluster analysis based on the simple matching coefficient. The genetic similarity coefficients ranged from 0.414 to 0.808. The closest relationship among the clones was between the PG2 and PG3 clones, with a similarity coefficient of 0.808. The eight aphid clones were clustered into three groups by the genetic similarity coefficient. The first group, consisting of the PG1, PG2, and PG3 clones of the M. persicae type (by morphological characters) and the RED clone of the M. nicotianae type, clustered at a genetic similarity coefficient of 0.643. The second group, consisting of GR1, GR2, and BRN of the M. nicotianae type, clustered at 0.636, and the third group was the DBR clone of the M. persicae type. The results did not indicate any correlation between morphological types (M. persicae and M. nicotianae) and RAPD polymorphism. We could not detect any obvious genetic relationship between the two morphological types of the green peach aphid collected from tobacco plants.

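The simple matching coefficient mentioned in this abstract is the fraction of marker positions at which two samples agree, counting shared presences and shared absences alike. A minimal sketch follows; the band profiles are made-up illustrations and are not data from the study, and the function name is hypothetical.

```python
def simple_matching(a, b):
    """Share of marker positions where two band profiles agree
    (both present or both absent)."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)

# Hypothetical presence (1) / absence (0) band profiles for two clones.
clone_x = [1, 0, 1, 1, 0, 0, 1, 0]
clone_y = [1, 0, 0, 1, 0, 1, 1, 0]
print(simple_matching(clone_x, clone_y))
```

The two toy profiles agree at six of eight positions. A matrix of such pairwise coefficients is the kind of input a UPGMA clustering step would then work from.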

Semantic Similarity Measures Between Words within a Document using WordNet (워드넷을 이용한 문서내에서 단어 사이의 의미적 유사도 측정)

  • Kang, SeokHoon;Park, JongMin
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.16 no.11
    • /
    • pp.7718-7728
    • /
    • 2015
  • Semantic similarity between words can be applied in many fields, including computational linguistics, artificial intelligence, and information retrieval. In this paper, we present a weighted method for measuring the semantic similarity between words in a document. The method uses the edge distance and depth in WordNet and calculates the semantic similarity between words on the basis of document information. The document information consists of word term frequencies (TF) and word concept frequencies (CF), from which each word's weight in the document is calculated. The method combines the edge distance between words, the depth of the subsumer, and the word weight in the document. We compared our scheme with other methods experimentally, and the proposed method outperforms the other similarity measures. Other methods, which are based simply on shortest distance or depth, had difficulty representing or merging this information. This paper considers shortest distance, depth, and the information of words in the document together, and thereby improves performance.
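To illustrate how the depth of a subsumer can enter a word similarity score, the sketch below uses the well-known Wu-Palmer measure on a tiny hand-made taxonomy. This is a swapped-in example, not the paper's weighted method: it leaves out the TF and CF word weights entirely, and the taxonomy, concept names, and function names are assumptions rather than WordNet data.

```python
# Toy is-a taxonomy (assumed, not WordNet): child -> parent
PARENT = {
    "animal": "entity", "plant": "entity",
    "dog": "animal", "cat": "animal", "tree": "plant",
}

def path_to_root(concept):
    """Return the chain from a concept up to the root, concept first."""
    path = [concept]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path

def depth(concept):
    """Depth counted in nodes, so the root has depth 1."""
    return len(path_to_root(concept))

def subsumer(a, b):
    """Deepest concept that is an ancestor of (or equal to) both a and b."""
    ancestors_a = set(path_to_root(a))
    for concept in path_to_root(b):  # walks upward, so the first hit is deepest
        if concept in ancestors_a:
            return concept
    raise ValueError("concepts share no root")

def wu_palmer(a, b):
    """Wu-Palmer similarity: 2 * depth(subsumer) / (depth(a) + depth(b))."""
    return 2 * depth(subsumer(a, b)) / (depth(a) + depth(b))

print(wu_palmer("dog", "cat"))   # subsumer is "animal"
print(wu_palmer("dog", "tree"))  # subsumer is the root "entity"
```

Pairs whose common subsumer lies deeper in the hierarchy score higher, which is why `dog` and `cat` come out more similar than `dog` and `tree`.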

Harmonic Mean Weight by Combining Content Based Filtering and Collaborative Filtering in a Recommender System (내용 기반 여과와 협력적 여과의 병합을 통한 추천 시스템에서 조화 평균 가중치)

  • 정경용;류중경;강운구;이정현
    • Journal of KIISE:Software and Applications
    • /
    • v.30 no.3_4
    • /
    • pp.239-250
    • /
    • 2003
  • Recent recommender systems use a method that combines collaborative filtering and content-based filtering in order to solve the sparsity and first-rater problems of collaborative filtering. In this paper, to improve prediction accuracy in a hybrid recommender system, the harmonic mean weight (CBCF_harmonic_mean) is used for calculating the user similarity weight. After setting the threshold to 45 in consideration of the performance of content-based filtering, we apply a significance weight of n/45 to the user similarity weight. To estimate the performance of the proposed method, it is compared with that of the existing combination of collaborative filtering and content-based filtering. The results confirm that the suggested method is effective at improving prediction accuracy while solving the problems of the existing collaborative filtering system.
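The two ingredients named in this abstract, a harmonic mean and a significance weight of n/45, can be sketched as below. The abstract does not say exactly which quantities are averaged or how the weights are combined, so the code keeps them as separate helpers. Reading n as the number of co-rated items, capping the weight at 1 once n reaches the threshold, and all function names are assumptions.

```python
def harmonic_mean(a, b):
    """Harmonic mean of two non-negative weights: 2ab / (a + b)."""
    if a + b == 0:
        return 0.0
    return 2 * a * b / (a + b)

def significance_weight(n, threshold=45):
    """Scale factor n / threshold for pairs with few co-rated items.

    Assumption: pairs with at least `threshold` co-rated items get weight 1.
    """
    return min(n, threshold) / threshold

def devalued_similarity(similarity, n, threshold=45):
    """Shrink a user-user similarity that rests on only n co-rated items."""
    return similarity * significance_weight(n, threshold)

print(significance_weight(9))         # few co-rated items
print(significance_weight(90))        # above the threshold
print(devalued_similarity(0.8, 9))
print(harmonic_mean(0.5, 1.0))
```

The harmonic mean is dominated by the smaller of its two inputs, so a combined weight stays low unless both inputs are high; the significance weight similarly keeps a similarity computed from only a handful of shared ratings from counting as much as one backed by many.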