• Title/Summary/Keyword: Similarity Measurement Method


Improved Collaborative Filtering Using Entropy Weighting

  • Kwon, Hyeong-Joon
    • International Journal of Advanced Culture Technology / v.1 no.2 / pp.1-6 / 2013
  • In this paper, we evaluate the performance of existing similarity metrics and propose a novel method that uses the information entropy of users' preferences to reduce the mean absolute error (MAE) in memory-based collaborative recommender systems. The proposed method adds a similarity of individual inclination to traditional similarity measurement methods. To verify the method, we experiment with various similarity metrics under different conditions, including the amount of data and significance weighting from n/10 to n/60. The results confirm that the proposed method is robust and efficient on sparse data sets when combined with various existing similarity measurement methods and significance weighting.

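The abstract above does not give the formula, so the following is only a minimal sketch of how an entropy term and significance weighting could be combined with a Pearson similarity. The entropy-similarity definition, the function names, and the default `threshold` and `levels` values are assumptions made for illustration, not the paper's method.

```python
import math
from collections import Counter

def rating_entropy(ratings):
    """Shannon entropy (in bits) of the distribution of rating values."""
    counts = Counter(ratings)
    total = len(ratings)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def pearson(u, v):
    """Pearson correlation over co-rated items; u and v map item -> rating.
    Returns (correlation, number of co-rated items)."""
    common = sorted(set(u) & set(v))
    n = len(common)
    if n < 2:
        return 0.0, n
    mu = sum(u[i] for i in common) / n
    mv = sum(v[i] for i in common) / n
    num = sum((u[i] - mu) * (v[i] - mv) for i in common)
    den = math.sqrt(sum((u[i] - mu) ** 2 for i in common)) * \
          math.sqrt(sum((v[i] - mv) ** 2 for i in common))
    return (num / den if den else 0.0), n

def weighted_similarity(u, v, threshold=50, levels=5):
    """Pearson similarity scaled by two factors (both hypothetical here):
    - significance weighting min(n, threshold) / threshold, i.e. n/threshold
      when fewer than `threshold` items are co-rated;
    - an entropy similarity 1 - |H(u) - H(v)| / log2(levels), assumed for
      illustration as one way to compare individual inclinations."""
    base, n = pearson(u, v)
    significance = min(n, threshold) / threshold
    h_u = rating_entropy(list(u.values()))
    h_v = rating_entropy(list(v.values()))
    entropy_sim = 1.0 - abs(h_u - h_v) / math.log2(levels)
    return base * significance * entropy_sim
```

Varying `threshold` between 10 and 60 would correspond to the n/10 to n/60 significance weighting conditions named in the abstract.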

Spectral clustering based on the local similarity measure of shared neighbors

  • Cao, Zongqi;Chen, Hongjia;Wang, Xiang
    • ETRI Journal / v.44 no.5 / pp.769-779 / 2022
  • Spectral clustering has become a typical and efficient clustering method used in a variety of applications. The critical step of spectral clustering is the similarity measurement, which largely determines the performance of the spectral clustering method. In this paper, we propose a novel spectral clustering algorithm based on the local similarity measure of shared neighbors. This similarity measurement exploits the local density information between data points based on the weight of the shared neighbors in a directed k-nearest neighbor graph with only one parameter k, that is, the number of nearest neighbors. Numerical experiments on synthetic and real-world datasets demonstrate that our proposed algorithm outperforms other existing spectral clustering algorithms in terms of the clustering performance measured via the normalized mutual information, clustering accuracy, and F-measure. As an example, the proposed method can provide an improvement of 15.82% in the clustering performance for the Soybean dataset.
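As a rough illustration of a shared-neighbor similarity on a directed k-nearest neighbor graph, the sketch below scores two points by the fraction of nearest neighbors they have in common. This plain count is an assumption; the paper weights the shared neighbors in a way the abstract does not specify, and the function names are hypothetical.

```python
import math

def knn_sets(points, k):
    """Directed k-nearest-neighbor sets (Euclidean), excluding the point itself."""
    result = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.dist(p, points[j]))
        result.append(set(others[:k]))
    return result

def shared_neighbor_similarity(points, k):
    """Symmetric matrix whose (i, j) entry is |kNN(i) & kNN(j)| / k.
    The only parameter is k, the number of nearest neighbors."""
    nbrs = knn_sets(points, k)
    n = len(points)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            sim[i][j] = sim[j][i] = len(nbrs[i] & nbrs[j]) / k
    return sim
```

A matrix of this kind could then serve as the affinity matrix in a standard spectral clustering pipeline.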

Sentence Similarity Measurement Method Using a Set-based POI Data Search (집합 기반 POI 검색을 이용한 문장 유사도 측정 기법)

  • Ko, EunByul;Lee, JongWoo
    • KIISE Transactions on Computing Practices / v.20 no.12 / pp.711-716 / 2014
  • With growing interest in plagiarism detection and intelligent file content search, the demand for measuring the similarity between two sentences is increasing. Sentence similarity measurement has been studied from many directions, including n-gram, edit distance, and LSA, but each of these methods has its own advantages and disadvantages. In this paper, we propose a new sentence similarity measurement method that approaches the problem from a different direction. The proposed method builds on the set-based POI data search, which improves search performance over the existing hard matching method when the data contains inverted, omitted, inserted, or revised characters. Using this method, we are able to measure the similarity between two sentences more accurately and more quickly. We modified the data loading and text search algorithms of the set-based POI data search, and added a word operation algorithm and a similarity measure between two sentences expressed as a percentage. The experimental results show that our sentence similarity measurement method performs better than both n-gram and the set-based POI data search.
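For reference, the n-gram baseline mentioned in the abstract can be sketched as below. This is not the proposed set-based POI method; it is a common character-bigram variant using the Dice coefficient, chosen here as an assumption, with the result expressed as a percentage.

```python
def char_ngrams(text, n=2):
    """Set of character n-grams after lowercasing and removing whitespace."""
    s = "".join(text.lower().split())
    return {s[i:i + n] for i in range(len(s) - n + 1)}

def ngram_similarity_percent(a, b, n=2):
    """Dice coefficient over character n-gram sets, scaled to 0..100."""
    ga, gb = char_ngrams(a, n), char_ngrams(b, n)
    if not ga and not gb:
        return 100.0  # two empty inputs are treated as identical
    return 100.0 * 2 * len(ga & gb) / (len(ga) + len(gb))
```

For example, "night" and "nacht" share one of their four bigrams each ("ht"), which gives a similarity of 25%.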

A Text Similarity Measurement Method Based on Singular Value Decomposition and Semantic Relevance

  • Li, Xu;Yao, Chunlong;Fan, Fenglong;Yu, Xiaoqiang
    • Journal of Information Processing Systems / v.13 no.4 / pp.863-875 / 2017
  • Traditional text similarity measurement methods based on word frequency vectors ignore the semantic relationships between words. Together with the high dimensionality and sparsity of document vectors, this has become an obstacle to text similarity calculation. To address these problems, an improved singular value decomposition is used to reduce the dimensionality and remove noise from the text representation model. The optimal number of singular values is analyzed, and the semantic relevance between words is calculated in the constructed semantic space. An inverted index construction algorithm and definitions of similarity between vectors are proposed to calculate the similarity between two documents at the semantic level. Experimental results on a benchmark corpus demonstrate that the proposed method improves the F-measure.
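The dimensionality-reduction step can be illustrated with a plain truncated SVD, as in latent semantic analysis. The sketch below assumes NumPy and a small term-document count matrix; it does not reproduce the paper's improved SVD, its choice of the number of singular values, or its inverted index, and the function names are hypothetical.

```python
import numpy as np

def lsa_document_vectors(term_doc, rank):
    """Project documents (columns of a term-document matrix) into a
    rank-dimensional semantic space using a truncated SVD."""
    A = np.asarray(term_doc, dtype=float)
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Each row of the result is one document in the reduced space.
    return (np.diag(s[:rank]) @ Vt[:rank]).T

def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0
```

Documents are then compared with `cosine` on their reduced vectors rather than on the sparse, high-dimensional word frequency vectors.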

Transactions Clustering based on Item Similarity (아이템의 유사도를 고려한 트랜잭션 클러스터링)

  • 이상욱;김재련
    • Proceedings of the Korea Intelligent Information System Society Conference / 2002.11a / pp.250-257 / 2002
  • Clustering is a data mining method that consists of discovering interesting data distributions in very large databases. In traditional data clustering, the similarity of a cluster of objects is measured by the pairwise similarity of the objects in that cluster. In view of the nature of clustering transactions, we devise in this paper a novel measurement called item similarity and use it to perform clustering. With this item similarity measurement, we develop an efficient clustering algorithm for target marketing in each group.

Similarity Measurement Using Open-Ball Scheme for 2D Patterns in Comparison with Moment Invariant Method (Open-Ball Scheme을 이용한 2D 패턴의 상대적 닮음 정도 측정의 Moment Invariant Method와의 비교)

  • Kim, Seong-Su
    • The Transactions of the Korean Institute of Electrical Engineers A / v.48 no.1 / pp.76-81 / 1999
  • The degree of relative similarity between 2D patterns is obtained using the Open-Ball Scheme. The Open-Ball Scheme transforms the geometrical information of 3D objects or 2D patterns into features that measure relative similarity for object (pattern) recognition, with invariance to scale, rotation, and translation. The feature of an object is used to obtain the relative similarity, which is mapped into the interval [0, 1] of the real line. For decades, the Moment Invariant Method has been used as one of the excellent methods for pattern classification and object recognition. The Open-Ball Scheme uses the geometrical structure of patterns, while the Moment Invariant Method uses their statistical characteristics. The Open-Ball Scheme is compared with the Moment Invariant Method with respect to how each interprets two-dimensional pattern classification; in particular, the two paradigms are compared by their degree of closeness to human intuitive understanding. Finally, the effectiveness of the proposed Open-Ball Scheme is illustrated through simulations.

An Image Segmentation Method and Similarity Measurement Using fuzzy Algorithm for Object Recognition (물체인식을 위한 영상분할 기법과 퍼지 알고리듬을 이용한 유사도 측정)

  • Kim, Dong-Gi;Lee, Seong-Gyu;Lee, Moon-Wook;Kang, E-Sok
    • Transactions of the Korean Society of Mechanical Engineers A / v.28 no.2 / pp.125-132 / 2004
  • In this paper, we propose a new two-stage segmentation method for effective object recognition that uses a region-growing algorithm and the k-means clustering method. First, an image is segmented into many small regions by the region-growing algorithm. Then the small regions are merged into several regions using the typical k-means clustering method, so that the regions of an object fall into the same region. This paper also establishes a similarity measurement that is useful for object recognition in an image. Similarity is measured by a fuzzy system whose input variables are the compactness, magnitude of biasness, and orientation of biasness of the object image, which are geometrical features of the object. To verify the effectiveness of the proposed two-stage segmentation method and similarity measurement, object recognition experiments were performed, and the results show that both are applicable to object recognition under normal as well as abnormal circumstances.

Similarity Measurement of Part Specifications based on Ontology and ELECTRE IS (온톨로지와 ELECTRE IS을 활용한 사양 기반 부품 유사도 측정 방법)

  • Mun, Du-Hwan;Hwang, Ho-Jin
    • Korean Journal of Computational Design and Engineering / v.15 no.2 / pp.144-156 / 2010
  • When existing parts are reused for the development of a new product or for business-to-business transactions, a method is needed for searching a part database for parts that meet the user's requirements. To this end, it is important to develop a part search method that can measure the similarity between parts and the user's input data with generality as well as robustness. In this paper, the authors suggest a method for measuring part similarity using ontology and a multi-criteria decision-making method, and address its technical details. The proposed method ensures interoperability with existing engineering information management systems, represents part specifications systematically, and has generality in the procedure for comparing part specifications. A case study on ejector pins, conducted to demonstrate the proposed method, is also discussed.

Development of a Performance Evaluation Model on Similarity Measurement Method of Malware (악성코드 유사도 측정 기법의 성능 평가 모델 개발)

  • Chu, Sung-Taek;Kim, HeeSeok;Im, Kwang-Hyuk;Kim, Kyu-Il;Seo, Chang-Ho
    • The Journal of the Korea Contents Association / v.14 no.10 / pp.32-40 / 2014
  • While there is great demand for malware classification to reduce the time required for malware analysis and to find new types of malware, various malware similarity measurement methods have been proposed to classify large amounts of malware. However, the existing methods merely present their own classification results and have not been compared with other methods in terms of performance. This is because no evaluation model exists for comparing the performance of similarity measurement methods. In this paper, we propose a new performance evaluation model for malware similarity measurement methods that uses two indicators: success rate and degree of confidence. In addition, we compare and evaluate the performance of existing similarity measurement methods using these two indicators.

A Similarity Measurement and Visualization Method for the Analysis of Program Code (프로그램 코드 분석을 위한 유사도 측정 및 가시화 기법)

  • Lee, Youngjoo;Lee, Jeongjin
    • Journal of Korea Multimedia Society / v.16 no.7 / pp.802-809 / 2013
  • In this paper, we propose a method for measuring the similarity between two program codes by counting the frequency and length of continuous patterns of specifiers and keywords that occur in both codes. In addition, we propose a method for visualizing the analysis result using formal concept analysis. The proposed method considers the adjacency of specifiers and keywords, which previous similarity measurements have not considered. It can detect plagiarism by analyzing the pattern in each function regardless of the order of function calls and execution. The result of the similarity measurement is visualized as a formal concept analysis lattice to help the user understand the relations between program codes. Experimental results showed that the proposed method succeeded in 96% of plagiarism detections. Our method could also be applied to the analysis of general documents.