• Title/Abstract/Keywords: Similarity measures

304 search results (processing time: 0.02 s)

A New Collaborative Filtering Method for Movie Recommendation Using Genre Interest (영화 추천을 위한 장르 흥미도를 이용한 새로운 협력 필터링 방식)

  • Lee, Soojung
    • Journal of Digital Convergence / Vol. 12, No. 8 / pp. 329-335 / 2014
  • Collaborative filtering has been popular in commercial recommender systems because it successfully reflects the social behavior of customers by suggesting items that might fit a user's interests. So far, the most common way to find suitable items for recommendation has been to search for similar users and consult their ratings. This paper suggests a new similarity measure for movie recommendation that is based on genre interest, rather than on differences between the ratings given by two users as in previous similarity measures. In extensive experiments, the proposed measure is shown to perform significantly better than classic similarity measures in terms of both prediction and recommendation quality.
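The classic rating-based approach that the abstract contrasts itself with can be sketched as follows. This is a minimal illustration of Pearson correlation over co-rated items, a common baseline similarity in user-based collaborative filtering; it is not the genre-interest measure proposed in the paper, and the function name and rating dictionaries are hypothetical.

```python
import math


def pearson_similarity(ratings_u: dict, ratings_v: dict) -> float:
    """Pearson correlation between two users over the items both have rated.

    ratings_u / ratings_v map item ids to numeric ratings (hypothetical layout).
    Returns 0.0 when the correlation is undefined.
    """
    common = sorted(ratings_u.keys() & ratings_v.keys())
    if len(common) < 2:
        return 0.0
    # Means are taken over the co-rated items only.
    mean_u = sum(ratings_u[i] for i in common) / len(common)
    mean_v = sum(ratings_v[i] for i in common) / len(common)
    dev_u = [ratings_u[i] - mean_u for i in common]
    dev_v = [ratings_v[i] - mean_v for i in common]
    numerator = sum(a * b for a, b in zip(dev_u, dev_v))
    denominator = math.sqrt(sum(a * a for a in dev_u)) * math.sqrt(sum(b * b for b in dev_v))
    return numerator / denominator if denominator else 0.0


# Two users whose ratings differ by a constant offset are perfectly correlated.
alice = {"m1": 5, "m2": 3, "m3": 4}
bob = {"m1": 4, "m2": 2, "m3": 3}
similarity = pearson_similarity(alice, bob)
```

Because such a measure depends only on rating differences over co-rated items, it has nothing to work with when two users share few rated movies, which is one motivation for alternative signals such as genre interest.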

Design and Implementation of Computer Engineering Technical Interview Support System (컴퓨터 공학 기술 면접 지원 시스템의 설계 및 구현)

  • Dong-Hyun Lee;Seung-Min Park;Dong-Hyun Kim
    • The Journal of the Korea institute of electronic communication sciences / Vol. 19, No. 3 / pp. 603-608 / 2024
  • Recently, computer engineering technical interviews have become more frequent in developer hiring, and the burden they place on interviewees has grown accordingly. However, when practicing for such interviews, it is difficult for interviewees to judge whether their answers are correct or to gauge an appropriate speaking rate by themselves. In this paper, we propose a computer engineering technical interview support system based on similarity measurement. The proposed system measures the technical accuracy of the interviewee's answers through a sentence similarity evaluation procedure that uses cosine similarity. It also measures the speech rate and reports it to the interviewee.
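The two measurements described above can be sketched as follows. This is a minimal illustration assuming bag-of-words count vectors and whitespace tokenization; the abstract does not say how sentences are vectorized, and the function names here are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity between bag-of-words count vectors of two texts."""
    # Assumed tokenization: lowercase, split on whitespace.
    counts_a = Counter(text_a.lower().split())
    counts_b = Counter(text_b.lower().split())
    # Only words present in both texts contribute to the dot product.
    dot = sum(counts_a[w] * counts_b[w] for w in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def words_per_minute(transcript: str, duration_seconds: float) -> float:
    """Speech rate as words per minute for a transcribed answer."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return len(transcript.split()) * 60.0 / duration_seconds
```

A practice answer could then be scored with `cosine_similarity(answer, reference_answer)`, where a value near 1 indicates heavy vocabulary overlap with the reference and a value of 0 indicates none.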

Some new similarity based approaches in approximate reasoning and their applications to pattern recognition

  • Swapan Raha;Nikhil R. Pal;Ray, Kumar-Sankar
    • Proceedings of the Korean Institute of Intelligent Systems Conference / Korea Fuzzy Logic and Intelligent Systems Society, 1998, The Third Asian Fuzzy Systems Symposium / pp. 719-724 / 1998
  • This paper presents a systematic development of a formal approach to inference in approximate reasoning. We introduce some measures of similarity and discuss their properties. Using the concept of a similarity index, we formulate two methods for inferring from vague knowledge. To illustrate the effectiveness of the proposed technique, we use it to develop a vowel recognition system.

Similarity Measure Construction of the Fuzzy Set for the Reliable Data Selection (신뢰성 있는 정보의 추출을 위한 퍼지집합의 유사측도 구성)

  • Lee Sang-Hyuk
    • The Journal of Korean Institute of Communications and Information Sciences / Vol. 30, No. 9C / pp. 854-859 / 2005
  • We construct a fuzzy entropy for measuring uncertainty by using the relation between distance measures and similarity measures. The proposed fuzzy entropy is constructed from a distance measure; in this study, the Hamming distance is used. To measure the similarity between fuzzy sets or crisp sets, we also construct a similarity measure from the distance measure, and we give proofs for the proposed fuzzy entropies and similarity measures.
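The link between a distance measure and a similarity measure can be sketched as follows. This is a minimal illustration assuming the normalized Hamming distance between membership vectors and the common construction s(A, B) = 1 - d(A, B); the abstract does not give the paper's exact formulas, so both the construction and the function names are assumptions.

```python
def normalized_hamming_distance(a, b):
    """Mean absolute difference between two membership vectors in [0, 1]."""
    if len(a) != len(b) or not a:
        raise ValueError("membership vectors must be non-empty and of equal length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def fuzzy_similarity(a, b):
    """Similarity derived from distance: 1 for identical sets, 0 for complementary crisp sets."""
    return 1.0 - normalized_hamming_distance(a, b)


# A fuzzy set compared with a crisp set over the same two-element universe.
fuzzy_set = [0.5, 0.5]
crisp_set = [0.0, 1.0]
score = fuzzy_similarity(fuzzy_set, crisp_set)  # 0.5
```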

A New Similarity Measure for Categorical Attribute-Based Clustering (범주형 속성 기반 군집화를 위한 새로운 유사 측도)

  • Kim, Min;Jeon, Joo-Hyuk;Woo, Kyung-Gu;Kim, Myoung-Ho
    • Journal of KIISE:Databases / Vol. 37, No. 2 / pp. 71-81 / 2010
  • Cluster analysis is widely used in numerous applications, such as pattern recognition, image analysis, and market analysis. The important factors that determine cluster quality are the similarity measure and the number of attributes. Similarity measures should be defined with respect to the data types. Existing similarity measures are well suited to numerical attribute values. However, those measures do not work well when the data is described by categorical attributes, that is, when there is no inherent similarity measure between values. In high-dimensional spaces, conventional clustering algorithms tend to break down because of the sparsity of data points. To overcome this difficulty, a subspace clustering approach has been proposed. It is based on the observation that different clusters may exist in different subspaces. In this paper, we propose a new similarity measure for clustering high-dimensional categorical data. The measure is based on the idea that, in a good clustering, each cluster should carry information that distinguishes it from the other clusters. We also try to capture attribute dependencies. This study is meaningful because no previous method has used both of them. Experimental results on real datasets show that the clusters obtained with the proposed similarity measure are good with respect to clustering accuracy.

A Study on the Performance of Similarity Indices and its Relationship with Link Prediction: a Two-State Random Network Case

  • Ahn, Min-Woo;Jung, Woo-Sung
    • Journal of the Korean Physical Society / Vol. 73, No. 10 / pp. 1589-1595 / 2018
  • A similarity index measures the topological proximity of node pairs in a complex network. Numerous similarity indices have been defined and investigated, but the dependence of their performance on network structure has not been sufficiently investigated. In this study, we investigated the relationship between the performance of similarity indices and the structural properties of a network by employing a two-state random network. A node in a two-state network has a binary type that is given initially, and the connection probability is determined by the states of the node pair. The performance of similarity indices is affected by the number of links and the ratio of intra-connections to inter-connections. Similarity indices have different characteristics depending on their type. Local indices perform well in small networks and do not depend on whether the structure is intra-dominant or inter-dominant. In contrast, global indices perform better in large networks, and some of them do not perform well in an inter-dominant structure. We also found that link prediction performance and the performance of similarity indices are correlated in both model networks and empirical networks. This relationship implies that link prediction performance can be used as an approximation for the performance of a similarity index when information about node type is unavailable. This relationship may help to find the appropriate index for a given network.
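Local similarity indices of the kind discussed above can be sketched as follows. This is a minimal illustration using two well-known local indices, common neighbors and the Jaccard index, to rank unconnected node pairs for link prediction; the abstract does not list which indices the study evaluates, and the toy graph and function names are hypothetical.

```python
from itertools import combinations


def common_neighbors(adj, u, v):
    """Number of neighbors shared by u and v."""
    return len(adj[u] & adj[v])


def jaccard_index(adj, u, v):
    """Shared neighbors divided by the size of the combined neighborhood."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0


def rank_missing_links(adj, index):
    """Sort currently unconnected node pairs by a similarity index, highest first."""
    candidates = [(u, v) for u, v in combinations(sorted(adj), 2) if v not in adj[u]]
    return sorted(candidates, key=lambda pair: index(adj, *pair), reverse=True)


# Toy undirected graph stored as node -> set of neighbors.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c", "e"},
    "e": {"d"},
}
best_pair = rank_missing_links(graph, common_neighbors)[0]
```

Both indices look only at immediate neighborhoods, which is what makes them local; global indices would instead use paths or random walks across the whole network.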

The application for predictive similarity measures of binary data in association rule mining (이분형 예측 유사성 측도의 연관성 평가 기준 적용 방안)

  • Park, Hee-Chang
    • Journal of the Korean Data and Information Science Society / Vol. 22, No. 3 / pp. 495-503 / 2011
  • The most widely used data mining technique is association rule mining, a method that quantifies the relationship between sets of items in a very large database based on association thresholds. There are several basic association thresholds for exploring meaningful association rules: support, confidence, lift, etc. Among them, confidence is the most frequently used, but it has the drawback that it cannot determine the direction of the association. The net confidence and the attributably pure confidence were developed to compensate for this drawback, but they have other drawbacks. In this paper, we consider some predictive similarity measures for binary data, used in cluster analysis and multi-dimensional analysis, as association thresholds to compensate for these drawbacks. Comparative studies of net confidence, attributably pure confidence, and some predictive similarity measures are presented through a numerical example.
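The three basic thresholds named above can be sketched as follows. This is a minimal illustration using their standard definitions over a list of transactions; the toy basket data and function names are hypothetical, and the net confidence, attributably pure confidence, and predictive similarity measures studied in the paper are not reproduced here.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= set(t)) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent); 0.0 if the antecedent never occurs."""
    support_a = support(transactions, antecedent)
    if support_a == 0:
        return 0.0
    return support(transactions, set(antecedent) | set(consequent)) / support_a


def lift(transactions, antecedent, consequent):
    """Confidence divided by the consequent's support; below 1 means negative association."""
    support_c = support(transactions, consequent)
    if support_c == 0:
        return 0.0
    return confidence(transactions, antecedent, consequent) / support_c


# Toy basket data for the rule {bread} -> {milk}.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
rule_confidence = confidence(baskets, {"bread"}, {"milk"})
rule_lift = lift(baskets, {"bread"}, {"milk"})
```

In this toy data the rule has a confidence of about 0.67 yet a lift below 1, so bread buyers are slightly less likely than average to buy milk. Confidence alone does not reveal that, which illustrates the drawback the abstract mentions.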

Semantic Similarity Measures Between Words within a Document using WordNet (워드넷을 이용한 문서내에서 단어 사이의 의미적 유사도 측정)

  • Kang, SeokHoon;Park, JongMin
    • Journal of the Korea Academia-Industrial cooperation Society / Vol. 16, No. 11 / pp. 7718-7728 / 2015
  • Semantic similarity between words can be applied in many fields, including computational linguistics, artificial intelligence, and information retrieval. In this paper, we present a weighted method for measuring the semantic similarity between words in a document. This method uses the edge distance and depth in WordNet. The method calculates the semantic similarity between words on the basis of document information. The document information consists of word term frequencies (TF) and word concept frequencies (CF). Each word's weight is calculated from its TF and CF in the document. The method combines the edge distance between words, the depth of the subsumer, and the word weight in the document. We compared our scheme with other methods experimentally. As a result, the proposed method outperforms other similarity measures. Other methods, which are based simply on shortest distance or depth, have difficulty representing or merging this information. This paper considers the shortest distance, the depth, and the information of words in the document, and thereby improves performance.
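The two structural ingredients mentioned above, edge distance and subsumer depth, can be sketched as follows. This is a minimal illustration on a hypothetical toy taxonomy using the well-known Wu-Palmer measure, which combines the depth of the lowest common subsumer with the depths of the two words; it does not use WordNet itself and does not reproduce the paper's TF/CF weighting.

```python
# Hypothetical toy taxonomy: child -> parent (the root maps to None).
PARENT = {
    "entity": None,
    "animal": "entity",
    "plant": "entity",
    "dog": "animal",
    "cat": "animal",
    "tree": "plant",
}


def path_to_root(node):
    """Nodes from `node` up to the root, inclusive."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def depth(node):
    """Depth counted in nodes, so the root has depth 1."""
    return len(path_to_root(node))


def lowest_common_subsumer(a, b):
    """Deepest node that is an ancestor of (or equal to) both a and b."""
    ancestors_a = set(path_to_root(a))
    for node in path_to_root(b):
        if node in ancestors_a:
            return node
    raise ValueError("nodes do not share a root")


def edge_distance(a, b):
    """Number of edges on the path between a and b through their subsumer."""
    return depth(a) + depth(b) - 2 * depth(lowest_common_subsumer(a, b))


def wu_palmer_similarity(a, b):
    """2 * depth(subsumer) / (depth(a) + depth(b)), which lies in (0, 1]."""
    return 2 * depth(lowest_common_subsumer(a, b)) / (depth(a) + depth(b))
```

In this taxonomy, `dog` and `cat` are two edges apart under the shared subsumer `animal`, while `dog` and `tree` meet only at the root, so the first pair scores higher.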

Part Similarity Assessment Method Based on Hierarchical Feature Decomposition: Part 2 - Using Negative Feature Decomposition (계층적 특징형상 정보에 기반한 부품 유사성 평가 방법: Part 2 - 절삭가공 특징형상 분할방식 이용)

  • 김용세;강병구;정용희
    • Korean Journal of Computational Design and Engineering / Vol. 9, No. 1 / pp. 51-61 / 2004
  • Mechanical parts are often grouped into part families based on the similarity of their shapes, to support efficient manufacturing process planning and design modification. This two-part sequence of papers presents similarity assessment techniques to support part family classification for machined parts. These exploit the multiple feature decompositions obtained by the feature recognition method using convex decomposition. Convex decomposition provides a hierarchical volumetric representation of a part, organized in an outside-in hierarchy. It provides local accessibility directions, which support abstract and qualitative similarity assessment. It is converted to a Form Feature Decomposition (FFD), which represents a part using form features intrinsic to the shape of the part. This supports abstract and qualitative similarity assessment using positive feature volumes. FFD is converted to Negative Feature Decomposition (NFD), which represents a part as a base component and negative machining features. This supports a detailed, quantitative similarity assessment technique that measures the similarity between machined parts and the associated machining processes implied by two parts' NFDs. Features of the NFD are organized into branch groups to capture the NFD hierarchy and feature interrelations. Branch groups of two parts' NFDs are matched to obtain pairs, and then features within each pair of branch groups are compared, exploiting feature type, size, machining direction, and other information relevant to machining processes. This paper, the second of the two companion papers, describes the similarity assessment method using NFD.

Part Similarity Assessment Method Based on Hierarchical Feature Decomposition: Part 1 - Using Convex Decomposition and Form Feature Decomposition (계층적 특징형상 정보에 기반한 부품 유사성 평가 방법: Part 1 - 볼록입체 분할방식 및 특징형상 분할방식 이용)

  • 김용세;강병구;정용희
    • Korean Journal of Computational Design and Engineering / Vol. 9, No. 1 / pp. 44-50 / 2004
  • Mechanical parts are often grouped into part families based on the similarity of their shapes, to support efficient manufacturing process planning and design modification. This two-part sequence of papers presents similarity assessment techniques to support part family classification for machined parts. These exploit the multiple feature decompositions obtained by the feature recognition method using convex decomposition. Convex decomposition provides a hierarchical volumetric representation of a part, organized in an outside-in hierarchy. It provides local accessibility directions, which support abstract and qualitative similarity assessment. It is converted to a Form Feature Decomposition (FFD), which represents a part using form features intrinsic to the shape of the part. This supports abstract and qualitative similarity assessment using positive feature volumes. FFD is converted to Negative Feature Decomposition (NFD), which represents a part as a base component and negative machining features. This supports a detailed, quantitative similarity assessment technique that measures the similarity between machined parts and the associated machining processes implied by two parts' NFDs. Features of the NFD are organized into branch groups to capture the NFD hierarchy and feature interrelations. Branch groups of two parts' NFDs are matched to obtain pairs, and then features within each pair of branch groups are compared, exploiting feature type, size, machining direction, and other information relevant to machining processes. This paper, the first of the two companion papers, describes the similarity assessment methods using convex decomposition and FFD.