• Title/Summary/Keyword: similarity-based

A New Similarity Measure for Categorical Attribute-Based Clustering (범주형 속성 기반 군집화를 위한 새로운 유사 측도)

  • Kim, Min;Jeon, Joo-Hyuk;Woo, Kyung-Gu;Kim, Myoung-Ho
    • Journal of KIISE:Databases / v.37 no.2 / pp.71-81 / 2010
  • Cluster analysis is widely used in numerous applications, such as pattern recognition, image analysis, and market analysis. The important factors that determine cluster quality are the similarity measure and the number of attributes. Similarity measures should be defined with respect to the data types. Existing similarity measures work well for numerical attribute values. However, those measures do not work well when the data is described by categorical attributes, that is, when there is no inherent similarity measure between values. In high-dimensional spaces, conventional clustering algorithms tend to break down because of the sparsity of data points. To overcome this difficulty, a subspace clustering approach has been proposed. It is based on the observation that different clusters may exist in different subspaces. In this paper, we propose a new similarity measure for clustering high-dimensional categorical data. The measure is based on the idea that, in a good clustering, each cluster should carry information that distinguishes it from the other clusters. We also try to capture the attribute dependencies. This study is meaningful because no existing method uses both of them. Experimental results on real datasets show that the clusters obtained with our proposed similarity measure are good enough with respect to clustering accuracy.

Similarity Measurement of 3D Shapes Using Ray Distances (Ray distance를 이용한 3차원 형상의 유사성 판단)

  • 황태진;정지훈;오헌영;이건우
    • Journal of the Korean Society for Precision Engineering / v.21 no.1 / pp.159-166 / 2004
  • Custom-tailored products are products that come in various sizes and shapes to meet customers' different tastes or needs. Thus, fabrication of custom-tailored products inherently involves inefficiency. To minimize this inefficiency, a new paradigm is proposed in this work. In this paradigm, different parts are grouped together according to their sizes and shapes. Then, a representative shape of each group is derived and used as the work-piece from which the parts in the group are machined. Once a new product is ordered, the optimal work-piece is selected by comparing the similarity of the new product with each representative shape. Then an effective NC tool-path is generated to machine only the portions that differ between the work-piece and the ordered product. The efficient machining conditions are also derived from this shape difference. Machining only the differing portions saves time. Similarity comparison starts with the determination of the closest pose between the two shapes under consideration. The closest pose is derived by comparing the ray distances while one shape is virtually rotated with respect to the other. The shape similarity value and the overall similarity value calculated from the ray distances are used for grouping. A prototype system based on the proposed methodology has been implemented and applied to the grouping and machining of shoe lasts of various shapes and sizes.

An Efficient Image Matching Scheme Based on Min-Max Similarity for Distorted Images (왜곡 영상을 위한 효과적인 최소-최대 유사도(Min-Max Similarity) 기반의 영상 정합 알고리즘)

  • Heo, Young-Jin;Jeong, Da-Mi;Kim, Byung-Gyu
    • Journal of Korea Multimedia Society / v.22 no.12 / pp.1404-1414 / 2019
  • Educational books commonly use copyrighted images with various kinds of deformation to help students' understanding. When several copyrighted images are used in merged or edited form, a fee must legally be paid to the original copyright holder of each image. In this paper, we propose an efficient matching algorithm that separates each copyrighted image from merged and edited images, including those with rotation, illumination change, and change of size. We use the Oriented FAST and Rotated BRIEF (ORB) method as the basic feature matching scheme. To improve the matching accuracy, we design a new MIN-MAX similarity for the matching stage. On the distorted dataset, the proposed method achieves up to 97% precision in experiments. We also demonstrate that the proposed similarity measure outperforms another commonly used measure.

Efficient Similarity Analysis Methods for Same Open Source Functions in Different Versions (서로 다른 버전의 동일 오픈소스 함수 간 효율적인 유사도 분석 기법)

  • Kim, Yeongcheol;Cho, Eun-Sun
    • Journal of KIISE / v.44 no.10 / pp.1019-1025 / 2017
  • Binary similarity analysis is used in vulnerability analysis, malicious code analysis, and plagiarism detection. Showing through similarity analysis that a function is equivalent to a well-known safe function from a different version can help to improve the efficiency of binary code analysis of malicious behavior as well as the efficiency of vulnerability analysis. However, few studies have been carried out on similarity analysis of the same function across different versions. In this paper, we analyze similarity at the function level through various methods based on the function information that can be extracted from binary code, and find a way to perform the analysis efficiently in less time. In particular, we perform a comparative analysis of different versions of the OpenSSL library to determine how similar functions can be detected even when the versions differ.

Similarity Measurement of 3D Shapes Using Ray Distances (Ray distance를 이용한 3차원 형상의 유사성 판단)

  • 정지훈;황태진;오헌영;이건우
    • Proceedings of the Korean Society of Precision Engineering Conference / 2003.06a / pp.70-73 / 2003
  • Custom-tailored products are products that come in various sizes and shapes to meet customers' different tastes or needs. Thus, fabrication of custom-tailored products inherently involves inefficiency. To minimize this inefficiency, a new paradigm is proposed in this work. In this paradigm, different parts are grouped together according to their sizes and shapes. Then, a representative shape of each group is derived and used as the work-piece from which the parts in the group are machined. Once a new product is ordered, the optimal work-piece is selected by comparing the similarity of the new product with each representative shape. Then an effective NC tool-path is generated to machine only the portions that differ between the work-piece and the ordered product. The efficient machining conditions are also derived from this shape difference. Machining only the differing portions saves time. Similarity comparison starts with the determination of the closest pose between the two shapes under consideration. The closest pose is derived by comparing the ray distances while one shape is virtually rotated with respect to the other. The shape similarity value and the overall similarity value calculated from the ray distances are used for grouping. A prototype system based on the proposed methodology has been implemented and applied to the grouping and machining of shoe lasts of various shapes and sizes.

A Weighted Preliminary Cut-off Indoor Positioning Scheme Based on Similarity between Peaks of RSSI (최대 RSSI 간의 유사도를 기반으로 한 가중치 부여 사전 컷-오프 실내 위치 추정 방식)

  • Kim, Dongjun;Son, Jooyoung
    • Journal of Korea Multimedia Society / v.21 no.7 / pp.772-778 / 2018
  • We have previously proposed a preliminary cut-off indoor positioning scheme that considers the reference points with the same signal similarity. This scheme estimates the position using the relative rank of the peak received signal strength from the beacons around the user. However, this scheme suffers from lower accuracy when more than one nearest reference point has the same signal similarity. To tackle this, we propose a weighted preliminary cut-off indoor positioning scheme. First, if the above problem occurs, the similarity to the peak signal strength is considered in addition to the relative rank. Next, weights are assigned to the nearest reference points using the similarity to the peak of the received signal strength. Finally, the user's position is estimated by applying the weights. As a result, the weighted preliminary cut-off scheme improves the positioning accuracy by about 7.9% compared to the previous scheme.
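
The final weighting step described above can be illustrated with a small sketch. It assumes the estimate is a similarity-weighted centroid of the tied reference points; the abstract does not state the exact weighting formula, so the function name `weighted_position`, the 2D coordinates, and the similarity values are all hypothetical.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def weighted_position(points: Sequence[Point],
                      similarities: Sequence[float]) -> Point:
    """Return the similarity-weighted centroid of the reference points.

    Assumption: each weight is proportional to a non-negative similarity
    score; this is an illustration, not the paper's exact formula.
    """
    if not points or len(points) != len(similarities):
        raise ValueError("points and similarities must be non-empty and equal in length")
    total = sum(similarities)
    if total <= 0:
        raise ValueError("similarities must sum to a positive value")
    # Normalise by the total so the weights sum to one.
    x = sum(s * p[0] for p, s in zip(points, similarities)) / total
    y = sum(s * p[1] for p, s in zip(points, similarities)) / total
    return (x, y)


# Two tied reference points; the second is three times as similar.
print(weighted_position([(0.0, 0.0), (4.0, 0.0)], [1.0, 3.0]))
```

With equal similarities the estimate reduces to the plain midpoint of the tied reference points, which is the unweighted behavior the scheme aims to improve on.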

Study on Design of Air-water Two-phase Flow Centrifugal Pump Based on Similarity Law

  • Matsushita, Naoki;Furukawa, Akinori;Watanabe, Satoshi;Okuma, Kusuo
    • International Journal of Fluid Machinery and Systems / v.2 no.2 / pp.127-135 / 2009
  • A conventional centrifugal pump suffers a drastic deterioration of air-water two-phase flow performance even at an inlet void fraction of less than 10% in the range of relatively low water flow rates. We have therefore developed a two-phase flow centrifugal pump that consists of a tandem arrangement of double rotating cascades, in which the blades of the outer cascade have an outlet angle higher than $90^{\circ}$. In designing the two-phase flow pump for various sizes and operating conditions, similarity relations between geometric dimensions and hydraulic performance are very useful. The similarity relations for rotational speed, impeller diameter and blade height are investigated for the developed impeller in the present paper. As a result, the similarity law for rotational speed and impeller diameter is confirmed experimentally even under two-phase flow conditions. In addition, the influence of blade height on air-water two-phase flow performance deviates slightly from the similarity relations.
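
For orientation, the classical single-phase pump similarity (affinity) laws scale flow rate with n·D³ and head with n²·D², where n is the rotational speed and D the impeller diameter. The sketch below applies only those textbook relations; it is not the paper's two-phase formulation, and the function name `scale_pump` and the sample numbers are hypothetical.

```python
from typing import Tuple


def scale_pump(flow: float, head: float,
               speed_ratio: float, diameter_ratio: float = 1.0) -> Tuple[float, float]:
    """Scale a pump operating point with the classical affinity laws.

    flow scales with (n2/n1) * (D2/D1)**3
    head scales with (n2/n1)**2 * (D2/D1)**2
    Assumption: geometrically similar pumps at equal efficiency.
    """
    new_flow = flow * speed_ratio * diameter_ratio ** 3
    new_head = head * speed_ratio ** 2 * diameter_ratio ** 2
    return new_flow, new_head


# Doubling the speed at a fixed diameter doubles flow and quadruples head.
print(scale_pump(10.0, 20.0, speed_ratio=2.0))
```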

Numerical Objective Assessment Using Structural Similarity for Diffuse Optical Reconstructed Images (재구성된 광간섭단층 영상의 구조적 유사성을 이용한 수치 목표 평가)

  • Mudeng, Vicky;Choe, Se-woon
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2021.10a / pp.658-660 / 2021
  • This study develops an algorithm based on the structural similarity index that numerically compares reconstructed images with a reference image in order to separate homogeneity from heterogeneity in diffuse optical tomography. Global geometry and region-of-interest assessments were performed to yield the similarity. The results indicate that the mean structural similarity index shows potential for distinguishing between visible and invisible inclusions inside the model. Therefore, the structural similarity index may assist image assessment when evaluating breast structural information.
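
As background, the structural similarity index compares the means, variances, and covariance of two images. The sketch below computes a single global index over flattened pixel lists using the commonly cited constants K1 = 0.01 and K2 = 0.03. The usual form averages the index over local windows, which is omitted here, and population (divide-by-N) statistics are an assumption of this sketch. The function name `global_ssim` and the sample pixels are hypothetical and not taken from the paper.

```python
from typing import Sequence


def global_ssim(x: Sequence[float], y: Sequence[float],
                dynamic_range: float = 255.0) -> float:
    """Single-window structural similarity index of two equal-length pixel lists."""
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and equal in length")
    n = len(x)
    c1 = (0.01 * dynamic_range) ** 2  # stabilises the luminance term
    c2 = (0.03 * dynamic_range) ** 2  # stabilises the contrast/structure term
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov_xy + c2)
    denominator = (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


reference = [10.0, 50.0, 90.0, 130.0]
inverted = [130.0, 90.0, 50.0, 10.0]
print(global_ssim(reference, reference))  # identical images score (approximately) 1
print(global_ssim(reference, inverted))   # an inverted pattern scores lower
```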

An Index-Based Search Method for Performance Improvement of Set-Based Similar Sequence Matching (집합 유사 시퀀스 매칭의 성능 향상을 위한 인덱스 기반 검색 방법)

  • Lee, Juwon;Lim, Hyo-Sang
    • KIPS Transactions on Software and Data Engineering / v.6 no.11 / pp.507-520 / 2017
  • The set-based similar sequence matching method measures similarity not for an individual data item but for a set that groups multiple data items. In this method, the similarity of two sets is represented as the size of the intersection between them. However, the method has two critical performance issues: 1) calculating the intersection size is a time-consuming process, and 2) the number of set pairs for which the intersection size must be calculated is quite large. In this paper, we propose an index-based search method that improves the performance of set-based similar sequence matching in order to solve these issues. Our method consists of two parts. First, we convert the set similarity problem into an intersection size comparison problem and then provide an index structure that accelerates the intersection size calculation. Second, we propose an efficient set-based similar sequence matching method that exploits the proposed index structure. Through experiments, we show that the proposed method reduces the execution time by a factor of 30 to 50 compared with the existing methods. We also show that the proposed method scales well, since the performance gap becomes larger as the number of data sequences increases.
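
One generic way to speed up intersection-size computation is an inverted index that maps each item to the sets containing it, so that a query touches only the sets sharing at least one item. The sketch below shows that general technique only; the index structure proposed in the paper is not described in the abstract, and the names `build_index` and `intersection_sizes` are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Hashable, Iterable, List, Sequence


def build_index(sets: Sequence[Iterable[Hashable]]) -> Dict[Hashable, List[int]]:
    """Map each item to the ids (positions) of the sets that contain it."""
    index: Dict[Hashable, List[int]] = defaultdict(list)
    for set_id, items in enumerate(sets):
        for item in set(items):  # de-duplicate so each set is listed once per item
            index[item].append(set_id)
    return index


def intersection_sizes(index: Dict[Hashable, List[int]],
                       query: Iterable[Hashable]) -> Dict[int, int]:
    """Return the intersection size of the query with every overlapping set."""
    counts: Counter = Counter()
    for item in set(query):
        for set_id in index.get(item, ()):
            counts[set_id] += 1  # one shared item found for this set
    return dict(counts)


stored = [{1, 2, 3}, {3, 4}, {5}]
index = build_index(stored)
print(intersection_sizes(index, {2, 3, 4}))  # sets 0 and 1 each share two items
```

Sets with no shared item never appear in the result, which is where the saving over pairwise comparison comes from.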

User and Item based Collaborative Filtering Using Classification Property Naive Bayesian (분류 속성과 Naive Bayesian을 이용한 사용자와 아이템 기반의 협력적 필터링)

  • Kim, Jong-Hun;Kim, Yong-Jip;Rim, Kee-Wook;Lee, Jung-Hyun;Chung, Kyung-Yong
    • The Journal of the Korea Contents Association / v.7 no.11 / pp.23-33 / 2007
  • Collaborative filtering has used the nearest-neighbor method based on preference and on similarity computed with the Pearson correlation coefficient. Therefore, it does not reflect the content of the items and also suffers from sparsity and scalability problems. Item-based collaborative filtering has been used in practice to remedy these defects, but it still does not reflect the attributes of the items. In this paper, we propose user- and item-based collaborative filtering using the classification property and Naive Bayesian to supplement the defects of existing recommendation systems. The proposed method jointly refers to the item similarity based on explicit data and the user similarity based on implicit data to handle the sparsity problem. It then applies Naive Bayesian to the result of this reference. It can also enhance accuracy, because the computation of the item similarity reflects the correlative rank among the classification properties in order to reflect item attributes.
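
The Pearson-based user similarity mentioned as the baseline can be sketched as follows. This is a minimal illustration that computes the correlation over co-rated items only, with means taken over those items (one common variant); the function name `pearson_similarity` and the rating dictionaries are hypothetical, and the paper's Naive Bayesian step is not shown.

```python
from math import sqrt
from typing import Dict


def pearson_similarity(ratings_a: Dict[str, float],
                       ratings_b: Dict[str, float]) -> float:
    """Pearson correlation of two users over the items both have rated."""
    common = sorted(set(ratings_a) & set(ratings_b))
    if len(common) < 2:
        return 0.0  # too little overlap to estimate a correlation
    a = [ratings_a[item] for item in common]
    b = [ratings_b[item] for item in common]
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant rater has no defined correlation
    return cov / sqrt(var_a * var_b)


alice = {"i1": 1.0, "i2": 2.0, "i3": 3.0}
bob = {"i1": 2.0, "i2": 4.0, "i3": 6.0, "i4": 5.0}
print(pearson_similarity(alice, bob))  # perfectly correlated on the shared items
```

The early returns for sparse overlap illustrate the sparsity problem the abstract refers to: with few co-rated items, the similarity is undefined or unreliable.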