• Title/Abstract/Keywords: Similarity Measurement


Accuracy Improvement Methods for String Similarity Measurement in POI (Point Of Interest) Data Retrieval

  • 고은별;이종우
    • 정보과학회 컴퓨팅의 실제 논문지
    • /
    • Vol. 20, No. 9
    • /
    • pp.498-506
    • /
    • 2014
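  • With advances in transportation, modern people range widely in their daily activities and frequently search for routes through navigation systems and map apps. Existing search systems, however, fail to return the desired results when an inaccurate query is entered. To address this problem, a set-based POI search algorithm was introduced, followed by research on string similarity measurement techniques and on search algorithms that account for duplicated characters. This paper proposes a technique that improves the accuracy of the previously studied string similarity measurement algorithm. It adds an estimation step for unique words, which earlier string similarity measurement techniques did not consider, adds the computation of blocks and block ordering that accounts for duplicated words, and expresses the measurement technique as formulas. Expressing the measurement method systematically and generalizing it improves the accuracy of POI search results. Experiments confirm that the proposed technique improves the accuracy of both the search results and the search ranking.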
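
How a similarity score lets a misspelled query still retrieve the intended POI can be illustrated with a minimal sketch. It is a stand-in rather than the paper's method: the abstract does not give the formula, so a Dice-style overlap of character multisets (which keeps repeated characters) is assumed here, and `multiset_similarity`, `rank_pois`, and the sample POI names are hypothetical.

```python
from collections import Counter


def multiset_similarity(query, name):
    """Dice-style overlap of character multisets (assumed measure, not the paper's)."""
    q = Counter(query.replace(" ", ""))
    n = Counter(name.replace(" ", ""))
    total = sum(q.values()) + sum(n.values())
    if total == 0:
        return 0.0
    # Counter intersection keeps the minimum count per character,
    # so repeated characters are matched at most as often as they occur.
    overlap = sum((q & n).values())
    return 2.0 * overlap / total


def rank_pois(query, names):
    """Return POI names ordered from most to least similar to the query."""
    return sorted(names, key=lambda name: multiset_similarity(query, name), reverse=True)


# Hypothetical POI names and a query with a missing letter.
poi_names = ["city hall", "seoul tower", "seoul station"]
ranking = rank_pois("seol station", poi_names)
print(ranking)
```

An exact-match lookup would return nothing for this query, whereas the ranked list still places the intended name first.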

Similarity Measurement with Interestingness Weight for Improving the Accuracy of Web Transaction Clustering

  • 강태호;민영수;유재수
    • 정보처리학회논문지D
    • /
    • Vol. 11D, No. 3
    • /
    • pp.717-730
    • /
    • 2004
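  • Research on Web personalization has recently been very active. Web personalization can be described as using data mining techniques such as clustering to predict the set of URLs most likely to interest each user. Existing clustering-based approaches represent each web transaction as a bit vector that indicates whether or not each URL of the web site was visited, and determine similarity by the degree to which the visit patterns of these bit vectors match. However, when clustering web transactions with similar tendencies, this approach excludes user interest and reflects only whether a page was visited. As a result, web transactions whose visit purposes or tendencies are not similar may be placed in the same group. This paper therefore extends the existing bit-vector transaction model so that it reflects user interestingness, presents the resulting new web transaction model, and proposes a similarity comparison method that applies interestingness weights. A performance evaluation shows that the proposed method improves clustering accuracy over the existing method.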
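
The difference between visit-only bit vectors and interest-weighted vectors can be sketched as follows. This is an illustration under stated assumptions, not the authors' formulation: the abstract gives neither the weighting scheme nor the similarity formula, so cosine similarity is assumed, and the weights (imagined here as the share of session time spent on each URL) and all names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical transactions over four URLs, with assumed interest weights.
t1_weights = [0.9, 0.1, 0.0, 0.0]  # mostly interested in the first URL
t2_weights = [0.1, 0.9, 0.0, 0.0]  # mostly interested in the second URL

# Bit vectors keep only whether each URL was visited.
t1_bits = [1 if w > 0 else 0 for w in t1_weights]
t2_bits = [1 if w > 0 else 0 for w in t2_weights]

bit_sim = cosine_similarity(t1_bits, t2_bits)
weighted_sim = cosine_similarity(t1_weights, t2_weights)
print(bit_sim, weighted_sim)
```

Both transactions visit the same two URLs, so the bit vectors look identical, while the weighted vectors expose that the two users' interests differ.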

Regional Grouping of the Interconnected Network System through Sequential Clustering

  • 김현홍;송형용;김진호;박종배;신중린
    • 대한전기학회:학술대회논문집
    • /
    • 대한전기학회 2007년도 추계학술대회 논문집 전력기술부문
    • /
    • pp.252-254
    • /
    • 2007
  • This paper introduces sequential clustering as a tool for the effective clustering of mass unit electrical systems. The interconnected network system retains information about the location of each line. With this information, this paper aims to carry out initial clustering through the transmission usage rate, compare the results of similarity measures for regional information with similarity measures for regional price, and introduce the details of the clustering method. The transmission usage rate uses power flow based on congestion costs, and the similarity measurements are modified using the FCM algorithm. This paper also aims to demonstrate the validity of the proposed clustering method by comparing it with existing clustering methods that use the similarity measurement system. The proposed algorithm is demonstrated on the IEEE 39-bus RTS.


Combining Different Distance Measurements Methods with Dempster-Shafer-Theory for Recognition of Urdu Character Script

  • Khan, Yunus;Nagar, Chetan;Kaushal, Devendra S.
    • International Journal of Ocean System Engineering
    • /
    • Vol. 2, No. 1
    • /
    • pp.16-23
    • /
    • 2012
  • This paper discusses a new methodology for an Urdu character recognition system using Dempster-Shafer theory, which can estimate the similarity ratings between a recognized character and the sample characters in the character database. Characters are recognized with five calculation methods (similarity, Hamming, linear correlation, cross-correlation, and nearest neighbor) together with the Dempster-Shafer theory of belief functions. The main objective is to recognize Urdu letters and numerals by using these five similarity and dissimilarity algorithms to measure the similarity between a given image and the standard template in the character recognition system. We develop a method that combines the results of the different distance measurement methods using Dempster-Shafer theory, which yields a single, more precise result. The combination of these results was observed to improve the success rate.
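
Combining the outputs of several measures with Dempster-Shafer theory rests on Dempster's rule of combination, sketched below for two sources. The rule itself is standard; how the paper turns each distance measure into a mass function is not stated in the abstract, so the masses, the candidate labels "A" and "B", and the function name `combine_dempster` are hypothetical.

```python
from itertools import product


def combine_dempster(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of hypotheses to a mass,
    and the masses of each function sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Mass assigned to contradictory hypotheses is discarded.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # Renormalize by the non-conflicting mass.
    return {h: m / (1.0 - conflict) for h, m in combined.items()}


A = frozenset({"A"})
B = frozenset({"B"})
EITHER = frozenset({"A", "B"})  # mass left uncommitted between the candidates

# Hypothetical masses derived from two different distance measures.
m_first = {A: 0.6, B: 0.1, EITHER: 0.3}
m_second = {A: 0.5, B: 0.2, EITHER: 0.3}

fused = combine_dempster(m_first, m_second)
print(fused)
```

In this example the fused mass on candidate "A" is higher than either source assigned on its own, which is the sense in which agreement between measures strengthens a single combined decision.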

Software Similarity Measurement based on Dependency Graph using Harmony Search

  • Yun, Ho Yeong;Joe, Yong Joon;Jung, Byung Ok;Shin, Dong myung;Bahng, Hyo Keun
    • 한국컴퓨터정보학회논문지
    • /
    • Vol. 21, No. 12
    • /
    • pp.1-10
    • /
    • 2016
  • In this paper, we attempt to prevent license and intellectual property violations by tracing the history of open source software and its modifications and building a genogram from source code similarity. Open source software is used actively and widely in many areas and contributes to their development. However, there are many cases of unintentional misuse, such as ignoring licenses or infringing intellectual property, which can lead to litigation. To prevent such situations, we analyze source code similarity using program dependence graphs, a task that resembles the subgraph isomorphism problem, a typical NP-complete problem. To solve the subgraph isomorphism problem, we apply harmony search, a metaheuristic algorithm, and compare its results with those of a genetic algorithm. In future work, we will represent open source software as program dependence graphs and analyze their similarity.

Regional Grouping of Transmission System Using the Sequential Clustering Technique

  • 김현홍;이우남;박종배;신중린;김진호
    • 전기학회논문지
    • /
    • Vol. 58, No. 5
    • /
    • pp.911-917
    • /
    • 2009
  • This paper introduces a sequential clustering technique as a tool for the effective grouping of transmission systems. The interconnected network system retains information about the location of each line. With this information, this paper aims to carry out initial clustering through the transmission usage rate, compare the similarity measures of regional information with the similarity measures of location price, and introduce the techniques of the clustering method. The transmission usage rate uses power flow based on congestion costs, and the similarity measurements use the FCM (Fuzzy C-Means) algorithm. This paper also aims to demonstrate the validity of the proposed clustering method by comparing it with existing clustering methods that use the similarity measurement system. The proposed algorithm is demonstrated on the IEEE 39-bus RTS and the Korea power system.

Measurement of graphs similarity using graph centralities

  • Cho, Tae-Soo;Han, Chi-Geun;Lee, Sang-Hoon
    • 한국컴퓨터정보학회논문지
    • /
    • Vol. 23, No. 12
    • /
    • pp.57-64
    • /
    • 2018
  • This paper proposes a method for measuring the similarity between two graphs based on their centralities. The similarity between two graphs $G_1$ and $G_2$ is defined by the difference between distance($G_1$, $G_{R_1}$) and distance($G_2$, $G_{R_2}$), where $G_{R_1}$ and $G_{R_2}$ are sets of random graphs that have the same number of nodes and edges as $G_1$ and $G_2$, respectively. Each distance($G_*$, $G_{R_*}$) is obtained by comparing the centralities of $G_*$ and $G_{R_*}$. Computational experiments show that graphs can be compared regardless of their numbers of vertices or edges. The properties of graphs can also be identified and classified by measuring and comparing the similarities between two graphs.

Similarity Measurement of 3D Shapes Using Ray Distances

  • 황태진;정지훈;오헌영;이건우
    • 한국정밀공학회지
    • /
    • Vol. 21, No. 1
    • /
    • pp.159-166
    • /
    • 2004
  • Custom-tailored products are products made in various sizes and shapes to meet customers' different tastes or needs, so their fabrication inherently involves inefficiency. To minimize this inefficiency, a new paradigm is proposed in this work. In this paradigm, different parts are grouped together according to their sizes and shapes. A representative shape is then derived for each group and used as the work-piece from which the parts in the group are machined. Once a new product is ordered, the optimal work-piece is selected by comparing the similarity of the new product with each representative shape. An effective NC tool-path is then generated to machine only the portions that differ between the work-piece and the ordered product, which saves machining time. Efficient machining conditions are also derived from this shape difference. Similarity comparison starts by determining the closest pose between the two shapes under consideration. The closest pose is derived by comparing the ray distances while one shape is virtually rotated with respect to the other. The shape similarity value and the overall similarity value calculated from the ray distances are used for grouping. A prototype system based on the proposed methodology has been implemented and applied to the grouping and machining of shoe lasts of various shapes and sizes.

Similarity Measurement of 3D Shapes Using Ray Distances

  • 정지훈;황태진;오헌영;이건우
    • 한국정밀공학회:학술대회논문집
    • /
    • 한국정밀공학회 2003년도 춘계학술대회 논문집
    • /
    • pp.70-73
    • /
    • 2003
  • Custom-tailored products are products made in various sizes and shapes to meet customers' different tastes or needs, so their fabrication inherently involves inefficiency. To minimize this inefficiency, a new paradigm is proposed in this work. In this paradigm, different parts are grouped together according to their sizes and shapes. A representative shape is then derived for each group and used as the work-piece from which the parts in the group are machined. Once a new product is ordered, the optimal work-piece is selected by comparing the similarity of the new product with each representative shape. An effective NC tool-path is then generated to machine only the portions that differ between the work-piece and the ordered product, which saves machining time. Efficient machining conditions are also derived from this shape difference. Similarity comparison starts by determining the closest pose between the two shapes under consideration. The closest pose is derived by comparing the ray distances while one shape is virtually rotated with respect to the other. The shape similarity value and the overall similarity value calculated from the ray distances are used for grouping. A prototype system based on the proposed methodology has been implemented and applied to the grouping and machining of shoe lasts of various shapes and sizes.


Utilizing Case-based Reasoning for Consumer Choice Prediction based on the Similarity of Compared Alternative Sets

  • SEO, Sang Yun;KIM, Sang Duck;JO, Seong Chan
    • The Journal of Asian Finance, Economics and Business
    • /
    • Vol. 7, No. 2
    • /
    • pp.221-228
    • /
    • 2020
  • This study suggests an alternative to the conventional collaborative filtering method for predicting consumer choice, using case-based reasoning. The case-based reasoning algorithm determines the similarity between the alternative sets that each subject chooses. It uses the inverse of the normalized Euclidean distance as the similarity measurement. This normalized distance is calculated from the ratio of the difference between attribute levels to the maximum range between the lowest and highest level. The similarity-based approach predicts a target subject's choice by applying the utility values of the subjects most similar to the target subject to calculate the utility of the profiles that the target subject chooses. This approach assumes that subjects who deliberate over a similar alternative set may have similar preferences for each attribute level in decision making. The results show that the similarity between the comparable alternatives that consumers consider buying is a significant factor in predicting consumer choice. The interaction effect also has a positive influence on predictive accuracy. This implies that consumers who looked into the same alternatives are likely to choose the same product in the end. The suggested alternative requires fewer predictors than conjoint analysis for predicting customer choices.
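
The similarity measurement described in this abstract can be sketched as below. It is a minimal illustration, not the study's implementation: the abstract says only that the inverse of the normalized Euclidean distance is used, so the form 1 / (1 + d), chosen here to avoid division by zero for identical profiles, is an assumption, as are the attribute ranges, sample profiles, and function names.

```python
import math


def normalized_distance(x, y, ranges):
    """Euclidean distance after dividing each attribute difference by its range.

    `ranges` holds one (lowest, highest) pair per attribute.
    """
    total = 0.0
    for xi, yi, (low, high) in zip(x, y, ranges):
        span = high - low
        if span <= 0:
            raise ValueError("each attribute needs highest > lowest")
        total += ((xi - yi) / span) ** 2
    return math.sqrt(total)


def similarity(x, y, ranges):
    """Inverse-distance similarity; the 1 / (1 + d) form is an assumption."""
    return 1.0 / (1.0 + normalized_distance(x, y, ranges))


# Hypothetical attributes: price level (100 to 500) and quality level (1 to 5).
attribute_ranges = [(100, 500), (1, 5)]
target = (300, 3)

sim_same = similarity(target, (300, 3), attribute_ranges)
sim_price_only = similarity(target, (500, 3), attribute_ranges)
sim_both = similarity(target, (100, 5), attribute_ranges)
print(sim_same, sim_price_only, sim_both)
```

Dividing by the range puts attributes measured on different scales on a common footing, so a large price gap does not dominate a quality gap simply because prices are larger numbers.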