• Title/Abstract/Keywords: similarity

Search results: 8,140 items (processing time: 0.033 s)

A Sampling-based Algorithm for Top-${\kappa}$ Similarity Joins

  • 박종수
    • 한국정보과학회논문지:데이타베이스
    • /
    • Vol. 41, No. 4
    • /
    • pp.256-261
    • /
    • 2014
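  • The top-${\kappa}$ similarity join problem is to find the ${\kappa}$ most similar record pairs from two input record sets. We propose an efficient algorithm that returns the top-${\kappa}$ similarity join pairs using a sampling technique. A histogram of set similarity joins is built from a sample of the input records, and an estimated similarity threshold for the top-${\kappa}$ join pairs is computed by statistical inference within the margin of error of a 95% confidence interval. This estimated threshold is then applied to a general similarity join algorithm that uses a min-heap structure to obtain the top-${\kappa}$ similarity joins. Experimental results on large real-world datasets show the good performance of the proposed algorithm.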
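The min-heap step described in this abstract can be sketched as follows. This is a minimal illustration and not the paper's algorithm: it assumes Jaccard similarity over token sets, compares all pairs by brute force, and takes the estimated bound as a plain `threshold` argument instead of deriving it from a sample. The names `jaccard` and `top_k_join` are hypothetical.

```python
import heapq
from itertools import product


def jaccard(a: frozenset, b: frozenset) -> float:
    """Jaccard similarity |a & b| / |a | b| (0.0 when both sets are empty)."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def top_k_join(R, S, k, threshold=0.0):
    """Return up to k (similarity, i, j) tuples, most similar first.

    `threshold` stands in for an estimated similarity bound (an assumption
    of this sketch): pairs scoring below it never touch the heap.
    """
    heap = []  # min-heap; heap[0] is the weakest pair currently kept
    for (i, r), (j, s) in product(enumerate(R), enumerate(S)):
        sim = jaccard(r, s)
        if sim < threshold:
            continue  # pruned by the estimated bound
        if len(heap) < k:
            heapq.heappush(heap, (sim, i, j))
        elif sim > heap[0][0]:
            # Better than the weakest kept pair: swap it in.
            heapq.heapreplace(heap, (sim, i, j))
    return sorted(heap, reverse=True)


if __name__ == "__main__":
    R = [frozenset("abc"), frozenset("xyz")]
    S = [frozenset("abd"), frozenset("xyz"), frozenset("q")]
    print(top_k_join(R, S, k=2))
```

If the supplied threshold is set too high, fewer than k pairs survive the pruning, which is why an estimate with a known margin of error matters in this setting.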

Influence of Product Similarity between Parent Brand and Extended Brand on Extended Product Evaluation - Focus on Franchise Brand -

  • 김기석;신봉섭
    • 한국콘텐츠학회논문지
    • /
    • Vol. 11, No. 11
    • /
    • pp.378-389
    • /
    • 2011
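  • This study examines differences in similarity between parent-brand products and extended-brand products when a franchise brand extends into various product categories, and investigates how the similarity between each extended product and the parent-brand product affects attitudes toward the extended brand. The results show that product similarity differed across extended products. In addition, a comparison of cognitive and behavioral attitudes toward high-similarity and low-similarity products showed that high-similarity products scored higher on both cognitive and behavioral attitudes. Moreover, similarity in food attributes had a greater effect on attitudes than technical similarity. These findings offer important implications for the brand extension strategies of franchise brands.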

Learning Discriminative Fisher Kernel for Image Retrieval

  • Wang, Bin;Li, Xiong;Liu, Yuncai
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 7, No. 3
    • /
    • pp.522-538
    • /
    • 2013
  • Content-based image retrieval has become an increasingly important research topic because of its wide range of applications. It is highly challenging for large-scale databases with large variance. Retrieval systems rely on a key component: predefined or learned similarity measures over images. We note that these similarity measures can potentially be improved if information about the data distribution is exploited in a more sophisticated way. In this paper, we propose a similarity measure learning approach for image retrieval. The similarity measure, the so-called Fisher kernel, is derived from the probabilistic distribution of images and is a function of the observed data, hidden variables, and model parameters, where the hidden variables encode high-level information that is powerful for discrimination and that previous methods failed to exploit. We further propose a discriminative learning method for the similarity measure, i.e., encouraging the learned similarity to take a large value for a pair of images with the same label and a small value for a pair of images with distinct labels. The learned similarity measure, which fully exploits the data distribution, is well adapted to the dataset and is expected to improve the retrieval system. We evaluate the proposed method on the Corel-1000, Corel5k, Caltech101, and MIRFlickr 25,000 databases. The results show the competitive performance of the proposed method.
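For reference, the standard Fisher kernel from the literature can be written as below. This is the textbook form and only a sketch: the abstract does not give the authors' exact formulation, which additionally involves hidden variables, and the symbols $\theta$ (model parameters), $g_x$ (Fisher score), and $F$ (Fisher information matrix) are notation assumed here.

```latex
g_x = \nabla_\theta \log p(x \mid \theta), \qquad
F = \mathbb{E}_{x \sim p(\cdot \mid \theta)}\!\left[ g_x g_x^{\top} \right], \qquad
K(x, y) = g_x^{\top} F^{-1} g_y
```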

Retrieval of Scholarly Articles with Similar Core Contents

  • Liu, Rey-Long
    • International Journal of Knowledge Content Development & Technology
    • /
    • Vol. 7, No. 3
    • /
    • pp.5-27
    • /
    • 2017
  • Retrieval of scholarly articles about a specific research issue is a routine job of researchers to cross-validate the evidence about the issue. Two articles that focus on a research issue should share similar terms in their core contents, including their goals, backgrounds, and conclusions. In this paper, we present a technique, CCSE (Core Content Similarity Estimation), that, given an article a, recommends those articles that share similar core content terms with a. CCSE works on titles and abstracts of articles, which are publicly available. It estimates and integrates three kinds of similarity: goal similarity, background similarity, and conclusion similarity. Empirical evaluation shows that CCSE performs significantly better than several state-of-the-art techniques in recommending those biomedical articles that are judged (by domain experts) to be the ones whose core contents focus on the same research issues. CCSE works for those articles that present research background followed by main results and discussion, and hence it may be used to support the identification of the closely related evidence already published in these articles, even when only titles and abstracts of the articles are available.

Comparison of Code Similarity Analysis Performance of funcGNN and Siamese Network

  • 최동빈;조인수;박용범
    • 반도체디스플레이기술학회지
    • /
    • Vol. 20, No. 3
    • /
    • pp.113-116
    • /
    • 2021
  • As artificial intelligence technologies, including deep learning, develop, they are being introduced to code similarity analysis. The traditional analysis method converts source code into a control flow graph (CFG) and then calculates the graph edit distance (GED); building on this, some studies calculate the GED through a graph neural network (GNN) trained on the converted CFGs, and methods that analyze code similarity through a CNN by rendering the CFG as an image are also being studied. In this paper, to determine which approach will be effective and efficient for future research on AI-based code similarity analysis, code similarity is measured with funcGNN, which measures code similarity using a GNN, and with a Siamese Network, which is an image similarity analysis model, and their accuracy is compared and analyzed. The analysis shows that the error rate of the Siamese Network (0.0458) was higher than that of funcGNN (0.0362).

Similarity measurement based on Min-Hash for Preserving Privacy

  • Cha, Hyun-Jong;Yang, Ho-Kyung;Song, You-Jin
    • International Journal of Advanced Culture Technology
    • /
    • Vol. 10, No. 2
    • /
    • pp.240-245
    • /
    • 2022
  • Because of the importance of information, encryption algorithms are heavily used. Encrypted raw data is secure, but problems arise when the decryption key is exposed. In particular, large-scale Internet sites such as Facebook and Amazon suffer serious damage when user data is exposed. Recently, research into a new fourth-generation encryption technology that can protect user-related data without the use of a key required for encryption has been attracting attention, as has data clustering technology using encryption. In this paper, we try to reduce key exposure by using homomorphic encryption. In addition, we want to maintain privacy through similarity measurement. Previously studied methods of measuring similarity over the whole data become time-consuming and expensive as the size and scope of the data increase. Min-Hash has therefore been studied as a way to efficiently estimate the similarity between two sets from their signatures. Min-Hash is widely used for plagiarism detection, graph and image analysis, and genetic analysis. This paper therefore addresses privacy using homomorphic encryption and presents a model for efficient similarity measurement using Min-Hash.
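The Min-Hash estimate mentioned in this abstract can be sketched as below. This is a generic illustration under stated assumptions, not the paper's model: it covers only the signature-based Jaccard estimate (the homomorphic encryption part is not shown), it simulates a family of hash functions by salting BLAKE2b with a seed, and it assumes non-empty token sets. The names `minhash_signature` and `estimate_jaccard` are hypothetical.

```python
import hashlib


def _hash(token: str, seed: int) -> int:
    """Deterministic 64-bit hash of `token` under hash function number `seed`."""
    digest = hashlib.blake2b(
        token.encode("utf-8"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "little")


def minhash_signature(tokens, num_hashes=128):
    """Signature = minimum hash value of the set under each hash function."""
    return [min(_hash(t, seed) for t in tokens) for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    """Fraction of signature positions that agree estimates Jaccard similarity."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


if __name__ == "__main__":
    a = {f"t{i}" for i in range(100)}
    b = {f"t{i}" for i in range(50, 150)}  # true Jaccard = 50 / 150
    est = estimate_jaccard(minhash_signature(a), minhash_signature(b))
    print(f"estimated Jaccard: {est:.3f}")
```

Comparing two fixed-length signatures costs time proportional to the number of hash functions, independent of the original set sizes, which is the efficiency gain the abstract refers to.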

A Study on Detecting Changes in Injection Molding Process through Similarity Analysis of Mold Vibration Signal Patterns

  • 김종선
    • Design & Manufacturing
    • /
    • Vol. 17, No. 3
    • /
    • pp.34-40
    • /
    • 2023
  • In this study, real-time collection of mold vibration signals during injection molding processes was achieved through IoT devices installed on the mold surface. To analyze changes in the collected vibration signals, injection molding was performed under six different process conditions. Analysis of the mold vibration signals according to process conditions revealed distinct trends and patterns. Based on this result, cosine similarity was applied to compare pattern changes in the mold vibration signals. The similarity in time and acceleration vector space between the collected data was analyzed. The results showed that under identical conditions for all six process settings, the cosine similarity remained around 0.92±0.07. However, when different process conditions were applied, the cosine similarity decreased to the range of 0.47±0.07. Based on these results, a cosine similarity threshold of 0.60~0.70 was established. When applied to the analysis of mold vibration signals, it was possible to determine whether the molding process was stable or whether variations had occurred due to changes in process conditions. This establishes the potential use of cosine similarity based on mold vibration signals in future applications for real-time monitoring of molding process changes and anomaly detection.
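The threshold check described in this abstract can be sketched as follows. This is a minimal sketch under assumptions: each vibration pattern is taken to be an equal-length list of acceleration samples, and the default threshold of 0.65 is simply the midpoint of the reported 0.60~0.70 range, not a value stated by the author. The names `cosine_similarity` and `process_changed` are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def process_changed(reference, current, threshold=0.65):
    """Flag a change when similarity to the reference pattern drops below threshold.

    threshold=0.65 is an assumed midpoint of the 0.60~0.70 range.
    """
    return cosine_similarity(reference, current) < threshold


if __name__ == "__main__":
    reference = [0.1, 0.8, -0.5, 0.3]
    same_shape = [0.2, 1.6, -1.0, 0.6]   # scaled copy of the reference
    different = [0.8, -0.1, 0.3, 0.5]
    print(process_changed(reference, same_shape))
    print(process_changed(reference, different))
```

Because cosine similarity ignores overall amplitude, a scaled copy of the reference pattern still counts as stable; only a change in the shape of the signal lowers the score.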

A study on Similarity analysis of National R&D Programs using R&D Project's technical classification

  • 김주호;김영자;김종배
    • 디지털콘텐츠학회 논문지
    • /
    • Vol. 13, No. 3
    • /
    • pp.317-324
    • /
    • 2012
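  • Recently, the importance of coordinating similar and overlapping programs has been emphasized with the goal of improving R&D investment efficiency. However, existing similarity search methods that compare the text of projects or budget requests are limited in deriving meaningful similarity because of variations in content quality. To overcome the limitations of similarity search based on text keyword extraction, this study uses the technical classification of projects when analyzing similarity between programs. The National Standard Classification of Science and Technology codes of the projects collected during the survey and analysis of national R&D programs were extracted to build a characteristic vector model for each program, and the similarity between programs was then measured using cosine-based and Euclidean-distance-based algorithms. The efficiency of the approach was verified by comparison with similarity results obtained by the existing keyword extraction method.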
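The vector-based comparison described in this abstract can be sketched as follows. This is an illustration under assumptions, not the authors' model: each program is represented by the share of its projects falling under each classification code, and the codes `A1`, `B2`, `C3` are made-up placeholders, not real classification codes. The names `program_vector`, `cosine`, and `euclidean` are hypothetical.

```python
import math
from collections import Counter


def program_vector(project_codes, vocabulary):
    """Share of a program's projects under each classification code."""
    if not project_codes:
        raise ValueError("a program needs at least one classified project")
    counts = Counter(project_codes)
    total = len(project_codes)
    return [counts[code] / total for code in vocabulary]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def euclidean(u, v):
    """Euclidean distance (smaller means more similar)."""
    return math.dist(u, v)


if __name__ == "__main__":
    vocabulary = ["A1", "B2", "C3"]          # placeholder codes
    prog_x = program_vector(["A1", "A1", "B2"], vocabulary)
    prog_y = program_vector(["A1", "B2", "B2"], vocabulary)
    prog_z = program_vector(["C3"], vocabulary)
    print(cosine(prog_x, prog_y), euclidean(prog_x, prog_y))
    print(cosine(prog_x, prog_z), euclidean(prog_x, prog_z))
```

Note that the two measures point in opposite directions: a higher cosine value and a lower Euclidean distance both indicate more similar programs.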

Performance Analysis of Similarity Reflecting Jaccard Index for Solving Data Sparsity in Collaborative Filtering

  • 이수정
    • 컴퓨터교육학회논문지
    • /
    • Vol. 19, No. 4
    • /
    • pp.59-66
    • /
    • 2016
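  • In collaborative filtering systems, methods that reflect the number of co-rated items have been studied to address the data sparsity problem. The Jaccard index, a well-known method of this kind, has been combined with existing similarity measures to improve performance. However, there has been little analysis of the performance improvement obtained when it is combined with different similarity measures in various data environments, so this study aims to provide such an analysis. First, when the Jaccard index itself was used as a similarity measure, it showed far better prediction performance than traditional measures on sparse datasets, and its recommendation performance was also very good. Combining the Jaccard index generally improved the performance of existing similarity measures regardless of data characteristics; in particular, cosine similarity achieved the largest improvement on sparse datasets, whereas the Mean Squared Difference similarity showed degraded prediction performance on dense datasets. Therefore, the characteristics of the data environment and the similarity measure need to be considered when using the Jaccard index in combination.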
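One way to combine the Jaccard index with an existing measure can be sketched as below. The abstract does not say how the combination is formed, so multiplying the two values is an assumption of this sketch, as are the rating dictionaries (item to rating) and the names `jaccard_index`, `cosine_on_common`, and `jaccard_weighted_cosine`.

```python
import math


def jaccard_index(ratings_u: dict, ratings_v: dict) -> float:
    """Co-rated items divided by all items rated by either user."""
    items_u, items_v = set(ratings_u), set(ratings_v)
    union = items_u | items_v
    return len(items_u & items_v) / len(union) if union else 0.0


def cosine_on_common(ratings_u: dict, ratings_v: dict) -> float:
    """Cosine similarity computed over co-rated items only."""
    common = set(ratings_u) & set(ratings_v)
    if not common:
        return 0.0
    dot = sum(ratings_u[i] * ratings_v[i] for i in common)
    norm_u = math.sqrt(sum(ratings_u[i] ** 2 for i in common))
    norm_v = math.sqrt(sum(ratings_v[i] ** 2 for i in common))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def jaccard_weighted_cosine(ratings_u: dict, ratings_v: dict) -> float:
    """Assumed combination: scale cosine by the share of co-rated items."""
    return jaccard_index(ratings_u, ratings_v) * cosine_on_common(ratings_u, ratings_v)


if __name__ == "__main__":
    alice = {"a": 5, "b": 3, "c": 4, "d": 2}
    bob = {"a": 4}                      # a single co-rated item
    print(cosine_on_common(alice, bob))         # looks perfect on one item
    print(jaccard_weighted_cosine(alice, bob))  # discounted for low overlap
```

The example shows the motivation on sparse data: with a single co-rated item, plain cosine reports a perfect match, while the Jaccard weight discounts it in proportion to how few items the two users share.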

The Cultural Similarity Effects on the Industry of Medical Tourism

  • 장준;이훈영
    • 산경연구논집
    • /
    • Vol. 9, No. 1
    • /
    • pp.67-76
    • /
    • 2018
  • Purpose - With worldwide population aging and the development of globalization, customers increasingly seek affordable, higher-quality medical services overseas. This new trend has urged some destination countries to improve their services to gain competitive advantages over other countries. The literature indicates that medical quality and cost may be the key factors influencing global patients' decisions. In the international environment, however, medical tourism destinations are also selected on the basis of cultural similarity between the host country and the customers' own country. The more similarity foreign patients perceive between the two countries, the more likely they are to choose the country under consideration as their destination for medical tourism. However, little research has been conducted on this topic. Thus, we empirically investigate how cultural similarity influences Chinese medical customers' choice of destination. We also consider factors related to medical competency and travel attributes that might affect customers' decisions, along with the moderating role of disease types. Research design, data, and methodology - We propose a research model to confirm the relations among cultural similarity, medical competency, travel attractiveness, disease types, and destination choice. The questionnaire survey was conducted in the more economically developed regions of China, such as Beijing, Shanghai, and Jiangsu. Conditional logit regression is applied to analyze the data (n = 881). Results - The results indicate that cultural similarity is an important predictor of Chinese customers' decisions when selecting a medical tourism destination. However, the effects of cultural similarity vary according to disease type. We also find that medical competency and travel attractiveness influence their decisions, with disease type playing a moderating role.
Conclusions - Cultural similarity is an important factor that influences Chinese potential medical tourists' decisions to select a destination. Marketing managers should consider the effects of cultural similarity when developing strategies for attracting Chinese medical tourists. Since medical competency and travel attractiveness remain critical elements in their evaluation of destination countries, it is necessary to continuously improve medical service quality and facilities. The results also suggest that medical managers should sharpen their marketing strategies by segmenting potential Chinese customers by disease type.