• Title/Abstract/Keywords: Cosine Similarity

The Design of Rescreening System for Social Network (소셜 네트워크 재검색 시스템의 설계)

  • Sim, Gyu Ri;Kim, Dong Hyun
    • Proceedings of the Korean Society of Computer Information Conference / Proceedings of the 66th Summer Conference of the Korean Society of Computer Information (2022), Vol. 30, No. 2 / pp.139-140 / 2022
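  • The social network service (SNS) market has recently grown rapidly, and the number of SNS users continues to increase. However, advertising posts have increased as well, which lowers the accuracy of hashtag-based search. To improve the accuracy and efficiency of SNS search, this study proposes a hashtag-based rescreening system for SNS. Applying the proposed system is expected to increase the accuracy and efficiency of SNS users' search activity.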

Clustering-based Statistical Machine Translation Using Syntactic Structure and Word Similarity (문장구조 유사도와 단어 유사도를 이용한 클러스터링 기반의 통계기계번역)

  • Kim, Han-Kyong;Na, Hwi-Dong;Li, Jin-Ji;Lee, Jong-Hyeok
    • Journal of KIISE:Software and Applications / Vol. 37, No. 4 / pp.297-304 / 2010
  • Clustering based on sentence type or document genre is a technique used to improve the translation quality of statistical machine translation (SMT) through domain-specific translation. However, no previous research has used sentence type and document genre information simultaneously. In this paper, we propose an integrated clustering method that classifies sentence type by syntactic structure similarity and document genre by word similarity. We interpolated domain-specific models built from the clusters with general models to improve the translation quality of the SMT system. A kernel function and the cosine measure were applied to calculate structural similarity and word similarity. With these similarities, we used a machine learning algorithm similar to K-means for clustering. On a Japanese-English patent translation corpus, we obtained a 2.5% point relative improvement in translation quality in the optimal case.

Personalized Document Summarization Using NMF and Clustering (군집과 비음수 행렬 분해를 이용한 개인화된 문서 요약)

  • Park, Sun
    • Journal of Advanced Navigation Technology / Vol. 13, No. 1 / pp.151-155 / 2009
  • We propose a new method that uses non-negative matrix factorization (NMF) and clustering to extract sentences for personalized document summarization. The proposed method applies clustering to the retrieved documents to extract sentences that reflect the topics and sub-topics of the documents well. In addition, it can extract sentences relevant to the query that reflect the user's interests well, by using the inherent semantic features of the documents obtained by NMF. The experimental results show that the proposed method achieves better performance than other methods that use similarity and NMF.

Market Approach to Valuation Based on Technology Transfer Cases in Korea

  • Kim, Sang-Gook;Lee, Hyun;Park, Hyun-Woo
    • Asian Journal of Innovation and Policy / Vol. 2, No. 1 / pp.97-122 / 2013
  • This study compiles comparable sales transaction data on technology transfers that correspond to active market conditions and, based on these cases, proposes a method for assessing the similarity of technologies with regard to the comparability of technology transfers. To analyze the association and similarity between a target technology and past sales transactions, it identifies the significant factors affecting royalty decisions and applies the cosine coefficient method by industry category. It also proposes a method for adjusting royalties which, unlike the conventional method, gives valuators clear standards for revising a royalty. It thus offers a solution to the difficulties that many valuators face in applying the market approach, and an objective method for enhancing the reliability of intangible asset values estimated by the market approach.

The Design of Technical Interview System for Computer Engineering based Similarity (유사도 기반 컴퓨터공학 기술 면접 시스템의 설계)

  • Dong Hyun Lee;Dong Hyun Kim
    • Proceedings of the Korean Society of Computer Information Conference / Proceedings of the 68th Summer Conference of the Korean Society of Computer Information (2023), Vol. 31, No. 2 / pp.351-352 / 2023
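  • When hiring developers in the computer engineering field, most companies conduct a computer engineering technical interview in addition to a general interview in order to assess competence in the major. To support interviewees preparing for such technical interviews, this paper proposes a system that uses cosine similarity to evaluate the accuracy of an interviewee's answers on core computer engineering concepts and then reports the result. Using the proposed system is expected to improve the accuracy of developers' technical interview answers on core computer engineering concepts.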
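
The abstract does not say how answers are turned into vectors, so the following is only a minimal sketch of the general idea: it assumes a plain bag-of-words representation and compares an interviewee's answer with a reference answer. The names `tokenize`, `cosine_similarity`, and `score_answer` are hypothetical and do not come from the paper.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty text shares nothing with the other one
    return dot / (norm_a * norm_b)


def score_answer(answer: str, reference: str) -> float:
    """Score an answer against a reference answer (assumed bag-of-words model)."""
    return cosine_similarity(Counter(tokenize(answer)), Counter(tokenize(reference)))
```

A score near 1 means the answer uses largely the same terms as the reference, and a score of 0 means no terms are shared. A real system would likely need language-specific tokenization, which this sketch leaves out.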

A Comparison between Factor Structure and Semantic Representation of Personality Test Items Using Latent Semantic Analysis (잠재의미분석을 활용한 성격검사문항의 의미표상과 요인구조의 비교)

  • Park, Sungjoon;Park, Heeyoung;Kim, Cheongtag
    • Korean Journal of Cognitive Science / Vol. 30, No. 3 / pp.133-156 / 2019
  • To investigate how participants understand personality test items, the semantic representations of the items were explored using Latent Semantic Analysis. This thesis proposes a Semantic Similarity Matrix, which contains the cosine similarities between the semantic representations of test items and personality traits. The matrix was compared with the traditional factor loading matrix. In a preliminary study, a semantic space was constructed from passages describing the five traits, collected from 154 undergraduate participants. In Study 1, a positive correlation was observed between the factor loading matrix of the Korean short-form BFI and its semantic similarity matrix. In Study 2, a short personality test was constructed from the semantic similarity matrix, and its factor loading matrix was also positively correlated with the semantic similarity matrix. In conclusion, the results imply that the factor structure of a personality test can be inferred from the semantic similarity between the items and the factors.
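
As a rough illustration of what one entry of such a matrix could look like, the standard cosine similarity can be written as follows. The symbols are assumptions introduced here, not the paper's notation: $\mathbf{q}_i$ stands for the vector of test item $i$ in the semantic space and $\mathbf{t}_j$ for the vector of personality trait $j$.

$$S_{ij} = \frac{\mathbf{q}_i \cdot \mathbf{t}_j}{\lVert \mathbf{q}_i \rVert \, \lVert \mathbf{t}_j \rVert}$$

Under this reading, row $i$ of $S$ can be compared with the loadings of item $i$ on the five factors.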

A Study of CBIR(Content-based Image Retrieval) Computer-aided Diagnosis System of Breast Ultrasound Images using Similarity Measures of Distance (거리 기반 유사도 측정을 통한 유방 초음파 영상의 내용 기반 검색 컴퓨터 보조 진단 시스템에 관한 연구)

  • Kim, Min-jeong;Cho, Hyun-chong
    • The Transactions of The Korean Institute of Electrical Engineers / Vol. 66, No. 8 / pp.1272-1277 / 2017
  • To assist radiologists in characterizing breast masses, computer-aided diagnosis (CADx) systems have been studied. A CADx system can improve the diagnostic accuracy of radiologists by providing objective information about breast masses. Morphological and texture features were extracted from breast ultrasound images. Based on the extracted features, the CADx system retrieves masses similar to a query mass from a reference library using a k-nearest neighbor (k-NN) approach. Eight distance-based similarity measures were evaluated by receiver operating characteristic (ROC) analysis: Euclidean and Chebyshev (Minkowski family); Canberra and Lorentzian ($F_2$ family); Wave Hedges and Motyka (Intersection family); and Cosine and Dice (Inner Product family). The Inner Product family measures used with the k-NN classifier provided slightly higher performance in classifying malignant and benign masses than the Minkowski, $F_2$, and Intersection family measures.
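
The retrieval step described above can be sketched with a pluggable distance function. This is a simplified illustration, not the authors' implementation: it shows four of the eight measures in their textbook form, and the function names and the `(features, label)` library layout are assumptions.

```python
import math
from collections import Counter


def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def chebyshev(x, y):
    return max(abs(a - b) for a, b in zip(x, y))


def canberra(x, y):
    # Coordinates where both values are zero are skipped to avoid 0/0.
    return sum(abs(a - b) / (abs(a) + abs(b)) for a, b in zip(x, y) if a or b)


def cosine_distance(x, y):
    dot = sum(a * b for a, b in zip(x, y))
    norms = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return 1.0 - dot / norms if norms else 1.0


def retrieve(query, library, k, distance):
    """Return the k library entries (features, label) closest to the query."""
    return sorted(library, key=lambda item: distance(query, item[0]))[:k]


def classify(query, library, k, distance):
    """Majority label among the k retrieved entries."""
    labels = [label for _, label in retrieve(query, library, k, distance)]
    return Counter(labels).most_common(1)[0][0]
```

Passing a different `distance` function to `retrieve` changes which reference masses are returned. That makes it possible to compare similarity measures under the same k-NN procedure.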

An Effective Metric for Measuring the Degree of Web Page Changes (효과적인 웹 문서 변경도 측정 방법)

  • Kwon, Shin-Young;Kim, Sung-Jin;Lee, Sang-Ho
    • Journal of KIISE:Databases / Vol. 34, No. 5 / pp.437-447 / 2007
  • A variety of similarity metrics have been used to measure the degree of web page changes. In this paper, we first define criteria for web page changes in order to evaluate the effectiveness of similarity metrics in terms of six important types of web page changes. Second, we propose a new similarity metric appropriate for measuring the degree of web page changes. Using real web pages and synthesized pages, we analyze five existing metrics (byte-wise comparison, TF-IDF cosine distance, word distance, edit distance, and shingling) and our own metric under the proposed criteria. The analysis shows that our metric represents the changes more effectively than the other metrics. We expect that our study can help users select an appropriate metric for particular web applications.
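
Two of the existing metrics named above, shingling and edit distance, can be sketched in their common textbook forms. The paper's exact definitions are not given in the abstract, so the word-level granularity, the default shingle width `w=3`, and the function names are all assumptions.

```python
def shingles(text: str, w: int = 3) -> set[tuple[str, ...]]:
    """Set of all runs of w consecutive words in the text."""
    words = text.split()
    if len(words) < w:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + w]) for i in range(len(words) - w + 1)}


def shingling_change(old: str, new: str, w: int = 3) -> float:
    """1 minus the Jaccard resemblance of the shingle sets (0 = unchanged)."""
    a, b = shingles(old, w), shingles(new, w)
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


def edit_distance(old: str, new: str) -> int:
    """Word-level Levenshtein distance, computed with a rolling row."""
    a, b = old.split(), new.split()
    prev = list(range(len(b) + 1))
    for i, word_a in enumerate(a, start=1):
        curr = [i]
        for j, word_b in enumerate(b, start=1):
            cost = 0 if word_a == word_b else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]
```

The two measures can react differently to the same edit: replacing one word costs a single edit operation but alters every shingle that overlaps that word.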

Similarity Measurement Between Titles and Abstracts Using Bijection Mapping and Phi-Correlation Coefficient

  • John N. Mlyahilu;Jong-Nam Kim
    • Journal of the Institute of Convergence Signal Processing / Vol. 23, No. 3 / pp.143-149 / 2022
  • This paper presents a quantitative measure of the relationship between a research title and its abstract, using articles from various journals indexed in the Korean Citation Index (KCI) database. We propose a machine learning-based similarity metric that does not assume normality of the dataset and that addresses the imbalanced dataset problem and the zero-variance problem affecting most rule-based algorithms. The advantage of this algorithm is that it eliminates the limitations of the Pearson correlation coefficient (r) and also solves the imbalanced dataset problem. A total of 107 journal articles collected from the database were used to build a corpus containing the authors, year of publication, title, and abstract of each article. In the experiments, the proposed algorithm achieved higher correlation coefficient values than cosine similarity, Euclidean distance, and the Pearson correlation coefficient, reaching a maximum correlation of 1, whereas the others returned not-a-number (NaN) values in some experiments. From these results, we conclude that an effective title must have a high correlation coefficient with its abstract.

Aircraft Motion Identification Using Sub-Aperture SAR Image Analysis and Deep Learning

  • Doyoung Lee;Duk-jin Kim;Hwisong Kim;Juyoung Song;Junwoo Kim
    • Korean Journal of Remote Sensing / Vol. 40, No. 2 / pp.167-177 / 2024
  • With advancements in satellite technology, interest in target detection and identification is increasing both quantitatively and qualitatively. Synthetic Aperture Radar (SAR) images, which can be acquired regardless of weather conditions, have been applied in various areas in combination with machine learning-based detection algorithms. However, conventional studies have focused primarily on the detection of stationary targets. In this study, we propose a method for identifying moving targets using an algorithm that integrates sub-aperture SAR images and cosine similarity calculations. Using a transformer-based deep learning target detection model, we extracted the bounding box of each target, designated the area as a region of interest (ROI), estimated the similarity between sub-aperture SAR images, and determined movement based on a predefined similarity threshold. In the quantitative evaluation, the proposed algorithm identified targets more accurately than a model trained with the targets labeled as two different classes. This indicates that our approach maintains accuracy while reliably discerning whether a target is in motion.
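
The thresholding step can be illustrated with a small sketch. It rests on several assumptions that the abstract does not confirm: ROI patches are compared as flattened intensity vectors, a target is flagged as moving when the similarity falls below the threshold, and the default of 0.9 is a placeholder rather than the paper's value. The function names are hypothetical.

```python
import math


def patch_cosine(patch_a, patch_b):
    """Cosine similarity of two equally sized 2-D patches, flattened row by row."""
    a = [v for row in patch_a for v in row]
    b = [v for row in patch_b for v in row]
    if len(a) != len(b):
        raise ValueError("patches must have the same shape")
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0


def is_moving(patch_a, patch_b, threshold=0.9):
    """Flag the target as moving when the two ROI patches are dissimilar.

    Assumption: low similarity between sub-aperture patches indicates motion.
    """
    return patch_cosine(patch_a, patch_b) < threshold
```

A stationary target should look nearly identical in both sub-aperture images, so its similarity stays close to 1, while a moving target is displaced or smeared between them.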