• Title/Summary/Keyword: Cosine Similarity Analysis

A Study of CBIR(Content-based Image Retrieval) Computer-aided Diagnosis System of Breast Ultrasound Images using Similarity Measures of Distance (거리 기반 유사도 측정을 통한 유방 초음파 영상의 내용 기반 검색 컴퓨터 보조 진단 시스템에 관한 연구)

  • Kim, Min-jeong;Cho, Hyun-chong
    • The Transactions of The Korean Institute of Electrical Engineers / v.66 no.8 / pp.1272-1277 / 2017
  • Computer-aided diagnosis (CADx) systems have been studied to assist radiologists in characterizing breast masses. A CADx system can improve radiologists' diagnostic accuracy by providing objective information about breast masses. In this study, morphological and texture features were extracted from breast ultrasound images. Based on the extracted features, the CADx system retrieves masses similar to a query mass from a reference library using a k-nearest neighbor (k-NN) approach. Eight distance-based similarity measures are evaluated by receiver operating characteristic (ROC) analysis: Euclidean and Chebyshev (Minkowski family); Canberra and Lorentzian ($F_2$ family); Wave Hedges and Motyka (Intersection family); and Cosine and Dice (Inner Product family). With the k-NN classifier, the Inner Product family measures classified malignant and benign masses slightly better than the Minkowski, $F_2$, and Intersection family measures.
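
The retrieval step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it covers only four of the eight measures (Euclidean, Chebyshev, Canberra, and cosine distance), the two-dimensional feature vectors and labels are made up, and the function names `retrieve_knn` and `malignancy_score` are hypothetical. Scoring a query by the fraction of malignant neighbors is also an assumption.

```python
import math

def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def chebyshev(x, y):
    return max(abs(a - b) for a, b in zip(x, y))

def canberra(x, y):
    # Terms whose denominator is zero are skipped (0/0 is treated as 0).
    return sum(abs(a - b) / (abs(a) + abs(b))
               for a, b in zip(x, y) if abs(a) + abs(b) > 0)

def cosine_distance(x, y):
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return 1.0 - dot / norm if norm else 1.0

def retrieve_knn(query, library, distance, k=3):
    """Return the k (features, label) entries closest to the query."""
    return sorted(library, key=lambda item: distance(query, item[0]))[:k]

def malignancy_score(query, library, distance, k=3):
    """Fraction of the k retrieved masses labeled malignant (label 1)."""
    neighbors = retrieve_knn(query, library, distance, k)
    return sum(label for _, label in neighbors) / k

# Made-up reference library: (feature vector, label), 1 = malignant, 0 = benign.
library = [
    ([0.9, 0.8], 1),
    ([0.8, 0.9], 1),
    ([0.1, 0.2], 0),
    ([0.2, 0.1], 0),
    ([0.7, 0.7], 1),
]
query = [0.85, 0.75]

for name, fn in [("euclidean", euclidean), ("chebyshev", chebyshev),
                 ("canberra", canberra), ("cosine", cosine_distance)]:
    print(name, malignancy_score(query, library, fn))
```

Sweeping a decision threshold over such scores is one way to obtain the ROC curves the abstract mentions; the remaining measures would plug into `retrieve_knn` in the same way.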

An Effective Metric for Measuring the Degree of Web Page Changes (효과적인 웹 문서 변경도 측정 방법)

  • Kwon, Shin-Young;Kim, Sung-Jin;Lee, Sang-Ho
    • Journal of KIISE:Databases / v.34 no.5 / pp.437-447 / 2007
  • A variety of similarity metrics have been used to measure the degree of web page changes. In this paper, we first define criteria, based on six important types of web page changes, for evaluating the effectiveness of similarity metrics. Second, we propose a new similarity metric suited to measuring the degree of web page changes. Using real and synthesized web pages, we analyze five existing metrics (byte-wise comparison, TF-IDF cosine distance, word distance, edit distance, and shingling) and ours under the proposed criteria. The analysis shows that our metric represents the changes more effectively than the other metrics. We expect this study to help users select an appropriate metric for particular web applications.
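
Of the existing metrics listed, shingling is perhaps the simplest to illustrate. The sketch below assumes word-level shingles of width `w` and measures change as one minus the Jaccard resemblance of the two shingle sets. The abstract does not describe the proposed new metric, so it is not shown; the function names and the default `w=3` are hypothetical choices.

```python
def shingles(text, w=3):
    """Set of contiguous w-word sequences (shingles) in the text."""
    words = text.split()
    if len(words) < w:
        # Short texts: treat the whole text as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + w]) for i in range(len(words) - w + 1)}

def resemblance(old, new, w=3):
    """Jaccard resemblance of the two shingle sets, in [0, 1]."""
    a, b = shingles(old, w), shingles(new, w)
    if not a and not b:
        return 1.0  # two empty pages are considered identical
    return len(a & b) / len(a | b)

def change_degree(old, new, w=3):
    """Degree of change between two page versions: 1 - resemblance."""
    return 1.0 - resemblance(old, new, w)

old_page = "a b c d e"
new_page = "a b c d f"
print(change_degree(old_page, new_page))
```

Here the two versions share two of four distinct shingles, so the degree of change is 0.5.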

Aircraft Motion Identification Using Sub-Aperture SAR Image Analysis and Deep Learning

  • Doyoung Lee;Duk-jin Kim;Hwisong Kim;Juyoung Song;Junwoo Kim
    • Korean Journal of Remote Sensing / v.40 no.2 / pp.167-177 / 2024
  • With advancements in satellite technology, interest in target detection and identification is increasing both quantitatively and qualitatively. Synthetic Aperture Radar (SAR) images, which can be acquired regardless of weather conditions, have been applied in various areas in combination with machine-learning-based detection algorithms. However, conventional studies have focused primarily on detecting stationary targets. In this study, we propose a method for identifying moving targets using an algorithm that integrates sub-aperture SAR images and cosine similarity calculations. Using a transformer-based deep learning target detection model, we extracted the bounding box of each target, designated that area as a region of interest (ROI), estimated the similarity between sub-aperture SAR images, and determined movement based on a predefined similarity threshold. In quantitative evaluation, the proposed algorithm identified targets more accurately than training the model with the targets split into two different classes. This result indicates that our approach maintains accuracy while reliably discerning whether a target is in motion.
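
The thresholding step can be sketched as below, assuming the detector has already produced a bounding box. Several details are assumptions rather than facts from the abstract: images are plain 2D lists of intensities, the box is `(top, left, bottom, right)`, the threshold value 0.9 is arbitrary, and a target is flagged as moving when the ROI similarity between two sub-aperture images falls below the threshold.

```python
import math

def crop(image, box):
    """Cut the ROI out of a 2D image; box = (top, left, bottom, right)."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]

def flatten(patch):
    return [float(v) for row in patch for v in row]

def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def is_moving(sub_a, sub_b, box, threshold=0.9):
    """Return (moving?, similarity) for one ROI in two sub-aperture images.

    Assumption: a moving target appears displaced between sub-apertures,
    so its ROI similarity is low; the threshold value is hypothetical.
    """
    sim = cosine_similarity(flatten(crop(sub_a, box)),
                            flatten(crop(sub_b, box)))
    return sim < threshold, sim

# Toy 4x4 "sub-aperture images": the bright pixel shifts in the second pair.
static_a = [[0, 0, 0, 0], [0, 9, 1, 0], [0, 1, 0, 0], [0, 0, 0, 0]]
static_b = [[0, 0, 0, 0], [0, 9, 1, 0], [0, 1, 0, 0], [0, 0, 0, 0]]
shifted = [[0, 0, 0, 0], [0, 0, 1, 0], [0, 1, 9, 0], [0, 0, 0, 0]]
roi = (0, 0, 4, 4)

print(is_moving(static_a, static_b, roi))
print(is_moving(static_a, shifted, roi))
```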

Query Extension of Retrieve System Using Hangul Word Embedding and Apriori (한글 워드임베딩과 아프리오리를 이용한 검색 시스템의 질의어 확장)

  • Shin, Dong-Ha;Kim, Chang-Bok
    • Journal of Advanced Navigation Technology / v.20 no.6 / pp.617-624 / 2016
  • Hangul word embedding must be preceded by a noun-extraction step. Otherwise, unnecessary words are trained, and efficient embedding results cannot be obtained. In this paper, we propose a model that retrieves answers more efficiently through query expansion using Hangul word embedding, Apriori, and text mining. In the word embedding and Apriori step, the query is expanded by extracting words associated with it by meaning and context. In the Hangul text mining step, similar answers are extracted and returned to the user using noun extraction, TF-IDF, and cosine similarity. The proposed model can improve answer accuracy by learning the answers of a specific domain and expanding the query with highly correlated terms. Future research should extract more correlated query terms by analyzing user queries stored in a database.
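
The final matching step (TF-IDF weighting followed by cosine similarity) might look like the sketch below. It assumes noun extraction and query expansion have already happened, so both the query and the candidate answers arrive as lists of terms. The smoothed IDF formula, the English placeholder terms, and the function names are all assumptions for illustration.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) for a list of term lists."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF (an assumed variant): log((1 + n) / (1 + df)) + 1
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in Counter(doc).items()}
               for doc in docs]
    return vectors, idf

def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def best_answer(query_terms, answers):
    """Return (index, score) of the answer most similar to the query."""
    vectors, idf = tfidf_vectors(answers)
    # Query terms unseen in the answer corpus carry no weight.
    q = {t: tf * idf[t] for t, tf in Counter(query_terms).items() if t in idf}
    scores = [cosine(q, v) for v in vectors]
    best = max(range(len(answers)), key=scores.__getitem__)
    return best, scores[best]

# Placeholder term lists standing in for extracted nouns.
answers = [
    ["gps", "signal", "error"],
    ["radar", "antenna", "range"],
    ["gps", "receiver", "antenna"],
]
print(best_answer(["radar", "range"], answers))
```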

The big data analysis framework of information security policy based on security incidents

  • Jeong, Seong Hoon;Kim, Huy Kang;Woo, Jiyoung
    • Journal of the Korea Society of Computer and Information / v.22 no.10 / pp.73-81 / 2017
  • In this paper, we propose an analysis framework that captures trends in information security incidents and evaluates security policy based on incident analysis. We build a big data set by collecting security incident news and policy news from news media, identify key trends in information security from it, and present an analytical method for evaluating policies from the point of view of incidents. More specifically, we propose a network-based analysis model that makes the trends of information security incidents and policies easy to identify at a glance, and a cosine similarity measure for finding important events among incidents and policy announcements.

A Tracking Method of Same Drug Sales Accounts through Similarity Analysis of Instagram Profiles and Posts

  • Eun-Young Park;Jiyeon Kim;Chang-Hoon Kim
    • Journal of the Korea Society of Computer and Information / v.29 no.2 / pp.109-118 / 2024
  • As the number of social media users grows worldwide, social media is increasingly being abused to perpetrate various crimes. In particular, drug distribution through social media is emerging as a serious social problem. Clever marketing on social media channels stimulates teenagers' curiosity about drugs. Further, social media facilitates drug purchases because drug sellers and consumers can easily reach each other. Among the various social media platforms, we focused on Instagram, the platform most used by young adults aged 19 to 24 in South Korea. We collected four types of information from Instagram (profile photos, introductions, image posts, and text posts) and analyzed the similarity within each type. Profile photos and image posts were compared using the SSIM (Structural Similarity Index Measure), while introductions and text posts were compared using Jaccard and cosine similarity. Through this analysis, the similarity among accounts was measured for each information type, and accounts whose similarity exceeded the significance level were judged to be the same drug sales account. Logistic regression analysis on these information types confirmed that profile photos, introductions, and text posts, but not image posts, were valid information for tracking the same drug sales account.

A Comparative Study using Bibliometric Analysis Method on the Reformed Theology and Evangelicalism (개혁신학과 복음주의에 관한 계량서지학적 비교 연구)

  • Yoo, Yeong Jun;Lee, Jae Yun
    • Journal of the Korean BIBLIA Society for library and Information Science / v.29 no.3 / pp.41-63 / 2018
  • This study analyzed the journals, index terms, and authors of reformed theology, evangelicalism, and neutral theological positions using bibliometric methods. The methods used were average linkage, neighbor centralities, and profile cosine similarities. In particular, when analyzing the relationships between authors, we interpreted the research topics by finding the key index terms shared between the authors. In the journal analysis, the nine journals largely fell into two clusters, reformed theology and evangelicalism, but Presbyterian Theological Quarterly, which is thought to be a reformed journal, was placed in the evangelical cluster. In the index term analysis, reformed theology and evangelicalism were the key words representing the two clusters. The author analysis yielded nine clusters: four of Presbyterian theologians studying reformed theology and five of non-Presbyterian theologians. The two clusters of reformed theology and evangelicalism therefore appeared consistently across the analyses of journals, index terms, and authors.

SNA-based Trend Analysis of Naval Ship Maintenance

  • Yoo, Jung-Min;Yoon, Soung-woong;Lee, Sang-Hoon
    • Journal of the Korea Society of Computer and Information / v.24 no.6 / pp.165-174 / 2019
  • Naval ship maintenance raises various issues for effective maintenance methods and procedures, because ships are composed of numerous modules and systems, and manual-oriented maintenance requires well-trained technicians who are always busy with many other tasks. In this paper, we apply a social network analysis (SNA) scheme to the service procedures and trends of ROK naval ship equipment. Various SNA algorithms offering many operating options are deployed, and the analysis results reveal potential points of improvement for maintainers.

The Classification of Arrhythmia Using Similarity Analysis Between Unit Patterns at ECG Signal (ECG 신호에서 단위패턴간 유사도분석을 이용한 부정맥 분류 알고리즘)

  • Bae, Jung-Hyoun;Lim, Seung-Ju;Kim, Jeong-Ju;Park, Sung-Dae;Kim, Jeong-Do
    • The KIPS Transactions:PartD / v.19D no.1 / pp.105-112 / 2012
  • Most methods for detecting PVC and APC require accurate measurement of the QRS complex, P wave, and T wave. In this study, we propose a new algorithm for detecting PVC and APC without complex parameters or algorithms. The proposed algorithm is widely applicable to abnormal waveforms arising from individual differences as well as to all sorts of normal ECG waveforms. To achieve this, we separate the ECG signal into unit patterns and build a standard unit pattern using only the unit patterns with a normal R-R interval. We then detect PVC and APC by similarity analysis, matching each unit pattern against the standard unit pattern.

Measuring Similarity of Android Applications Using Method Reference Frequency and Manifest Information (메소드 참조 빈도와 매니페스트 정보를 이용한 안드로이드 애플리케이션들의 유사도 측정)

  • Kim, Gyoosik;Hamedani, Masoud Reyhani;Cho, Seong-je;Kim, Seong Baeg
    • The Journal of Korean Institute of Next Generation Computing / v.13 no.3 / pp.15-25 / 2017
  • As the value and importance of software grow, software theft and piracy are becoming a much larger problem. Tackling this problem requires an accurate method for detecting software theft and piracy. In particular, although software theft is relatively easy in the case of Android applications (apps), illegal apps have not been properly screened in Android markets. In this paper, we propose a method that effectively measures the similarity between Android apps in order to detect software theft at the executable file level. The method extracts method reference frequency and manifest information through static analysis of executable Android apps and uses them as the main features for similarity measurement. Each app is represented as an n-dimensional vector of these features, and cosine similarity is used as the similarity measure. We demonstrate the effectiveness of the method by comparing its accuracy with that of typical source-code-based similarity measurement methods. In experiments on Android apps for which both the source files and the executable files are available, the similarity measured at the executable file level was almost equivalent to the well-known similarity measured at the source file level.
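
The frequency-vector comparison can be sketched as follows, using method reference counts only. Manifest features are omitted, and the extraction of references from an executable is assumed to have been done by a separate static-analysis step. The function name `app_similarity` and the sample method signatures are hypothetical.

```python
import math
from collections import Counter

def app_similarity(refs_a, refs_b):
    """Cosine similarity of two apps' method reference frequency vectors.

    Each argument is a list of referenced method names, one entry per
    reference, assumed to come from static analysis of the executable.
    """
    fa, fb = Counter(refs_a), Counter(refs_b)
    dot = sum(fa[m] * fb[m] for m in fa.keys() & fb.keys())
    norm = (math.sqrt(sum(c * c for c in fa.values()))
            * math.sqrt(sum(c * c for c in fb.values())))
    return dot / norm if norm else 0.0

# Hypothetical reference lists for three apps.
original = ["Landroid/util/Log;->d"] * 3 + ["Ljava/lang/String;->length"]
repackaged = ["Landroid/util/Log;->d"] * 6 + ["Ljava/lang/String;->length"] * 2
unrelated = ["Ljava/util/List;->size"] * 4

print(app_similarity(original, repackaged))
print(app_similarity(original, unrelated))
```

Because cosine similarity ignores vector length, an app whose reference counts are scaled uniformly still scores close to 1 against the original, while an app with no shared references scores 0.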