• Title/Summary/Keyword: Similarity Measurement

Automatic Thresholding Method using Cumulative Similarity Measurement for Unsupervised Change Detection of Multispectral and Hyperspectral Images (누적 유사도 측정을 이용한 자동 임계값 결정 기법 - 다중분광 및 초분광영상의 무감독 변화탐지를 목적으로)

  • Kim, Dae-Sung;Kim, Hyung-Tae
    • Korean Journal of Remote Sensing / v.24 no.4 / pp.341-349 / 2008
  • This study proposes a new automatic thresholding method, an important step in detecting binary change/no-change information from satellite images. The results of a pixel-based similarity measurement are accumulated at regular intervals, and the threshold is set at the position where the slope is steep. The proposed method is compared with the expectation-maximization (EM) algorithm and the corner method using synthetic images, ALI images, and Hyperion images. The results show that the method achieves accuracy similar to that of the previous algorithms. It is simpler than the EM algorithm and, unlike the corner method, can be applied to a binormal histogram.

Developing an Alias Management Method based on Word Similarity Measurement for POI Application

  • Choi, Jihye;Lee, Jiyeong
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.37 no.2 / pp.81-89 / 2019
  • As the need to integrate administrative datasets and address information increases, interest is also growing in POI (Point of Interest) data as a source of location information across applications and platforms. The purpose of this study is to develop an alias database management method for efficient POI searching, based on POI data representing position. First, we determine the attributes of POI alias data, which individual users use in various ways. When classifying POI aliases, we excluded typos and names written entirely in the English alphabet. The attributes of POI aliases are classified into four categories, and each category is subdivided into three classes according to the strength of the attribute. We then define the quality of the classified POI aliases through experiments. Based on the four attributes defined in this study, we developed a method of managing a POI alias through an integrated method composed of word embedding and a similarity measurement. Experimental results show that the proposed algorithm can be used when each POI has a small number of aliases with the appropriate attributes defined in this study.

Assessment of performance of machine learning based similarities calculated for different English translations of Holy Quran

  • Al Ghamdi, Norah Mohammad;Khan, Muhammad Badruddin
    • International Journal of Computer Science & Network Security / v.22 no.4 / pp.111-118 / 2022
  • This article applies different machine learning based similarity techniques to religious text in order to identify similarities and differences among its various translations. The dataset includes 10 different English translations of the verses (Arabic: Ayah) of two Surahs (chapters), Al-Humazah and An-Nasr. Quantitative similarity values between different translations of the same verse were calculated using cosine similarity and semantic similarity. The corpus went through two series of experiments: before pre-processing and after pre-processing. To evaluate the machine learning based similarities, human-annotated similarities between translations of the two Surahs were recorded as the ground truth. For Surah Al-Humazah, the average difference between the human-annotated similarity and the cosine similarity was 1.38 per verse per pair of translations; after pre-processing, it increased to 2.24. The average difference between the human-annotated similarity and the semantic similarity for the same Surah was 0.09 per verse per pair of translations; after pre-processing, it increased to 0.78. For Surah An-Nasr, the average difference between the human-annotated similarity and the cosine similarity was 1.93 per verse per pair of translations before pre-processing and increased further to 2.47 afterwards. The average difference between the human-annotated similarity and the semantic similarity for Surah An-Nasr was 0.93 before pre-processing and fell to 0.87 per verse per pair of translations afterwards. As expected, semantic similarity proved to be the better indicator for capturing word meaning.
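
The cosine similarity mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say how the verses were vectorized, so the raw term-frequency bag-of-words representation below is an assumption, and the names `tokenize` and `cosine_similarity` and the two sample sentences are hypothetical, not taken from the paper or its dataset.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the raw term-frequency vectors of two texts.

    Assumption: plain word counts, with no stop-word removal or stemming.
    """
    freq_a = Counter(tokenize(text_a))
    freq_b = Counter(tokenize(text_b))
    # Only words present in both texts contribute to the dot product.
    dot = sum(freq_a[w] * freq_b[w] for w in freq_a.keys() & freq_b.keys())
    norm_a = math.sqrt(sum(c * c for c in freq_a.values()))
    norm_b = math.sqrt(sum(c * c for c in freq_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction to compare
    return dot / (norm_a * norm_b)


# Hypothetical sample sentences, used only to exercise the function.
sentence_a = "the storm passed over the quiet town"
sentence_b = "the storm moved across the silent town"
print(round(cosine_similarity(sentence_a, sentence_b), 3))
```

Because this measure only counts shared surface words, two translations that use synonyms score low even when their meaning matches, which is consistent with the abstract's finding that semantic similarity tracked the human annotations more closely.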

Measurement of Document Similarity using Word and Word-Pair Frequencies (단어 및 단어쌍 별 빈도수를 이용한 문서간 유사도 측정)

  • 김혜숙;박상철;김수형
    • Proceedings of the IEEK Conference / 2003.07d / pp.1311-1314 / 2003
  • In this paper, we propose a method to measure document similarity. First, we used the single-term method, which extracts nouns with a lexical analyzer as a preprocessing step so that one index corresponds to one noun. With this method, the similarity score is likely to be high even when the documents are unrelated. For this reason, a term-phrase method has been reported, which uses the co-occurrence of two words as an index for measuring document similarity. In this paper, we combine the two methods to compensate for the weaknesses of each. Six types of features are extracted from two input documents and fed into a neural network to calculate the final document similarity value. The reliability of our method was demonstrated in a document retrieval experiment.

A Measurement of Self-Similarity Characteristic and Hurst Parameter on Real Time Operation Network (실시간 운영중인 네트워크 상에서 Self-Similarity 특성 및 Hurst 파라미터 측정)

  • 진성호;임재홍
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 1999.11a / pp.266-269 / 1999
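  • One of the important variables in designing a network and implementing services is understanding the characteristics of its traffic. Conventional traffic prediction and analysis based on Poisson or Markovian models accounts only for short-range dependence, and its results have been shown to differ considerably from actually observed traffic. Consequently, approaches based on self-similarity have recently emerged as models that resemble real traffic. In this paper, to represent the long-range dependence of self-similarity, we estimate the Hurst parameter H from data measured on a real network and analyze the degree of self-similarity exhibited by a network in real-time operation.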
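
The abstract does not state which estimator was used for H. As one common option, the sketch below uses the aggregated variance method: for a self-similar series, the variance of block means over blocks of size m scales as m^(2H-2), so H can be read from the slope of a log-log fit. The function name `hurst_aggregated_variance`, the block sizes, and the synthetic test series are assumptions for illustration only.

```python
import math
import random


def hurst_aggregated_variance(series, block_sizes):
    """Estimate the Hurst parameter H with the aggregated variance method.

    For each block size m, the series is cut into non-overlapping blocks,
    each block is averaged, and the sample variance of those averages is
    taken. The slope of log(variance) against log(m) equals 2H - 2.
    """
    log_m, log_var = [], []
    for m in block_sizes:
        n_blocks = len(series) // m
        means = [sum(series[i * m:(i + 1) * m]) / m for i in range(n_blocks)]
        mu = sum(means) / n_blocks
        var = sum((x - mu) ** 2 for x in means) / (n_blocks - 1)
        log_m.append(math.log(m))
        log_var.append(math.log(var))
    # Ordinary least-squares slope of log_var on log_m.
    mean_x = sum(log_m) / len(log_m)
    mean_y = sum(log_var) / len(log_var)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(log_m, log_var))
             / sum((x - mean_x) ** 2 for x in log_m))
    return 1.0 + slope / 2.0


# Synthetic uncorrelated noise (not measured traffic): H should be near 0.5.
random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(4096)]
h_noise = hurst_aggregated_variance(noise, [2, 4, 8, 16, 32, 64, 128])
print(round(h_noise, 2))
```

Values of H near 0.5 indicate no long-range dependence, while values approaching 1 indicate the strongly self-similar behavior the abstract associates with real network traffic. Each block size must leave at least two blocks with non-identical means for the logarithm to be defined.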

Measurement of Document Similarity using Term/Term-pair Features and Neural Network (단어/단어쌍 특징과 신경망을 이용한 두 문서간 유사도 측정)

  • Kim Hye Sook;Park Sang Cheol;Kim Soo Hyung
    • Journal of KIISE:Software and Applications / v.31 no.12 / pp.1660-1671 / 2004
  • This paper proposes a method for measuring the similarity between two documents. The central idea is to estimate the degree of similarity from the frequencies of the terms and term pairs that occur in both documents. In contrast to conventional methods, which take only one feature into account, the proposed method considers several features at the same time and measures the similarity using a neural network. To demonstrate the superiority of the method, two experiments were conducted. The first verifies whether two input documents come from the same document. The second is an information retrieval task in which a document is used as the query against a large number of documents. In both experiments, the proposed method shows higher accuracy than two conventional methods, cosine similarity measurement and a term-pair method.
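
A rough sketch of the kind of input features this abstract describes is given below. The abstract does not define its features or how term pairs are formed, so everything here is an assumption: term pairs are taken to be unordered pairs of adjacent tokens, the overlap is scored with a multiset Dice coefficient, and the names `term_pair_counts`, `shared_ratio`, and `overlap_features` are hypothetical. The neural network that combines the features in the paper is not reproduced.

```python
from collections import Counter


def term_pair_counts(tokens):
    """Count unordered pairs of adjacent tokens (an assumed pair definition)."""
    return Counter(tuple(sorted(pair)) for pair in zip(tokens, tokens[1:]))


def shared_ratio(counts_a, counts_b):
    """Multiset Dice coefficient: share of occurrences common to both documents."""
    shared = sum((counts_a & counts_b).values())  # element-wise minimum counts
    total = sum(counts_a.values()) + sum(counts_b.values())
    return 2.0 * shared / total if total else 0.0


def overlap_features(tokens_a, tokens_b):
    """Return (term overlap, term-pair overlap) for two tokenized documents."""
    term_score = shared_ratio(Counter(tokens_a), Counter(tokens_b))
    pair_score = shared_ratio(term_pair_counts(tokens_a), term_pair_counts(tokens_b))
    return term_score, pair_score


# Hypothetical token lists standing in for extracted terms.
doc_a = "neural network measures document similarity".split()
doc_b = "document similarity measures".split()
print(overlap_features(doc_a, doc_b))
```

The two scores diverge when documents share vocabulary but not word combinations, which is the situation the abstract says single-feature methods handle poorly.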

Measurement and Simulation of Wide-area Frequency in US Eastern Interconnected Power System

  • Kook, Kyung Soo;Liu, Yilu
    • Journal of Electrical Engineering and Technology / v.8 no.3 / pp.472-477 / 2013
  • An internet-based, real-time, GPS-synchronized wide-area power system frequency monitoring network (FNET) has been continuously monitoring wide-area power system frequency in the United States. This paper analyzes FNET measurements of verified disturbances in the US eastern interconnected power system and simulates those disturbances using a dynamic system model. By comparing the frequency measurements in detail with the simulation results for the same disturbances, this paper finds that the sequence in which monitoring points detect the frequency fluctuation caused by a disturbance matches well between the measured data and the simulation results. A similarity comparison index is also proposed to quantify the similarity of the compared cases. The dynamic model based simulation results are expected to compensate for the lack of FNET measurements in its applications.

Semantic Image Retrieval Using Color Distribution and Similarity Measurement in WordNet (컬러 분포와 WordNet상의 유사도 측정을 이용한 의미적 이미지 검색)

  • Choi, Jun-Ho;Cho, Mi-Young;Kim, Pan-Koo
    • The KIPS Transactions:PartB / v.11B no.4 / pp.509-516 / 2004
  • Semantic interpretation of an image is incomplete without some mechanism for understanding semantic content that is not directly visible. For this reason, human-assisted content annotation attaches a natural-language textual description to an image. However, keyword-based retrieval operates at the level of syntactic pattern matching. In other words, dissimilarity computation among terms is usually done by string matching, not concept matching. In this paper, we propose a method for computerized semantic similarity calculation in WordNet space. We consider the edge, depth, link type, and density, as well as the existence of common ancestors. We also introduce a method that applies this similarity measurement to semantic image retrieval. To combine it with low-level features, we use the spatial color distribution model. When tested on an image set from Microsoft's 'Design Gallery Line', the proposed method outperforms the other approach.

Same music file recognition method by using similarity measurement among music feature data (음악 특징점간의 유사도 측정을 이용한 동일음원 인식 방법)

  • Sung, Bo-Kyung;Chung, Myoung-Beom;Ko, Il-Ju
    • Journal of the Korea Society of Computer and Information / v.13 no.3 / pp.99-106 / 2008
  • Digital music retrieval is now used in many fields (Web portals, audio service sites, etc.). Existing services rely on music metadata for retrieval, so when the metadata are incorrect or missing, accurate retrieval results are hard to obtain. Content-based information retrieval, which uses the music itself, is being researched to solve this problem. In this paper, we propose a same-music recognition method using similarity measurement. Feature data are extracted from the music waveform using a simplified MFCC (Mel Frequency Cepstral Coefficient). Similarity between digital music files is measured using DTW (Dynamic Time Warping), which is used in the vision and speech recognition fields. To verify the proposed method, we ran 500 experiments on 1000 randomly collected songs of the same genre, and all 500 succeeded. The 500 digital music files were made from 60 digital audio sources by mixing different compression codecs and bit rates. The results show that similarity measurement using DTW can recognize the same music.
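
The DTW comparison named in this abstract can be sketched with the classic dynamic-programming recurrence. The paper compares MFCC feature vectors; the version below works on plain numeric sequences with an absolute-difference local cost, which is a simplification, and the function name `dtw_distance` and the sample sequences are hypothetical.

```python
def dtw_distance(seq_a, seq_b):
    """Dynamic Time Warping distance between two numeric sequences.

    cost[i][j] holds the cheapest cumulative cost of aligning the first i
    items of seq_a with the first j items of seq_b.
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(seq_a[i - 1] - seq_b[j - 1])
            # Extend the cheapest of: step in seq_a only, step in seq_b only,
            # or step in both sequences.
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m]


# The second sequence repeats one value, as a time-stretched copy would.
original = [1, 2, 3]
stretched = [1, 2, 2, 3]
print(dtw_distance(original, stretched))
```

Because the warping path may repeat elements of either sequence, a time-stretched copy aligns at low cost, which is the property that makes DTW attractive for comparing feature sequences of different lengths. For vector-valued features such as MFCC frames, the local cost would be replaced by a vector distance.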

Implementation of A Plagiarism Detecting System with Sentence and Syntactic Word Similarities (문장 및 어절 유사도를 이용한 표절 탐지 시스템 구현)

  • Maeng, Joosoo;Park, Ji Su;Shon, Jin Gon
    • KIPS Transactions on Software and Data Engineering / v.8 no.3 / pp.109-114 / 2019
  • The similarity detection method used in most plagiarism detection systems relies on the frequency of shared words obtained through morphological analysis. However, this method is limited in detecting the degree of similarity accurately, especially when similar words concerning the same topic are used, when parts of sentences are excerpted separately, or when the postpositions and endings of words are similar. To overcome this problem, we designed and implemented a plagiarism detection system that provides more reliable similarity information by measuring sentence similarity and syntactic word similarity in addition to the conventional word similarity. We compared our system with a conventional system that uses only word similarity. The comparative experiment showed that our system can detect plagiarized documents that the conventional system detects as well as those it cannot.