• Title/Summary/Keyword: data similarity

Similarity Measure Construction of the Fuzzy Set for the Reliable Data Selection (신뢰성 있는 정보의 추출을 위한 퍼지집합의 유사측도 구성)

  • Lee Sang-Hyuk
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.30 no.9C
    • /
    • pp.854-859
    • /
    • 2005
  • We construct a fuzzy entropy for measuring uncertainty by using the relation between distance measures and similarity measures. The proposed fuzzy entropy is constructed from a distance measure; in this study, the Hamming distance is used as that measure. To measure the similarity between fuzzy sets or crisp sets, we also construct a similarity measure from the distance measure, and the proposed fuzzy entropies and similarity measures are proved.
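
The relation between a distance measure and a similarity measure described above can be sketched as follows. This is a minimal illustration, assuming the common construction in which similarity is the complement of the normalized Hamming distance; the abstract does not give the paper's exact formula, and the function names and membership values are hypothetical.

```python
def hamming_distance(a, b):
    """Normalized Hamming distance between two fuzzy sets.

    Each set is a list of membership degrees in [0, 1] over the same universe.
    """
    if len(a) != len(b) or not a:
        raise ValueError("membership vectors must be non-empty and equal in length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def similarity(a, b):
    """Similarity as 1 - distance (an assumed form, not taken from the paper)."""
    return 1.0 - hamming_distance(a, b)


# Hypothetical membership vectors; a crisp set uses only 0 and 1.
fuzzy_a = [0.2, 0.5, 0.9, 1.0]
fuzzy_b = [0.1, 0.5, 0.7, 1.0]
crisp = [0, 1, 1, 1]

print(similarity(fuzzy_a, fuzzy_a))  # identical sets give 1.0
print(similarity(fuzzy_a, fuzzy_b))
print(similarity(fuzzy_a, crisp))
```

Under this construction, identical sets have similarity 1 and the value falls as the membership degrees diverge.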

An Experimental Study on the Similarity of Confined Coaxial Jets (동축 이중제한분류의 상사성에 대한 실험적 연구)

  • 사용철;이태환;이준식
    • Transactions of the Korean Society of Mechanical Engineers
    • /
    • v.19 no.5
    • /
    • pp.1291-1299
    • /
    • 1995
  • In confined coaxial jets, the flow-mixing characteristics depend on the initial conditions at the nozzle outlet, such as the velocity ratio and the nozzle radius ratio. In this study, the nozzle radius ratio (inner/outer) was 0.3. Longitudinal axial velocity, turbulence intensity, and Reynolds shear stress were measured by CTA. Measurements were made from the duct inlet to the region where a similarity solution could exist. This study investigated the flow characteristics as a function of the similitude parameter derived from the Craya-Curtet theory. The extent of the similarity region depends on the similitude parameter. The form factor obtained from the axial velocity profile in the similarity region was constant. The higher the similitude parameter, the greater the spread rate of the jets. As a result, the similarity conditions developed more quickly and the region where similarity holds became narrower. The present experimental data confirmed the validity of the Craya-Curtet theory.

Parentage Identification of 'Daebong' Grape (Vitis spp.) Using RAPD Analysis

  • Kim, Seung-Heui;Jeong, Jae-Hun;Kim, Seon-Kyu;Paek, Kee-Yoeup
    • Journal of Plant Biotechnology
    • /
    • v.4 no.2
    • /
    • pp.67-70
    • /
    • 2002
  • RAPD data were used to assess genetic similarity among 7 grape cultivars. Of the 100 random primers tested on genomic DNA, 10 primers were selected for genetic analysis, and the selected primers generated a total of 115 distinct amplification fragments. A similarity matrix was constructed on the basis of the presence or absence of bands. The 7 grape cultivars analyzed with UPGMA were clustered into two groups, A and B. The similarity coefficients among the cultivars were high. The mean similarity index for all pairwise comparisons was 0.851, ranging from 0.714 ('Rosaki' and 'Black Olympia') to 0.988 ('Kyoho' and 'Daebong'). After due consideration of the differences in cultural and morphological characteristics of these two theoretically identical cultivars, it could be deduced that 'Daebong' is a bud sport of the 'Kyoho' cultivar.
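
A similarity matrix built from the presence or absence of bands can be sketched as below. The abstract does not say which coefficient was used, so this example assumes the Jaccard coefficient; the cultivar labels and band patterns are hypothetical and are not the paper's data.

```python
from itertools import combinations


def jaccard(x, y):
    """Jaccard similarity of two band-presence vectors (1 = band present)."""
    shared = sum(1 for p, q in zip(x, y) if p and q)
    either = sum(1 for p, q in zip(x, y) if p or q)
    return shared / either if either else 1.0


# Hypothetical band patterns: one entry per amplified fragment.
bands = {
    "cultivar_1": [1, 1, 0, 1, 1, 0],
    "cultivar_2": [1, 1, 0, 1, 0, 0],
    "cultivar_3": [0, 1, 1, 0, 1, 1],
}

# Pairwise similarities, i.e. the upper triangle of the similarity matrix.
pairs = {(a, b): jaccard(bands[a], bands[b]) for a, b in combinations(bands, 2)}
mean_similarity = sum(pairs.values()) / len(pairs)

for (a, b), value in pairs.items():
    print(f"{a} vs {b}: {value:.3f}")
print(f"mean similarity: {mean_similarity:.3f}")
```

A matrix like this is the usual input to an average-linkage (UPGMA) clustering step, which is omitted here.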

Seabed Classification Using the K-L (Karhunen-Loève) Transform of Chirp Acoustic Profiling Data: An Effective Approach to Geoacoustic Modeling (광역주파수 음향반사자료의 K-L 변환을 이용한 해저면 분류: 지질음향 모델링을 위한 유용한 방법)

  • Chang, Jae-Kyeong;Kim, Han-Joon;Jou, Hyeong-Tae;Suk, Bong-Chool;Park, Gun-Tae;Yoo, Hai-Soo;Yang, Sung-Jin
    • The Sea:JOURNAL OF THE KOREAN SOCIETY OF OCEANOGRAPHY
    • /
    • v.3 no.3
    • /
    • pp.158-164
    • /
    • 1998
  • We introduce a statistical scheme to classify the seabed from acoustic profiling data acquired using a Chirp sonar system. The classification is based on grouping signal traces by a similarity index, which is computed using the K-L (Karhunen-Loève) transform of the Chirp profiling data. The similarity index represents the degree of coherence of bottom-reflected signals in consecutive traces, hence indicating the acoustic roughness of the seabed. The results of this study show that the similarity index is a function of homogeneity, sediment grain size, and bottom hardness. The similarity index ranges from 0 to 1 for various types of seabed material. It increases with the homogeneity and softness of bottom sediments, whereas it is inversely proportional to the grain size of sediments. As a real data example, we classified the seabed off Cheju Island, Korea, based on the similarity index and compared the result with side-scan sonar data and sediment samples. The comparison shows that the classification of the seabed by the similarity index is in good agreement with the real sedimentary facies and can delineate the acoustic response of the seabed in more detail. Therefore, this study presents an effective method for geoacoustic modeling that classifies the seafloor directly from acoustic data.

A study on the ordering of similarity measures with negative matches (음의 일치 빈도를 고려한 유사성 측도의 대소 관계 규명에 관한 연구)

  • Park, Hee Chang
    • Journal of the Korean Data and Information Science Society
    • /
    • v.26 no.1
    • /
    • pp.89-99
    • /
    • 2015
  • The World Economic Forum and the Korean Ministry of Knowledge Economy have selected big data as one of the top 10 core information technologies. The key to big data is to analyze the properties of the data effectively. Cluster analysis, one of the big data techniques, assigns a set of objects to clusters so that objects in the same cluster are more similar to each other than to objects in other clusters. The similarity measures used in cluster analysis may be classified into various types depending on the nature of the data. In this paper, we study upper and lower bounds for binary similarity measures with negative matches, such as the Russell and Rao measure, the simple matching measure by Sokal and Michener, the Rogers and Tanimoto measure, the Sokal and Sneath measure, the Hamann measure, and the Baroni-Urbani and Buser measures I and II. Comparative studies of these measures are shown using real data and a simulation experiment.
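
The measures named above can be computed from a 2x2 contingency table with counts a (both present), b and c (mismatches), and d (both absent, the negative matches). The sketch below uses the standard textbook definitions of these measures, which are assumed rather than taken from the paper; the example vectors are hypothetical, and degenerate tables with zero denominators are not handled.

```python
from math import sqrt


def counts(x, y):
    """Return (a, b, c, d) for two binary vectors of equal length."""
    a = sum(1 for p, q in zip(x, y) if p and q)
    b = sum(1 for p, q in zip(x, y) if p and not q)
    c = sum(1 for p, q in zip(x, y) if not p and q)
    d = sum(1 for p, q in zip(x, y) if not p and not q)
    return a, b, c, d


def measures(a, b, c, d):
    """Binary similarity measures that take negative matches (d) into account."""
    n = a + b + c + d
    root = sqrt(a * d)
    return {
        "russell_rao": a / n,
        "simple_matching": (a + d) / n,
        "rogers_tanimoto": (a + d) / (a + d + 2 * (b + c)),
        "sokal_sneath": 2 * (a + d) / (2 * (a + d) + b + c),
        "hamann": ((a + d) - (b + c)) / n,
        "baroni_urbani_buser_1": (root + a) / (root + a + b + c),
        "baroni_urbani_buser_2": (root + a - b - c) / (root + a + b + c),
    }


# Hypothetical binary vectors giving a = 4, b = 1, c = 1, d = 4.
x = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
y = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]
m = measures(*counts(x, y))
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

It follows directly from these formulas that the Rogers and Tanimoto value never exceeds the simple matching value, which in turn never exceeds the Sokal and Sneath value; this is one simple instance of an ordering among such measures.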

A Study on Detecting Changes in Injection Molding Process through Similarity Analysis of Mold Vibration Signal Patterns (금형 기반 진동 신호 패턴의 유사도 분석을 통한 사출성형공정 변화 감지에 대한 연구)

  • Jong-Sun Kim
    • Design & Manufacturing
    • /
    • v.17 no.3
    • /
    • pp.34-40
    • /
    • 2023
  • In this study, real-time collection of mold vibration signals during injection molding processes was achieved through IoT devices installed on the mold surface. To analyze changes in the collected vibration signals, injection molding was performed under six different process conditions. Analysis of the mold vibration signals according to process conditions revealed distinct trends and patterns. Based on this result, cosine similarity was applied to compare pattern changes in the mold vibration signals. The similarity in time and acceleration vector space between the collected data was analyzed. The results showed that under identical conditions for all six process settings, the cosine similarity remained around 0.92±0.07. However, when different process conditions were applied, the cosine similarity decreased to the range of 0.47±0.07. Based on these results, a cosine similarity threshold of 0.60~0.70 was established. When applied to the analysis of mold vibration signals, it was possible to determine whether the molding process was stable or whether variations had occurred due to changes in process conditions. This establishes the potential use of cosine similarity based on mold vibration signals in future applications for real-time monitoring of molding process changes and anomaly detection.
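
The thresholding step described above can be sketched as follows. This is only an illustration under stated assumptions: the cutoff of 0.65 is picked from inside the reported 0.60~0.70 range, and the signal vectors and function names are hypothetical rather than the study's data or code.

```python
from math import sqrt


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length signal vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = sqrt(sum(x * x for x in u))
    norm_v = sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Assumed cutoff inside the 0.60~0.70 range mentioned in the abstract.
THRESHOLD = 0.65


def process_changed(reference, current, threshold=THRESHOLD):
    """Flag a change when similarity to the reference pattern drops below the cutoff."""
    return cosine_similarity(reference, current) < threshold


# Hypothetical acceleration samples for one molding cycle.
reference = [0.1, 0.8, -0.5, 0.3, -0.2]
same_condition = [0.12, 0.75, -0.45, 0.28, -0.25]
new_condition = [0.6, -0.1, 0.4, -0.7, 0.2]

print(process_changed(reference, same_condition))
print(process_changed(reference, new_condition))
```

Because cosine similarity ignores overall amplitude, a check like this reacts to changes in the shape of the vibration pattern rather than to its magnitude.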

SVM based Clustering Technique for Processing High Dimensional Data (고차원 데이터 처리를 위한 SVM기반의 클러스터링 기법)

  • Kim, Man-Sun;Lee, Sang-Yong
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.14 no.7
    • /
    • pp.816-820
    • /
    • 2004
  • Clustering is the process of dividing similar data objects in a data set into clusters and acquiring meaningful information from the data. The main issues in clustering are the effective clustering of high-dimensional data and optimization. This study proposes a method of measuring similarity based on SVM and a new, efficient method of calculating the number of clusters. The high-dimensional data are mapped into feature space using kernel functions, and then the similarity between neighboring clusters is measured. For the created clusters, the desired number of clusters can be obtained using the measured similarity value and the value of Δd. To verify the proposed methods, the authors used six data sets from the UCI Machine Learning Repository and obtained the expected number of clusters as well as improved cohesiveness compared with the results of previous research.

Learning Similarity with Probabilistic Latent Semantic Analysis for Image Retrieval

  • Li, Xiong;Lv, Qi;Huang, Wenting
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.9 no.4
    • /
    • pp.1424-1440
    • /
    • 2015
  • It is challenging to retrieve the intended images from a large number of candidates. Content-based image retrieval (CBIR) is the most promising way to tackle this problem; its most important topic is measuring the similarity of images so as to cover variation in shape, color, pose, illumination, etc. While previous works have made significant progress, their ability to adapt to a dataset has not been fully explored. In this paper, we propose a similarity learning method based on a probabilistic generative model, namely probabilistic latent semantic analysis (PLSA). It first derives a Fisher kernel, a function over the parameters and variables, based on PLSA. Then, the parameters are determined by simultaneously maximizing the log-likelihood function of PLSA and the retrieval performance over the training dataset. The main advantages of this work are twofold: (1) deriving a similarity measure based on PLSA, which fully exploits the data distribution and Bayes inference; (2) learning model parameters by simultaneously maximizing the fit of the model to the data and the retrieval performance. The proposed method (PLSA-FK) is empirically evaluated on three datasets, and the results exhibit promising performance.

Bandwidth Allocation for Self-Similar Data Traffic Characteristics (자기유사적인 데이터 트래픽 특성을 고려한 대역폭 할당)

  • Lim Seog-Ku
    • The Journal of the Korea Contents Association
    • /
    • v.5 no.3
    • /
    • pp.175-181
    • /
    • 2005
  • Recent measurements of local-area and wide-area traffic have shown that network traffic exhibits self-similarity over a wide range of scales. Self-similarity is expressed by long-term dependence, which contradicts the Poisson model, whose dependence is relatively short-term. Therefore, the design and dimensioning of next-generation communication networks first require a traffic model that reflects burstiness and self-similarity. Self-similarity can be characterized by the Hurst parameter. In this paper, we analyze how the Hurst parameter changes when many different data traffic streams are integrated and arrive at a communication network under various environments, and we compare the analysis with simulation results.
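
One common way to estimate the Hurst parameter from a traffic trace is the aggregated-variance method, sketched below. This is an assumed, generic estimator and not the analysis used in the paper; the block sizes and the synthetic input are hypothetical. For a self-similar series, the variance of block means of size m scales like m^(2H-2), so H is read off the slope of a log-log fit.

```python
import math
import random


def hurst_aggregated_variance(series, block_sizes):
    """Estimate H from how the variance of block means decays with block size."""
    xs, ys = [], []
    for m in block_sizes:
        n_blocks = len(series) // m
        means = [sum(series[i * m:(i + 1) * m]) / m for i in range(n_blocks)]
        mu = sum(means) / n_blocks
        var = sum((v - mu) ** 2 for v in means) / n_blocks
        xs.append(math.log(m))
        ys.append(math.log(var))
    # Ordinary least-squares slope of log(variance) against log(block size).
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return 1.0 + slope / 2.0


# Synthetic uncorrelated input (no long-term dependence), so H should be near 0.5.
random.seed(0)
trace = [random.gauss(0.0, 1.0) for _ in range(4096)]
h_estimate = hurst_aggregated_variance(trace, [1, 2, 4, 8, 16, 32, 64])
print(f"estimated H: {h_estimate:.2f}")
```

Traffic with long-term dependence would give an estimate between 0.5 and 1, with values closer to 1 indicating stronger self-similarity.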

An Experimental Study on the Degree of Phonetic Similarity between Korean and Japanese Vowels (한국어와 일본어 단모음의 유사성 분석을 위한 실험음성학적 연구)

  • Kwon, Sung-Mi
    • MALSORI
    • /
    • no.63
    • /
    • pp.47-66
    • /
    • 2007
  • This study explores the degree of phonetic similarity between Korean and Japanese vowels in terms of acoustic features by performing a speech production test on Korean and Japanese speakers. For this purpose, the speech of 16 Japanese speakers was used as the Japanese speech data and the speech of 16 Korean speakers as the Korean speech data. The findings on the degree of similarity of the 7 nearest equivalents of the Korean and Japanese vowels are as follows. First, Korean /i/ and /e/ displayed no significant differences in F1 and F2 from their counterparts, Japanese /i/ and /e/, and the distribution of F1 and F2 of Korean /i/ and /e/ in the distributional map completely overlapped with that of Japanese /i/ and /e/. Accordingly, Korean /i/ and /e/ were judged to be "identical." Second, Korean /a/, /o/, and /ɨ/ displayed a significant difference in either F1 or F2, but showed great similarity in the distribution of F1 and F2 to Japanese /a/, /o/, and /ɯ/, respectively. Korean /a/, /o/, and /ɨ/ were therefore categorized as very similar to Japanese vowels. Third, Korean /u/, whose counterpart in Japanese is /ɯ/, showed a significant difference in both F1 and F2, and only half of the distribution overlapped. Thus, Korean /u/ was analyzed as moderately similar to Japanese vowels. Fourth, Korean /ʌ/ did not have a close counterpart in Japanese and was classified as "the least similar vowel."
