• Title/Summary/Keyword: Similarity Test

Isomer Differentiation Using in silico MS2 Spectra. A Case Study for the CFM-ID Mass Spectrum Predictor

  • Milman, Boris L.;Ostrovidova, Ekaterina V.;Zhurkovich, Inna K.
    • Mass Spectrometry Letters / v.10 no.3 / pp.93-101 / 2019
  • Algorithms and software for predicting tandem mass spectra have been developed in recent years. In this work, we explore how distinct the in silico $MS^2$ spectra predicted for isomers (compounds having the same formula and similar molecular structures) are, and whether they can be used to differentiate between the isomers. We applied the CFM-ID 2.0/3.0 predictor to (a) test compounds, whose experimental mass spectra had been randomly sampled from the MassBank of North America (MoNA) collection, and (b) the most widespread isomers of the test compounds found in the PubChem database. In the first validation test, the in silico mass spectra constituted a reference library, and library searches were performed for the experimental test spectra treated as "unknowns". The searches gave a true positive rate (TPR) of $(46\text{--}48 \pm 10)\%$. In the second test, the in silico and experimental spectra were interchanged, which resulted in a TPR of $(58 \pm 10)\%$. There were no significant differences between the results obtained with different metrics of spectral similarity or with different predictor versions. In a comparison of test compounds vs. their isomers, a statistically significant correlation between mass spectral data and structural features was observed. The TPR values obtained should be regarded as reasonable results for predicting tandem mass spectra of related chemical structures.
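The library-search test described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' procedure: the abstract does not say which similarity metrics were used, so cosine similarity on unit-width m/z bins is an assumption, and the compound names and peak lists are hypothetical.

```python
import math

def bin_spectrum(peaks, bin_width=1.0):
    """Sum peak intensities into m/z bins of the given width (assumed binning)."""
    binned = {}
    for mz, intensity in peaks:
        key = int(mz // bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned

def cosine_similarity(spec_a, spec_b):
    """Cosine between two binned spectra: 0 for disjoint peaks, 1 for identical shape."""
    dot = sum(v * spec_b.get(k, 0.0) for k, v in spec_a.items())
    norm_a = math.sqrt(sum(v * v for v in spec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in spec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def library_search(query_peaks, library):
    """Rank library entries (name -> peak list) by similarity to the query, best first."""
    query = bin_spectrum(query_peaks)
    scores = [(cosine_similarity(query, bin_spectrum(peaks)), name)
              for name, peaks in library.items()]
    return sorted(scores, reverse=True)

def true_positive_rate(queries, library):
    """Fraction of (true_name, peaks) queries whose top-ranked hit is the true compound."""
    hits = sum(1 for true_name, peaks in queries
               if library_search(peaks, library)[0][1] == true_name)
    return hits / len(queries)

# Hypothetical data: two isomers in the reference library and one "unknown" spectrum.
library = {
    "isomer_A": [(91.05, 100.0), (119.08, 40.0)],
    "isomer_B": [(91.05, 30.0), (105.07, 100.0)],
}
unknown = [(91.10, 90.0), (119.00, 50.0)]
ranking = library_search(unknown, library)
```

Swapping which set of spectra plays the role of `library` and which supplies the queries corresponds to the second validation test in the abstract.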

Improving the Performance of Document Clustering with Distributional Similarities (분포유사도를 이용한 문헌클러스터링의 성능향상에 대한 연구)

  • Lee, Jae-Yun
    • Journal of the Korean Society for Information Management / v.24 no.4 / pp.267-283 / 2007
  • In this study, measures of distributional similarity such as KL-divergence are applied to document clustering in place of the traditional cosine measure, the most prevalent vector similarity measure for this task. Three variations of KL-divergence are investigated: Jensen-Shannon divergence, symmetric skew divergence, and minimum skew divergence. To verify the contribution of distributional similarities to document clustering, two experiments were designed and carried out on three test collections. In the first experiment, the clustering performance of the three divergence measures was compared with that of the cosine measure. The results showed that minimum skew divergence outperformed both the other divergence measures and the cosine measure. In the second experiment, second-order distributional similarities were calculated with the Pearson correlation coefficient from the first-order similarity matrices. These second-order similarities were found to improve the overall performance of document clustering. The results suggest that minimum skew divergence should be selected as the document vector similarity measure when both time and accuracy matter, and that second-order similarity is a good choice when only clustering accuracy matters.
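The measures named in this abstract can be sketched in a few lines. The definitions of KL-divergence, Jensen-Shannon divergence, and skew divergence are the standard ones; how the paper builds the symmetric and minimum variants is not stated in the abstract, so taking the sum and the minimum of the two directions is an assumption, as are the value `alpha=0.99` and the toy data.

```python
import math

def kl(p, q):
    """KL divergence D(p || q); q must be nonzero wherever p is nonzero."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

def mix(p, q, alpha):
    """Pointwise mixture alpha * p + (1 - alpha) * q."""
    return [alpha * pi + (1.0 - alpha) * qi for pi, qi in zip(p, q)]

def jensen_shannon(p, q):
    """Average KL divergence of p and q from their midpoint distribution."""
    m = mix(p, q, 0.5)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def skew(p, q, alpha=0.99):
    """Skew divergence D(p || alpha*q + (1-alpha)*p); finite even when q has zeros."""
    return kl(p, mix(q, p, alpha))

def symmetric_skew(p, q, alpha=0.99):
    """Assumed construction: sum of the skew divergence in both directions."""
    return skew(p, q, alpha) + skew(q, p, alpha)

def minimum_skew(p, q, alpha=0.99):
    """Assumed construction: the smaller of the two directed skew divergences."""
    return min(skew(p, q, alpha), skew(q, p, alpha))

def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0.0 or vy == 0.0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def second_order(sim_matrix):
    """Second-order similarity: correlate the rows of a first-order similarity matrix."""
    return [[pearson(r1, r2) for r2 in sim_matrix] for r1 in sim_matrix]

# Toy term distributions for two documents (hypothetical values).
doc_p = [0.5, 0.5, 0.0]
doc_q = [0.0, 0.5, 0.5]
first_order = [[1.0, 0.2, 0.1], [0.2, 1.0, 0.3], [0.1, 0.3, 1.0]]
second = second_order(first_order)
```

Plain KL-divergence is undefined for `doc_p` and `doc_q` because each has a zero where the other does not; the mixture in `skew` is what keeps the value finite.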

Software Similarity Detection Using Highly Credible Dynamic API Sequences (신뢰성 높은 동적 API 시퀀스를 이용한 소프트웨어 유사성 검사)

  • Park, Seongsoo;Han, Hwansoo
    • Journal of KIISE / v.43 no.10 / pp.1067-1072 / 2016
  • Software birthmarks, which are unique characteristics of a piece of software, are used to detect software plagiarism or software similarity. Birthmarks are generally divided into static and dynamic birthmarks, each with clear pros and cons depending on the extraction method. In this paper, we propose a method that extracts API sequence birthmarks through dynamic analysis and detects similarity between executable codes. Dynamic birthmarks based on API sequences record the API functions called during program execution, and the extracted sequences often include every API function called from the start to the end of the program. In contrast, our dynamic birthmark scheme extracts only the API functions called directly from the executable code. It then uses a sequence alignment algorithm to calculate the similarity metric effectively. We evaluate the birthmark on several open source programs to verify its reliability and credibility. Our dynamic birthmark scheme based on the extracted API sequences can be used in similarity tests of executable codes.
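As a rough illustration of comparing two API call sequences by alignment, the sketch below scores them by the length of their longest common subsequence. The abstract does not name the alignment algorithm or the normalization, so both are assumptions here, and the two traces are hypothetical.

```python
def lcs_length(seq_a, seq_b):
    """Length of the longest common subsequence, by row-wise dynamic programming."""
    prev = [0] * (len(seq_b) + 1)
    for a in seq_a:
        curr = [0]
        for j, b in enumerate(seq_b, start=1):
            if a == b:
                curr.append(prev[j - 1] + 1)             # extend a matched alignment
            else:
                curr.append(max(prev[j], curr[j - 1]))   # skip one call on either side
        prev = curr
    return prev[-1]

def api_sequence_similarity(seq_a, seq_b):
    """Aligned calls divided by the longer sequence length (assumed normalization)."""
    if not seq_a and not seq_b:
        return 1.0
    return lcs_length(seq_a, seq_b) / max(len(seq_a), len(seq_b))

# Hypothetical traces of API calls made directly by two executables.
original = ["CreateFileW", "ReadFile", "WriteFile", "CloseHandle"]
suspect = ["CreateFileW", "GetTickCount", "ReadFile", "WriteFile", "CloseHandle"]
score = api_sequence_similarity(original, suspect)
```

A subsequence-based score tolerates inserted calls such as the extra `GetTickCount` above, which is one reason alignment is a natural fit for sequences that differ by a few injected or reordered calls.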

Effects of Temporal Distance on Brand Extension Evaluation: Applying the Construal-Level Perspective to Brand Extensions

  • Park, Kiwan
    • Asia Marketing Journal / v.17 no.1 / pp.97-121 / 2015
  • In this research, we examine whether and why temporal distance influences evaluations of two different types of brand extensions: concept-based extensions, defined as extensions primarily based on the importance or relevance of brand concepts to extension products; and similarity-based extensions, defined as extensions primarily based on the amount of feature similarity at the product-category level. In Study 1, we test the hypothesis that concept-based extensions are evaluated more favorably when they are framed to launch in the distant rather than in the near future, whereas similarity-based extensions are evaluated more favorably when they are framed to launch in the near rather than in the distant future. In Study 2, we confirm that this time-dependent differential evaluation is driven by the difference in construal level between the bases of the two types of extensions, i.e., brand-concept consistency and product-category feature similarity. Specifically, we find that concept-based extensions are evaluated more favorably under an abstract than a concrete mindset, whereas similarity-based extensions are evaluated more favorably under a concrete than an abstract mindset. In Study 3, we extend the analysis to broad brands (i.e., brands that market products across multiple categories), finding that making a specific product category of a broad parent brand accessible influences evaluations of near-future, but not distant-future, brand extensions. Taken together, our findings suggest that temporal distance influences brand extension evaluation through its effect on the importance placed on brand concepts and feature similarity. That is, consumers rely on different bases to evaluate brand extensions, depending on when they perceive the extensions to take place and on the mindset they are in.
This research makes theoretical contributions to brand extension research by identifying one important determinant of brand extension evaluation and uncovering its underlying dynamics. It also expands the scope of construal level theory by putting forth a novel interpretation of two bases of perceived fit in terms of construal level. Marketers who are about to launch and advertise brand extensions may benefit from considering temporal-distance information when deciding what content to deliver about extensions in their communication efforts. The conceptual relation of a parent brand to its extensions should be emphasized for distant-future launches, whereas feature similarity should be highlighted for near-future launches.

Structural Similarity Based Video Quality Metric using Human Visual System (구조적 유사도 기반의 인간의 시각적 특성을 이용한 비디오 품질 측정 기준)

  • Park, Jin-Cheol;Lee, Sang-Hoon
    • Journal of Broadcast Engineering / v.14 no.1 / pp.36-43 / 2009
  • The structural similarity (SSIM) index was recently proposed as a quality metric. In the present paper, a new framework called visual SSIM (VSSIM) is proposed by incorporating crucial human factors into SSIM. The human factors are foveation, luminance, frequency, and motion information. The performance of VSSIM is evaluated by a subjective quality test compliant with the Video Quality Expert Group (VQEG) multimedia group test plan. The results show that VSSIM correlates more strongly with the subjective quality results than the conventional SSIM.
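For reference, the conventional SSIM index that VSSIM builds on can be sketched as below. This is a simplified, single-window version: the usual SSIM is computed over local windows and averaged, and the human-factor weighting of VSSIM is not described in enough detail in the abstract to reproduce, so neither is attempted here. The constants 0.01 and 0.03 are the commonly used defaults, and the pixel values are hypothetical.

```python
def ssim_global(x, y, dynamic_range=255.0):
    """SSIM computed over one window covering all samples of x and y."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    c1 = (0.01 * dynamic_range) ** 2   # stabilizes the luminance term
    c2 = (0.03 * dynamic_range) ** 2   # stabilizes the contrast/structure term
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / (n - 1)
    var_y = sum((b - mu_y) ** 2 for b in y) / (n - 1)
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / (n - 1)
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator

# Hypothetical 8-bit luminance samples from a reference and a distorted frame.
reference = [52, 55, 61, 59, 79, 61, 76, 61]
distorted = [60, 50, 70, 55, 85, 58, 70, 66]
quality = ssim_global(reference, distorted)
```

The index equals 1 only when the two inputs are identical and falls as luminance, contrast, or structure diverge.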

A Study on Prescription Similarity Analysis for Efficiency Improvement (처방 유사도 분석의 효율성 향상에 관한 연구)

  • Hwang, SuKyung;Woo, DongHyeon;Kim, KiWook;Lee, ByungWook
    • Journal of Korean Medical Classics / v.35 no.4 / pp.1-9 / 2022
  • Objectives: This study aims to increase the efficiency of the prescription similarity analysis method that uses drug composition ratios. Methods: A controlled experiment compared result generation time, quantity of generated data, and accuracy of results between the previous and the new analysis method on 12,598 formulas and 61 prescription groups. Results: The control group took 346 seconds on average and generated 768,478 results, while the test group took 24 seconds and generated 241,739 results. The test group adopted a selective calculation method that used only the overlapping data between two formulas instead of analyzing every possible case. This simplified the data processing and reduced the quantity of data to be processed, making the system 14.47 times faster than the previous analysis method while producing equal results. Conclusions: The efficiency of similarity analysis can be improved by reducing the data span and simplifying the calculation process.
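The idea of computing similarity only where two formulas overlap can be sketched with an inverted index from ingredient to formulas. The abstract gives neither the similarity formula nor the data structures, so the ratio-overlap score and the index below are assumptions made for illustration, and the formulas and doses are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def composition_ratios(formula):
    """Convert ingredient -> dose into ingredient -> share of the total dose."""
    total = sum(formula.values())
    return {herb: dose / total for herb, dose in formula.items()}

def ratio_similarity(ratios_a, ratios_b):
    """Assumed score: sum of the smaller share over the ingredients both formulas use."""
    shared = ratios_a.keys() & ratios_b.keys()
    return sum(min(ratios_a[h], ratios_b[h]) for h in shared)

def candidate_pairs(formulas):
    """Return only the formula pairs that share at least one ingredient."""
    index = defaultdict(set)              # inverted index: ingredient -> formula names
    for name, formula in formulas.items():
        for herb in formula:
            index[herb].add(name)
    pairs = set()
    for names in index.values():
        pairs.update(combinations(sorted(names), 2))
    return pairs

def selective_similarities(formulas):
    """Score overlapping pairs only; pairs with no shared ingredient are never visited."""
    ratios = {name: composition_ratios(f) for name, f in formulas.items()}
    return {(a, b): ratio_similarity(ratios[a], ratios[b])
            for a, b in candidate_pairs(formulas)}

# Hypothetical formulas (ingredient -> dose).
formulas = {
    "F1": {"ginseng": 4.0, "licorice": 2.0, "ginger": 2.0},
    "F2": {"ginseng": 2.0, "licorice": 2.0},
    "F3": {"peony": 3.0, "cinnamon": 3.0},
}
results = selective_similarities(formulas)
```

With these three formulas an exhaustive comparison would score three pairs, while the selective version scores only the one pair that shares ingredients, which mirrors the reduction in generated results reported in the abstract.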

Adaptive State Feedback Control for Nonlinear Rotary Inverted Pendulum System using Similarity Transformation Method: Implementation of Real-Time Experiment (유사변환기법을 이용한 비선형 회전식 역진자의 적응형 상태궤환 제어시스템: 실시간 실험 구현)

  • Cho, Hyun-Cheol;Lee, Young-Jin;Lee, Kwon-Soon;Koo, Kyung-Wan
    • The Transactions of the Korean Institute of Electrical Engineers P / v.58 no.2 / pp.130-135 / 2009
  • In recent years, research on rotary inverted pendulum control systems has received considerable attention due to their highly nonlinear dynamics and complicated geometric structures. This paper presents a novel control approach for such systems by means of similarity transformation theory. First, we represent the nonlinear system dynamics as a state-space model in controllability form that includes a time-varying parameter vector. We then establish a state-feedback control configuration based on the transformed model and derive an adaptive control law for adjusting the desired characteristic equation. A numerical analysis is carried out to evaluate our control method and to demonstrate its superiority over a traditional control strategy. Furthermore, a real-time control experiment is carried out to test its practical reliability.
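The role of the similarity transformation can be illustrated with the standard linear time-invariant textbook case. This is only a sketch of the general idea: the paper's model has a time-varying parameter vector and an adaptive law that the abstract does not spell out, and the symbols $T$, $A_c$, $B_c$, $K$, $a_i$, and $d_i$ are introduced here for illustration.

```latex
\begin{aligned}
\dot{x} &= A x + B u, \qquad z = T x \quad (T \text{ invertible}) \\
\dot{z} &= T A T^{-1} z + T B u = A_c z + B_c u \\
A_c &= \begin{bmatrix}
0 & 1 & \cdots & 0 \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & 1 \\
-a_0 & -a_1 & \cdots & -a_{n-1}
\end{bmatrix}, \qquad
B_c = \begin{bmatrix} 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix} \\
u &= -K z, \qquad K = \begin{bmatrix} k_0 & k_1 & \cdots & k_{n-1} \end{bmatrix} \\
\det\left(sI - A_c + B_c K\right) &= s^n + (a_{n-1} + k_{n-1})\, s^{n-1} + \cdots + (a_0 + k_0) \\
k_i &= d_i - a_i, \qquad i = 0, \dots, n-1
\end{aligned}
```

Here $s^n + d_{n-1} s^{n-1} + \cdots + d_0$ is the desired characteristic polynomial. Because a similarity transformation preserves eigenvalues, the same closed-loop poles are obtained in the original coordinates with $u = -K T x$.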

Parameterization of Along-Wind Dispersion Coefficients based on Field and Wind Tunnel Data

  • Kang, Sung-Dae
    • Environmental Sciences Bulletin of The Korean Environmental Sciences Society / v.10 no.S_1 / pp.11-22 / 2001
  • Observations related to the along-wind dispersion of puffs were collected from 12 field sites and from a wind tunnel experiment and used to test simple similarity relations. Because most of the data came from concentration time series observed at fixed monitors, the basic observation was $\sigma_t$, the standard deviation of the concentration time series. These data also allowed the travel time, $t$, from the source to the receptor to be estimated, from which the puff advective speed, $u_e$, could be determined. The along-wind dispersion coefficient, $\sigma_x$, was then assumed to equal $\sigma_t u_e$. The data, which extended over four orders of magnitude, supported the similarity relations $\sigma_t = 0.1t$ and $\sigma_x = 1.8u^*t$, where $t$ is the travel time and $u^*$ is the friction velocity. About 50% of the observations were within a factor of two of the predictions based on the similarity relations.
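The similarity relations in this abstract translate directly into a few small functions. In the sketch below, `sigma_t` is the standard deviation of the concentration time series, `sigma_x` the along-wind dispersion coefficient, `u_e` the puff advective speed, and `u_star` the friction velocity. The function names and the sample numbers are hypothetical, and SI units (seconds, metres, metres per second) are assumed.

```python
def sigma_x_from_observation(sigma_t, u_e):
    """Along-wind dispersion coefficient taken as sigma_t times the advective speed."""
    return sigma_t * u_e

def predicted_sigma_t(travel_time):
    """Similarity relation sigma_t = 0.1 * t."""
    return 0.1 * travel_time

def predicted_sigma_x(u_star, travel_time):
    """Similarity relation sigma_x = 1.8 * u_star * t."""
    return 1.8 * u_star * travel_time

def within_factor_of_two(observed, predicted):
    """True when the observation lies between half and twice the prediction."""
    return 0.5 * predicted <= observed <= 2.0 * predicted

def fraction_within_factor_of_two(pairs):
    """Share of (observed, predicted) pairs that agree within a factor of two."""
    return sum(1 for obs, pred in pairs if within_factor_of_two(obs, pred)) / len(pairs)

# Hypothetical case: 600 s travel time, friction velocity 0.3 m/s.
travel_time = 600.0
pred_t = predicted_sigma_t(travel_time)
pred_x = predicted_sigma_x(0.3, travel_time)
agreement = fraction_within_factor_of_two([(100.0, pred_t), (130.0, pred_t)])
```

The last function corresponds to the kind of summary statistic quoted at the end of the abstract, where about half of the observations fell within a factor of two of the predictions.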
