• Title/Abstract/Keywords: similarity function


A Method of Reducing the Processing Cost of Similarity Queries in Databases

  • 김선경;박지수;손진곤
    • KIPS Transactions on Software and Data Engineering
    • /
    • Vol. 11, No. 4
    • /
    • pp.157-162
    • /
    • 2022
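  • Today, most data is stored in databases (DBs). In a DB environment, users ask the DB to find the data they want. A similarity query is a DB query whose condition includes a similarity. Processing such a query is costly, however, because it is hard to use an index that narrows the range of records to process, so the similarity must be computed for every record in the table each time. To address this problem, this paper defines a lightweight similarity function. Compared with the similarity function, a lightweight similarity function filters data less accurately but costs less to compute. Exploiting this property, the paper presents a method for reducing the processing cost of similarity queries. It proposes the Chebyshev distance as a lightweight similarity function for the Euclidean distance function and compares the processing cost of queries that use the existing similarity function with that of queries that use the lightweight similarity function. Experiments confirm that applying the Chebyshev distance as the lightweight similarity function for Euclidean similarity reduces the processing cost of similarity queries.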
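
The filter-then-verify idea in this abstract can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a range query ("all records within `radius` of `query`") over in-memory tuples, and the names `range_query`, `records`, and `radius` are hypothetical. The sketch relies on the fact that the Chebyshev distance never exceeds the Euclidean distance, so any record whose Chebyshev distance is already larger than the radius can be discarded without computing the more expensive Euclidean distance.

```python
import math


def chebyshev(a, b):
    """Largest absolute coordinate difference (cheap: no squares, no sqrt)."""
    return max(abs(x - y) for x, y in zip(a, b))


def euclidean(a, b):
    """Ordinary Euclidean distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def range_query(records, query, radius):
    """Return the records whose Euclidean distance to `query` is <= radius.

    Hypothetical helper: the Chebyshev distance acts as a cheap pre-filter.
    Because chebyshev(a, b) <= euclidean(a, b), rejecting on the Chebyshev
    test never drops a true match; survivors are verified exactly.
    """
    hits = []
    for rec in records:
        if chebyshev(rec, query) > radius:
            continue  # cannot be within the radius, skip the exact check
        if euclidean(rec, query) <= radius:
            hits.append(rec)
    return hits
```

For example, with `query = (0.0, 0.0)` and `radius = 1.3`, the record `(3.0, 0.0)` is rejected by the cheap test alone, while `(1.0, 1.0)` passes the cheap test and is then rejected by the exact Euclidean check.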

Similarity Measure Construction for Non-Convex Fuzzy Membership Function

  • Park, Hyun-Jeong;Kim, Sung-Shin;Lee, Sang-H
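    • Proceedings of the Korean Institute of Intelligent Systems Conference
    • /
    • Proceedings of the 2007 Fall Conference of the Korean Institute of Intelligent Systems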
    • /
    • pp.199-202
    • /
    • 2007
  • A similarity measure for non-convex fuzzy membership functions is constructed using the well-known Hamming distance measure. It is compared with the convex fuzzy membership function case, and a characteristic analysis of non-convex functions is also given. The proposed similarity measure is proved, and its usefulness is verified through an example.

Similarity Measure Construction for Non-Convex Fuzzy Membership Function

  • 박현정;김성신;이상혁
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • Vol. 18, No. 1
    • /
    • pp.145-149
    • /
    • 2008
  • A similarity measure for non-convex fuzzy membership functions is constructed using the well-known Hamming distance measure. It is compared with the convex fuzzy membership function case, and a characteristic analysis of non-convex functions is also given. The proposed similarity measure is proved, and its usefulness is verified through an example.
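
As a rough illustration of how a Hamming distance can yield a similarity between two fuzzy membership functions, the sketch below uses one common generic form, one minus the normalized Hamming distance between sampled membership values. This is an assumption made for illustration only: the abstract does not state the paper's formula, and the function name `hamming_similarity` is hypothetical. Nothing in this form requires the membership functions to be convex, so multi-peaked samples are handled the same way.

```python
def hamming_similarity(mu_a, mu_b):
    """Similarity in [0, 1] between two sampled membership functions.

    Generic form (an assumption, not the paper's measure):
        s = 1 - (1/n) * sum(|mu_a[i] - mu_b[i]|)
    Both inputs must sample the same n points of the universe of discourse,
    with membership values in [0, 1].
    """
    if len(mu_a) != len(mu_b) or not mu_a:
        raise ValueError("inputs must be non-empty and of equal length")
    distance = sum(abs(a - b) for a, b in zip(mu_a, mu_b)) / len(mu_a)
    return 1.0 - distance


# Two non-convex (two-peaked) membership functions sampled at four points.
a = [0.2, 0.8, 0.3, 0.9]
b = [0.1, 0.9, 0.3, 0.5]
similarity = hamming_similarity(a, b)
```

Identical inputs give a similarity of 1, and fully complementary crisp sets give 0.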

Prediction of New Customer's Degree of Loyalty of Internet Shopping Mall Using Continuous Conditional Random Field

  • 안길승;허선
    • Journal of the Korean Institute of Industrial Engineers
    • /
    • Vol. 41, No. 1
    • /
    • pp.10-16
    • /
    • 2015
  • In this study, we suggest a method to predict the probability distribution of a new customer's degree of loyalty using a continuous conditional random field (C-CRF) that reflects the customer's RFM score and similarity to neighboring customers. An RFM score prediction model is introduced to construct the first feature function of the C-CRF. Integrating demographic similarity, purchasing characteristic similarity and purchase history similarity, we build a unified similarity variable to configure the second feature function of the C-CRF. The parameters of each feature function are then estimated, the C-CRF model is trained on a training data set, and a probability distribution is suggested for estimating a new customer's degree of loyalty. An example is provided to illustrate our model.

SSF: Sentence Similar Function Based on word2vector Similar Elements

  • Yuan, Xinpan;Wang, Songlin;Wan, Lanjun;Zhang, Chengyuan
    • Journal of Information Processing Systems
    • /
    • Vol. 15, No. 6
    • /
    • pp.1503-1516
    • /
    • 2019
  • In this paper, to improve the accuracy of long-sentence similarity calculation, we propose a sentence similarity calculation method based on a system similarity function. The algorithm uses word2vector as the system elements to calculate sentence similarity. Its higher accuracy derives from two characteristics: one is the negative effect of the penalty term, and the other is that the sentence similar function (SSF) based on word2vector similar elements does not satisfy the commutative law. In later studies, we found that the time complexity of our algorithm depends on the process of calculating similar elements, so we build an index of potentially similar elements while training the word vectors. Finally, the experimental results show that our algorithm is more accurate than the word mover's distance (WMD) and has the lowest query time of the three calculation methods of SSF.

APPLICATIONS OF SIMILARITY MEASURES FOR PYTHAGOREAN FUZZY SETS BASED ON SINE FUNCTION IN DECISION-MAKING PROBLEMS

  • ARORA, H.D.;NAITHANI, ANJALI
    • Journal of applied mathematics & informatics
    • /
    • Vol. 40, No. 5-6
    • /
    • pp.897-914
    • /
    • 2022
  • Pythagorean fuzzy sets (PFSs) are capable of modelling information with more uncertainties in decision-making problems. The essential feature of PFSs is that they are described by three parameters: membership function, non-membership function and hesitant margin, with the squares of the three parameters summing to one. The purpose of this article is to suggest some new similarity measures and weighted similarity measures for PFSs. Numerical computations have been carried out to validate the proposed measures. The measures have been applied to some real-life decision-making problems of pattern detection and medicinal investigations. Moreover, a descriptive illustration is employed to compare the results of the proposed measures with the existing analogous similarity measures to show their effectiveness.
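
The constraint stated in this abstract, that the squares of the membership degree, the non-membership degree and the hesitant margin sum to one, can be made concrete with a short sketch. It only derives the hesitant margin from the other two degrees and does not reproduce the paper's sine-based similarity measures, which the abstract does not spell out. The function name `hesitant_margin` and the tolerance `eps` are hypothetical.

```python
import math


def hesitant_margin(mu, nu, eps=1e-12):
    """Hesitant margin pi of a Pythagorean fuzzy element.

    Uses the defining constraint mu**2 + nu**2 + pi**2 == 1, so
    pi = sqrt(1 - mu**2 - nu**2). Raises ValueError when (mu, nu)
    violates 0 <= mu, nu <= 1 or mu**2 + nu**2 <= 1.
    `eps` absorbs floating-point rounding at the boundary.
    """
    if not (0.0 <= mu <= 1.0 and 0.0 <= nu <= 1.0):
        raise ValueError("degrees must lie in [0, 1]")
    total = mu * mu + nu * nu
    if total > 1.0 + eps:
        raise ValueError("mu^2 + nu^2 must not exceed 1")
    return math.sqrt(max(0.0, 1.0 - total))


pi = hesitant_margin(0.6, 0.5)  # sqrt(1 - 0.36 - 0.25) = sqrt(0.39)
```

A pair such as `(0.9, 0.9)` is rejected because the squares already sum to more than one.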

Cross-architecture Binary Function Similarity Detection based on Composite Feature Model

  • Xiaonan Li;Guimin Zhang;Qingbao Li;Ping Zhang;Zhifeng Chen;Jinjin Liu;Shudan Yue
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 17, No. 8
    • /
    • pp.2101-2123
    • /
    • 2023
  • Recent studies have shown that neural network-based binary code similarity detection performs well in vulnerability mining, plagiarism detection, and malicious code analysis. However, existing cross-architecture methods still suffer from insufficient feature characterization and low discrimination accuracy. To address these issues, this paper proposes a cross-architecture binary function similarity detection method based on a composite feature model (SDCFM). First, the binary function is converted into a vector representation according to the proposed composite feature model, which is composed of instruction statistical features, control flow graph structural features, and application program interface calling behavioral features. Then, the composite features are embedded by the proposed hierarchical embedding network based on a graph neural network, in which the block-level features and the function-level features are processed separately and finally fused into the embedding. In addition, to make the trained model more accurate and stable, our method uses the embeddings of predecessor nodes to modify the node embedding in the iterative updating process of the graph neural network. To assess the effectiveness of the composite feature model, we compare SDCFM with the state-of-the-art method on benchmark datasets. The experimental results show that SDCFM performs well both on the area under the curve in the binary function similarity detection task and on vulnerable candidate function ranking in the vulnerability search task.

Efficient Similarity Analysis Methods for Same Open Source Functions in Different Versions

  • 김영철;조은선
    • Journal of KIISE
    • /
    • Vol. 44, No. 10
    • /
    • pp.1019-1025
    • /
    • 2017
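  • Binary similarity analysis is used in vulnerability analysis, malware analysis, plagiarism detection, and related tasks. Proving that a function under analysis is identical to a known safe function can help make malicious-behavior analysis and vulnerability analysis of binary code more efficient. However, little research has been devoted specifically to similarity analysis across different versions of the same function. This paper analyzes function-level similarity with a variety of methods based on function information that can be extracted from binaries, and looks for ways to perform the analysis efficiently in little time. In particular, the analysis is carried out on different versions of the OpenSSL library, confirming that similar functions are detected even when the versions differ.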

New Similarity Measures of Simplified Neutrosophic Sets and Their Applications

  • Liu, Chunfang
    • Journal of Information Processing Systems
    • /
    • Vol. 14, No. 3
    • /
    • pp.790-800
    • /
    • 2018
  • The simplified neutrosophic set (SNS) is a generalization of the fuzzy set designed for practical situations in which each element has a truth membership function, an indeterminacy membership function and a falsity membership function. In this paper, we propose a new method to construct similarity measures of single valued neutrosophic sets (SVNSs) and interval valued neutrosophic sets (IVNSs), respectively. Then we prove that the proposed formulas satisfy the axiomatic definition of the similarity measure. Finally, we apply them to pattern recognition under the single valued neutrosophic environment and to multi-criteria decision-making problems under the interval valued neutrosophic environment. The results show that our methods are effective and reasonable.

An Efficient Design Method of RF Filters via Optimized Rational-Function Fitting, without Coupling-Coefficient Similarity Transformation

  • 주정호;강승택;김형석
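    • Proceedings of the Korea Institute of Information and Telecommunication Facilities Engineering Conference
    • /
    • 2006 Summer Conference of the Korea Institute of Information and Telecommunication Facilities Engineering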
    • /
    • pp.202-204
    • /
    • 2006
  • A new method is presented for designing RF filters without applying the similarity transform to the coupling coefficient matrix used as circuit parameters, a step that is very tedious because of the pivoting and the choice of rotation angles needed during the iterations. The transfer function of a filter is used directly for the design, and its desired form is derived by the optimized rational-function fitting technique. A 3rd-order coaxial lowpass filter and an 8th-order dual-mode elliptic integral function response filter are taken as examples to validate the proposed method.
