• Title/Abstract/Keywords: similarity ranking

Search results: 76 items (processing time 0.022 s)

Semantic Process Retrieval with Similarity Algorithms

  • 이홍주
    • Asia pacific journal of information systems
    • /
    • Vol. 18, No. 1
    • /
    • pp.79-96
    • /
    • 2008
  • One of the roles of Semantic Web services is to execute dynamic intra-organizational services, including the integration and interoperation of business processes. Since different organizations design their processes differently, retrieving similar semantic business processes is necessary to support inter-organizational collaboration. Most approaches for finding services that have certain features and support certain business processes have relied on some type of logical reasoning and exact matching. This paper presents our approach of using imprecise matching to expand the results from an exact matching engine when querying the OWL (Web Ontology Language) MIT Process Handbook. The MIT Process Handbook is an electronic repository of best-practice business processes. The Handbook is intended to help people (1) redesign organizational processes, (2) invent new processes, and (3) share ideas about organizational practices. In order to use the MIT Process Handbook for process retrieval experiments, we had to export it into an OWL-based format. We model the Process Handbook meta-model in OWL and export the processes in the Handbook as instances of the meta-model. Next, we need to find a sizable number of queries and their corresponding correct answers in the Process Handbook. Many previous studies devised artificial datasets composed of randomly generated numbers without real meaning and used subjective ratings for correct answers and similarity values between processes. To generate a semantics-preserving test dataset, we create 20 variants of each target process that are syntactically different but semantically equivalent, using mutation operators. These variants represent the correct answers for the target process. We devise diverse similarity algorithms based on the values of process attributes and the structures of business processes.
We use simple similarity algorithms for text retrieval, such as TF-IDF and Levenshtein edit distance, to devise our approaches, and use a tree edit distance measure because semantic processes appear to have a graph structure. We also design similarity algorithms that consider the similarity of process structure, such as part processes, goals, and exceptions. Since we can identify relationships between a semantic process and its subcomponents, this information can be used to calculate similarities between processes. Dice's coefficient and Jaccard similarity measures are used to calculate the proportion of overlap between processes in diverse ways. We perform retrieval experiments to compare the performance of the devised similarity algorithms. We measure retrieval performance in terms of precision, recall, and the F-measure, the harmonic mean of precision and recall. The tree edit distance shows the poorest performance on all measures. TF-IDF and the method combining TF-IDF with Levenshtein edit distance perform better than the other devised methods. These two measures focus on the similarity between the names and descriptions of processes. In addition, we calculate a rank correlation coefficient, Kendall's tau-b, between the number of process mutations and the ranking of similarity values among the mutation sets. In this experiment, similarity measures based on process structure, such as Dice's, Jaccard, and derivatives of these measures, show greater coefficients than measures based on the values of process attributes. However, the Lev-TFIDF-JaccardAll measure, which considers process structure and attribute values together, shows reasonably good performance in both experiments. For retrieving semantic processes, it therefore appears better to consider diverse aspects of process similarity, such as process structure and the values of process attributes.
We generate semantic process data and a dataset for retrieval experiments from the MIT Process Handbook repository. We suggest imprecise query algorithms that expand the retrieval results of an exact matching engine such as SPARQL, and compare the retrieval performance of the similarity algorithms. As for limitations and future work, we need to perform experiments with datasets from other domains. Moreover, since diverse measures yield many similarity values, we may find better ways to identify relevant processes by applying these values simultaneously.
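
The overlap measures and the F-measure named in this abstract can be illustrated with a minimal sketch. It assumes each process is reduced to a set of sub-component names; the process contents and the function names below are hypothetical, and this is not the paper's own implementation (in particular, it does not reproduce the Lev-TFIDF-JaccardAll combination).

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity |A & B| / |A | B|; taken as 1.0 when both sets are empty."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def dice(a: set, b: set) -> float:
    """Dice's coefficient 2|A & B| / (|A| + |B|); taken as 1.0 when both sets are empty."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical part-process sets for a target process and one mutated variant.
target = {"identify need", "select supplier", "place order", "receive goods"}
variant = {"identify need", "select supplier", "place order", "pay supplier"}

print(jaccard(target, variant))  # 3 shared parts out of 5 distinct parts
print(dice(target, variant))     # 2 * 3 shared parts over 4 + 4 parts
```

Dice's coefficient weights the shared elements twice, so for the same pair of sets it is never lower than the Jaccard value.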

User Query Expansion Through Keyword Similarity Ranking Algorithm Using Clustering Methods

  • 이상훈;김기태
    • 한국정보과학회:학술대회논문집
    • /
    • 한국정보과학회 2003 Spring Conference Proceedings, Vol. 30, No. 1 (B)
    • /
    • pp.479-481
    • /
    • 2003
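  • This paper proposes a technique that expands a user's query by ranking keyword similarity with several clustering methods. Association clustering, metric clustering, and scalar clustering are used, and a retrieval system is built by appropriately adjusting the weights among them. Given a user query, the keywords associated with the query keywords are ranked and shown to the user, and the query is expanded with the user's additional input. This process repeats until the user judges the query adequate and runs the search with the expanded query. The document collection used in the experiment consists of economy-related articles from the Korea Herald for January and February 2003, and expanding queries in the experiment produced satisfactory results.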
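
As a rough illustration of the association-clustering step mentioned above, the sketch below ranks candidate expansion keywords by a normalized co-occurrence score, s(u, v) = c(u, v) / (c(u, u) + c(v, v) - c(u, v)), where c(u, v) sums the product of the two term frequencies over all documents. This is the common textbook formulation, assumed here rather than taken from the paper; the toy documents and the function name `association_ranking` are hypothetical, and the metric and scalar variants and their weighting are not shown.

```python
from collections import Counter


def association_ranking(docs, query_term):
    """Rank every other term by its normalized association with query_term.

    docs is a list of token lists. Returns (term, score) pairs, best first.
    """
    freqs = [Counter(doc) for doc in docs]

    def corr(u, v):
        # Unnormalized association: sum over documents of tf(u) * tf(v).
        return sum(f[u] * f[v] for f in freqs)

    c_qq = corr(query_term, query_term)
    vocab = set().union(*docs) - {query_term}
    scores = {}
    for term in vocab:
        c_qt = corr(query_term, term)
        denom = c_qq + corr(term, term) - c_qt
        scores[term] = c_qt / denom if denom else 0.0
    # Sort by descending score, breaking ties alphabetically for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical toy collection of tokenized articles.
docs = [
    ["bank", "rate", "loan"],
    ["bank", "rate"],
    ["stock", "market"],
]
ranked = association_ranking(docs, "bank")
print(ranked[:2])  # the top-ranked candidates to show to the user
```

In an interactive loop, the top-ranked terms would be shown to the user, and any term the user picks would be appended to the query before the ranking is recomputed.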

A Fuzzy TOPSIS Approach Based on Trapezoidal Numbers to Material Selection Problem

  • Celik, Erkan;Gul, Muhammet;Gumus, Alev Taskin;Guneri, Ali Fuat
    • Journal of Information Technology Applications and Management
    • /
    • Vol. 19, No. 3
    • /
    • pp.19-30
    • /
    • 2012
  • Material selection is a complex problem in the design and development of products for diverse engineering applications. This paper aims to present a fuzzy decision-making approach to material selection in engineering design problems. A fuzzy multi-criteria decision-making model is proposed for solving the material selection problem. The proposed model uses fuzzy TOPSIS (Technique for Order Preference by Similarity to Ideal Solution) with trapezoidal numbers to evaluate the criteria and rank the alternatives. The result is compared with fuzzy VIKOR (VIseKriterijumska Optimizacija I Kompromisno Resenje, Serbian for Multi-criteria Optimisation and Compromise Solution), which was proposed by Jeya Girubha and Vinodh [2012]. The paper also aims to extend the literature on fuzzy decision making for the material selection problem.

Various Approaches to Improve Exclusion Performance of Non-similar Candidates from N-best Recognition Results on Isolated Word Recognition

  • 윤영선
    • 말소리와 음성과학
    • /
    • Vol. 2, No. 4
    • /
    • pp.153-161
    • /
    • 2010
  • Many isolated word recognition systems may generate non-similar words as recognition candidates because they use only acoustic information. Previous studies [1,2] investigated several techniques that exclude non-similar words from the N-best candidate words by applying the Levenshtein distance measure. This paper discusses various techniques for improving the removal of non-similar recognition results. The methods include comparison penalties or weights, phone accuracy based on confusion information, candidate weights by ranking order, and partial comparisons. The experimental results show that some of the proposed methods retain more accurate recognition results than the previous method.
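
The basic idea of excluding non-similar N-best candidates with Levenshtein distance can be sketched as follows. This is only an illustration under stated assumptions: candidates are compared with the top-ranked candidate, characters stand in for phone symbols, and the threshold value and the function name `filter_nbest` are hypothetical. The penalties, confusion-based weights, and partial comparisons discussed in the abstract are not modeled.

```python
def levenshtein(a, b):
    """Edit distance between two sequences with unit insert/delete/substitute costs."""
    prev = list(range(len(b) + 1))
    for i, item_a in enumerate(a, 1):
        cur = [i]
        for j, item_b in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,                       # deletion
                cur[j - 1] + 1,                    # insertion
                prev[j - 1] + (item_a != item_b),  # substitution or match
            ))
        prev = cur
    return prev[-1]


def filter_nbest(candidates, max_norm_dist=0.5):
    """Keep the top candidate plus those within a normalized distance of it.

    max_norm_dist is an assumed threshold on distance / longer length.
    """
    if not candidates:
        return []
    top = candidates[0]
    kept = [top]
    for cand in candidates[1:]:
        longest = max(len(top), len(cand)) or 1
        if levenshtein(top, cand) / longest <= max_norm_dist:
            kept.append(cand)
    return kept


# Hypothetical N-best list, best hypothesis first.
nbest = ["seoul", "seoil", "busan"]
print(filter_nbest(nbest))
```

Normalizing by the longer sequence keeps the threshold comparable across words of different lengths.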

Study of the effect of varying shapes of holes in energy absorption characteristics on aluminium circular windowed tubes under quasi-static loading

  • Baaskaran, N;Ponappa, K;Shankar, S
    • Structural Engineering and Mechanics
    • /
    • Vol. 70, No. 2
    • /
    • pp.153-168
    • /
    • 2019
  • In this paper, the energy absorption characteristics of circular windowed tubes with different section shapes (circular, ellipse, square, hexagon, polygon and pentagon) are investigated numerically and experimentally. The tubes share the same material, thickness, height, volume and average cross-sectional area and are subjected to axial and oblique quasi-static loading conditions. A numerical model was constructed with the FE code ABAQUS/Explicit, and the simulation results agree well with the experimental data. The energy absorbed, specific energy absorption, crash force efficiency, and peak and mean loads, along with the collapse modes and their initiation points, were evaluated for simple and windowed tubes. The technique for order of preference by similarity to ideal solution (TOPSIS) was employed to assess their overall crushing performance. The results confirm that the crash force indicators improve when windows are introduced, and that tubes with pentagonal and circular windows achieve the highest TOPSIS scores, about 0.528 and 0.517 respectively, which indicates that these are the best window shapes.
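
For readers unfamiliar with TOPSIS, the following is a minimal sketch of the crisp (non-fuzzy) procedure: normalize and weight the decision matrix, find the ideal and anti-ideal points, and score each alternative by its relative closeness to the ideal. The criteria, weights, and numbers in the example are hypothetical placeholders, not data from the paper, and the function name `topsis` is an assumption.

```python
import math


def topsis(matrix, weights, benefit):
    """Return one closeness score in [0, 1] per alternative (higher is better).

    matrix  : rows are alternatives, columns are criteria
    weights : one weight per criterion
    benefit : True where larger is better, False where smaller is better
    """
    n_crit = len(weights)
    # Step 1: vector-normalize each column, then apply the criterion weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) or 1.0
             for j in range(n_crit)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n_crit)]
         for row in matrix]
    # Step 2: ideal and anti-ideal points, criterion by criterion.
    cols = list(zip(*v))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    # Step 3: relative closeness to the ideal point.
    scores = []
    for row in v:
        d_pos = math.dist(row, ideal)
        d_neg = math.dist(row, anti)
        total = d_pos + d_neg
        scores.append(d_neg / total if total else 0.5)
    return scores


# Hypothetical example: columns are specific energy absorption (benefit)
# and peak load (cost); the values are made up for illustration.
shapes = ["circular", "square", "pentagon"]
matrix = [[12.0, 40.0], [10.0, 38.0], [13.0, 45.0]]
scores = topsis(matrix, weights=[0.6, 0.4], benefit=[True, False])
ranking = sorted(zip(shapes, scores), key=lambda kv: -kv[1])
print(ranking)
```

The closeness score is what gets reported as the ranking value, so an alternative that coincides with the ideal point scores 1 and one that coincides with the anti-ideal point scores 0.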

Comparison and Selection of Quantitative Priority in Battery Screening Group Based on Series Resistance/Fuzzy Logic

  • 조상우;한동호;최창기;김재원;김종훈
    • 전력전자학회논문지
    • /
    • Vol. 27, No. 2
    • /
    • pp.142-149
    • /
    • 2022
  • To increase the safety and usage of lithium-ion battery packs, it is important to reduce the parameter deviation between cells, such as voltage and temperature, within the battery pack. In this study, we propose a screening method to reduce parameter deviation between cells in battery packs. The screening algorithm is constructed based on fuzzy logic and quantitatively expresses the similarities between battery cells. Screening is applied using series resistance components obtained from electrical characteristic experiments that consider the operating status of battery packs. After screening, the standard deviation of the series resistance components according to the similarity ranking is compared and analyzed, and its conformity with the algorithm parameters is verified.

Performance Improvement of Web Information Retrieval Using Sentence-Query Similarity

  • 박의규;나동열;장명길
    • 한국정보과학회논문지:소프트웨어및응용
    • /
    • Vol. 32, No. 5
    • /
    • pp.406-415
    • /
    • 2005
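  • With the growth of the Internet, countless documents and pieces of information exist on the web, and web information retrieval technology, which finds the web documents containing the information a user wants, has become very important. This paper proposes several key techniques that are effective for improving the performance of web information retrieval systems. Existing systems mainly compute the similarity between a document and the query and use it as the main evidence. This paper goes one step further and proposes a technique that computes how similar each sentence in a document is to the query and uses that information. We introduce a method that computes this sentence-query similarity approximately, without mature natural language processing technology. We also show that this computation grows linearly with the number of documents, so it can be used in practical large-scale systems. The next key technique is introducing a hierarchical concept into the ranking of output documents. We show that this technique yields a considerable performance improvement. In addition, we show that hyperlink and title information, which are characteristic of web documents, bring some further improvement. To validate these techniques, we developed a large-scale web information retrieval system and ran experiments on it.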

Perception of Korean Vowels by English and Mandarin Learners of Korean: Effects of Acoustic Similarity Between L1 and L2 Sounds and L2 Experience

  • 류나영
    • 한국어교육
    • /
    • Vol. 29, No. 1
    • /
    • pp.1-23
    • /
    • 2018
  • This paper investigates how adult Mandarin- and English-speaking learners of Korean perceive Korean vowels, with a focus on the effect of the acoustic relationship between the first language (L1) and the second language (L2), as well as the influence of Korean language experience. For this study, native Mandarin and Canadian English speakers who have learned Korean as a foreign language, as well as a control group of native Korean speakers, participated in two experiments. Experiment 1 was designed to examine acoustic similarities between Korean and English vowels, as well as between Korean and Mandarin vowels, to predict which Korean vowels are relatively easy or difficult for L2 learners to perceive. The linear discriminant analysis (Klecka, 1980) based on L1-L2 acoustic similarity predicted that L2 Mandarin learners would have perceptual difficulty rankings for Korean vowels as follows: (the easiest) /i, a, e/ >> /ɨ, ʌ, o, u/ (most difficult), whereas L2 English learners would have perceptual difficulty rankings for Korean vowels as follows: (the easiest) /i, a, e, ɨ, ʌ/ >> /o, u/ (most difficult). The goal of Experiment 2 was to test how accurately L2 Mandarin and English learners perceive the Korean vowels /ɨ, ʌ, o, u/, which are considered to be difficult for L2 learners. The results of a mixed-effects logistic model revealed that English listeners showed higher identification accuracy for Korean vowels than Mandarin listeners, indicating that having a larger L1 vowel inventory than the L2 facilitates L2 vowel perception. However, both groups have the same ranking of Korean vowel perceptual difficulty: ɨ > ʌ > u > o. This finding indicates that adult learners of Korean can perceive the new vowel /ɨ/, which does not exist in their L1, more accurately than the vowel /o/, which is acoustically similar to vowels in their L1, suggesting that L2 learners are more likely to establish additional phonetic categories for new vowels.
In terms of the influence of L2 experience, identification accuracy was found to increase with Korean language experience. In other words, the more experienced English and Mandarin learners of Korean are, the more likely they are to identify Korean vowels accurately compared with less experienced learners. Moreover, there is no interaction between L1 background and L2 experience, showing that identification accuracy for Korean vowels rises with Korean language experience regardless of L1 background. Overall, the findings of the two experiments demonstrate that acoustic similarity between L1 and L2 sounds, estimated with the LDA model, can only partially predict perceptual difficulty in L2 acquisition, indicating that other factors, such as perceptual similarity between L1 and L2 and the merger of Korean /o/ and /u/, may also influence Korean vowel perception.

A New Similarity Measure for Improving Ranking in QA Systems

  • 김명관;박영택
    • 한국정보과학회논문지:컴퓨팅의 실제 및 레터
    • /
    • Vol. 10, No. 6
    • /
    • pp.529-536
    • /
    • 2004
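  • To improve the performance of question answering (QA) systems, this paper proposes a new query-document similarity measure that adjusts the ranking of answers to a query by using sentence position information and a question-type classifier. First, conceptual graphs are used to represent document content and to reflect positional information in the document. The measure is based on the Dice coefficient, which is widely used for document comparison, and incorporates the positions of words within a sentence. Second, to improve answer ranking, questions are classified by machine learning that takes the question type into account; for this, a naive Bayes classifier was implemented using 30,000 newsgroup FAQ documents. For evaluation, experiments were run on data submitted to the QA track of TREC-9, an international information retrieval evaluation. Compared with existing methods, and even though automatic learning was used, the method achieved a mean reciprocal rank of 0.29 and placed a correct answer in the top five for 55.1% of the questions. The significance of this method is that, unlike other systems, question-type classification is learned automatically with machine learning.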
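
The two evaluation figures quoted above, mean reciprocal rank and the share of questions with a correct answer in the top five, can be computed as in the sketch below. The input format (one rank per question, `None` when no correct answer was returned) and the sample ranks are assumptions for illustration, not the paper's data.

```python
def mean_reciprocal_rank(ranks):
    """Average of 1/rank of the first correct answer; a missing answer counts as 0."""
    return sum(1.0 / r for r in ranks if r is not None) / len(ranks)


def top_k_rate(ranks, k=5):
    """Fraction of questions whose first correct answer is ranked within the top k."""
    return sum(1 for r in ranks if r is not None and r <= k) / len(ranks)


# Hypothetical 1-based ranks of the first correct answer for four questions.
ranks = [1, 2, None, 4]
print(mean_reciprocal_rank(ranks))
print(top_k_rate(ranks, k=5))
```

Mean reciprocal rank rewards placing the correct answer near the top, while the top-k rate only records whether it appears in the list at all.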

A decision making framework model for the selection of a RP using hybrid multiple attribute decision making techniques

  • 변홍석
    • 한국기계가공학회지
    • /
    • Vol. 7, No. 3
    • /
    • pp.87-95
    • /
    • 2008
  • The purpose of this study is to provide decision support for selecting an appropriate rapid prototyping (RP) machine that suits the application of a part. Selection factors that greatly affect the performance of RP machines include the concept model, form/fit/functional model, pattern model for molding, material property, build time, and part cost. However, selecting an RP machine is not an easy decision because these factors are uncertain and vague. For this reason, this research proposes hybrid multiple attribute decision making approaches to evaluate RP machines effectively. In addition, because subjective considerations are relevant to the selection decision, a fuzzy logic approach is adopted. The proposed selection procedure consists of several steps. First, we identify the RP machines that the users consider. After constructing the evaluation criteria, we calculate the weights of the criteria by applying the fuzzy Analytic Hierarchy Process (AHP) method. Finally, we apply the fuzzy Technique for Order Preference by Similarity to Ideal Solution (TOPSIS) method to obtain the ranking order of all machines, which provides the decision information for selecting an RP machine.
