• Title/Summary/Keyword: Semantic Similarity

Video Data Modeling for Supporting Structural and Semantic Retrieval (구조 및 의미 검색을 지원하는 비디오 데이타의 모델링)

  • 복경수;유재수;조기형
    • Journal of KIISE:Databases / v.30 no.3 / pp.237-251 / 2003
  • In this paper, we propose a video retrieval system that efficiently searches the logical structure and semantic contents of video data. The proposed system employs a layered modelling method that organizes video data into a raw data layer, a content layer, and a key frame layer. The content layer represents the logical structures and semantic contents of the video data. The system also supports various types of search, such as text search, visual-feature-based similarity search, spatio-temporal-relationship-based similarity search, and semantic content search.

Semantic-Based K-Means Clustering for Microblogs Exploiting Folksonomy

  • Heu, Jee-Uk
    • Journal of Information Processing Systems / v.14 no.6 / pp.1438-1444 / 2018
  • Recently, with the development of Internet technologies and the spread of smart devices, the use of microblogs such as Facebook, Twitter, and Instagram has been increasing rapidly. Many users check for new information on microblogs because the content on their timelines is continually updated. Clustering algorithms are therefore necessary to group microblog content for users who want the newest information. However, microblogs have word limits, so there is not enough information to analyze for content clustering. In this paper, we propose a semantic-based K-means clustering algorithm that measures not only the similarity between data represented in a vector space model, but also the semantic similarity between the data, by exploiting the TagCluster for clustering. Through experimental results on the RepLab2013 Twitter dataset, we show the effectiveness of the semantic-based K-means clustering algorithm.
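
As a rough illustration of combining a vector-space score with a semantic score during cluster assignment, the sketch below mixes cosine similarity over term vectors with Jaccard overlap over tag sets. The Jaccard term is only a stand-in for the TagCluster-based measure, which the abstract does not specify; the function names and the weight `alpha` are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length term vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def jaccard(tags_a, tags_b):
    """Tag-set overlap; an assumed stand-in for a semantic similarity measure."""
    a, b = set(tags_a), set(tags_b)
    return len(a & b) / len(a | b) if a | b else 0.0

def combined_similarity(vec_a, vec_b, tags_a, tags_b, alpha=0.5):
    """Weighted mix of vector-space and tag-based similarity (alpha is hypothetical)."""
    return alpha * cosine(vec_a, vec_b) + (1 - alpha) * jaccard(tags_a, tags_b)

def assign_cluster(vec, tags, centroids, alpha=0.5):
    """Return the index of the most similar centroid; each centroid is (vector, tag set)."""
    scores = [combined_similarity(vec, c_vec, tags, c_tags, alpha)
              for c_vec, c_tags in centroids]
    return max(range(len(scores)), key=scores.__getitem__)

centroids = [([1.0, 0.0], {"sports"}), ([0.0, 1.0], {"music"})]
# The term vector alone leans toward cluster 0, but the shared tag pulls the post to cluster 1.
print(assign_cluster([0.6, 0.4], {"music"}, centroids))
```

Setting `alpha=1.0` reduces the assignment to plain vector-space K-means, which makes the effect of the semantic term easy to compare.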

Korean Compound Noun Decomposition and Semantic Tagging System using User-Word Intelligent Network (U-WIN을 이용한 한국어 복합명사 분해 및 의미태깅 시스템)

  • Lee, Yong-Hoon;Ock, Cheol-Young;Lee, Eung-Bong
    • The KIPS Transactions:PartB / v.19B no.1 / pp.63-76 / 2012
  • We propose a Korean compound noun semantic tagging system that uses statistical compound noun decomposition and semantic relation information extracted from a lexical semantic network (U-WIN) and dictionary definitions. The system consists of three phases: compound noun decomposition, semantic constraint, and semantic tagging. In the compound noun decomposition phase, the best candidates are selected using noun location frequencies extracted from the Sejong corpus; nouns are then re-decomposed for the semantic constraint phase, and foreign nouns are restored. The semantic constraint phase finds possible semantic combinations using origin information in the dictionary and a Naive Bayes classifier, in order to decrease the computation time and increase the accuracy of semantic tagging. The semantic tagging phase calculates the semantic similarity between decomposed nouns and decides the semantic tags. We constructed an experimental data set of 40,717 compound nouns from the Standard Korean Language Dictionary, each consisting of more than 3 characters and semantically tagged. In the experiments, the accuracy of compound noun decomposition is 99.26%, and the accuracy of semantic tagging is 95.38%.

Conceptual Retrieval of Chinese Frequently Asked Healthcare Questions

  • Liu, Rey-Long;Lin, Shu-Ling
    • International Journal of Knowledge Content Development & Technology / v.5 no.1 / pp.49-68 / 2015
  • Given a query (a health question), retrieval of relevant frequently asked questions (FAQs) is essential, as the FAQs provide both reliable and readable information to healthcare consumers. The retrieval requires estimating the semantic similarity between the query and each FAQ. This estimation is challenging because the semantic structures of Chinese healthcare FAQs are quite different from those of FAQs in other domains. In this paper, we propose a conceptual model for Chinese healthcare FAQs and, based on this model, present a technique, ECA, that estimates conceptual similarities between FAQs. Empirical evaluation shows that ECA can help various kinds of retrievers rank relevant FAQs significantly higher. We also make ECA available online to provide services for FAQ retrievers.

Semantic Process Retrieval with Similarity Algorithms (유사도 알고리즘을 활용한 시맨틱 프로세스 검색방안)

  • Lee, Hong-Ju;Klein, Mark
    • Proceedings of the Korean Operations and Management Science Society Conference / 2007.11a / pp.267-272 / 2007
  • One of the roles of Semantic Web services is to execute dynamic intra-organizational services, including the integration and interoperation of business processes. Since different organizations design their processes differently, the retrieval of similar semantic business processes is necessary in order to support inter-organizational collaborations. Most approaches for finding services that have certain features and support certain business processes have relied on some type of logical reasoning and exact matching. This paper presents our approach of using imprecise matching for expanding results from an exact matching engine to query the OWL MIT Process Handbook. In order to use the MIT Process Handbook for process retrieval experiments, we had to export it into an OWL-based format. We model the Process Handbook meta-model in OWL and export the processes in the Handbook as instances of the meta-model. Next, we need to find a sizable number of queries and their corresponding correct answers in the Process Handbook. We devise diverse similarity algorithms based on the values of process attributes and the structures of business processes. We perform retrieval experiments to compare the performance of the devised similarity algorithms.

The Strength of the Relationship between Semantic Similarity and the Subcategorization Frames of the English Verbs: a Stochastic Test based on the ICE-GB and WordNet (영어 동사의 의미적 유사도와 논항 선택 사이의 연관성 : ICE-GB와 WordNet을 이용한 통계적 검증)

  • Song, Sang-Houn;Choe, Jae-Woong
    • Language and Information / v.14 no.1 / pp.113-144 / 2010
  • The primary goal of this paper is to find a feasible way to answer the question: Does the similarity in meaning between verbs relate to the similarity in their subcategorization? In order to answer this question concretely on the basis of a large set of English verbs, this study made use of various language resources, tools, and statistical methodologies. We first compiled a list of 678 verbs selected from the most and second most frequent word lists of the Collins COBUILD English Dictionary that also appear in WordNet 3.0. We calculated similarity measures between all pairs of the words using the 'jcn' algorithm (Jiang and Conrath, 1997) implemented in the WordNet::Similarity module (Pedersen, Patwardhan, and Michelizzi, 2004). The clustering process followed: first building similarity matrices from the similarity measure values, next drawing dendrograms on the basis of the matrices, and finally obtaining 177 meaningful clusters (covering 437 verbs) that passed a certain level set by z-score. The subcategorization frames and their frequency values were taken from the ICE-GB. In order to calculate the Selectional Preference Strength (SPS) of the relationship between a verb and its subcategorizations, we relied on the Kullback-Leibler Divergence model (Resnik, 1996). The SPS values of the verbs in the same cluster were compared with each other, which yielded statistical values that indicate how much the SPS values overlap between the subcategorization frames of the verbs. Our final analysis shows that the degree of overlap, that is, the relationship between semantic similarity and the subcategorization frames of English verbs, is evenly spread from 'very strongly related' to 'very weakly related'. Some semantically similar verbs share a lot in terms of their subcategorization frames, some others show an average degree of strength in the relationship, and the rest, though still semantically similar, tend to share little in their subcategorization frames.
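
The Kullback-Leibler formulation of selectional preference strength mentioned above can be sketched as the divergence between the frame distribution conditioned on a verb and the overall frame distribution. The snippet below is a minimal illustration under that reading; the frame labels and probabilities are invented for the example and are not taken from the ICE-GB.

```python
import math

def selectional_preference_strength(p_frame_given_verb, p_frame):
    """KL divergence (in bits) of P(frame | verb) from the prior P(frame).

    Both arguments are dicts mapping a frame label to a probability.
    Frames with zero conditional probability contribute nothing to the sum.
    """
    return sum(
        p * math.log2(p / p_frame[frame])
        for frame, p in p_frame_given_verb.items()
        if p > 0
    )

# Hypothetical distributions over two subcategorization frames.
prior = {"NP": 0.5, "PP": 0.5}
picky_verb = {"NP": 1.0, "PP": 0.0}    # only ever takes an NP complement
neutral_verb = {"NP": 0.5, "PP": 0.5}  # mirrors the prior exactly

print(selectional_preference_strength(picky_verb, prior))
print(selectional_preference_strength(neutral_verb, prior))
```

A verb whose frame distribution matches the prior scores zero, while a verb that concentrates on a single frame scores higher, which is the sense in which the value measures the strength of a verb's preference.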

Ontology Selection Ranking Model based on Semantic Similarity Approach (의미적 유사성에 기반한 온톨로지 선택 랭킹 모델)

  • Oh, Sun-Ju;Ahn, Joong-Ho;Park, Jin-Soo
    • The Journal of Society for e-Business Studies / v.14 no.2 / pp.95-116 / 2009
  • Ontologies support the integration of heterogeneous and distributed information. More and more ontologies and tools have been developed in various domains. However, building ontologies requires much time and effort, so ontologies need to be shared and reused among users. In particular, users benefit from being able to find the desired ontology in an ontology repository. Most past studies on retrieving and ranking ontologies have focused on lexical-level support, which makes it impossible to find an ontology that includes the concepts users want at the semantic level. Most ontology libraries and ontology search engines have not provided semantic matching capability. Retrieving the ontology that users want requires a new ontology selection and ranking mechanism based on semantic similarity matching. We propose an ontology selection and ranking model consisting of selection criteria and metrics with enhanced semantic matching capabilities. The proposed model has two novel features that distinguish it from previous research models. First, it enhances the ontology selection and ranking method practically and effectively by enabling semantic matching of taxonomy or relational linkage between concepts. Second, it identifies which measures should be used to rank ontologies in a given context and what weight should be assigned to each selection measure.

Semantic Image Retrieval Using Color Distribution and Similarity Measurement in WordNet (컬러 분포와 WordNet상의 유사도 측정을 이용한 의미적 이미지 검색)

  • Choi, Jun-Ho;Cho, Mi-Young;Kim, Pan-Koo
    • The KIPS Transactions:PartB / v.11B no.4 / pp.509-516 / 2004
  • Semantic interpretation of an image is incomplete without some mechanism for understanding semantic content that is not directly visible. For this reason, human-assisted content annotation attaches a natural-language textual description to an image. However, keyword-based retrieval operates at the level of syntactic pattern matching. In other words, dissimilarity computation among terms is usually done by string matching, not concept matching. In this paper, we propose a method for computerized semantic similarity calculation in WordNet space. We consider the edge, depth, link type, and density, as well as the existence of common ancestors. We also introduce a method that applies this similarity measurement to semantic image retrieval. To combine it with low-level features, we use a spatial color distribution model. When tested on an image set from Microsoft's 'Design Gallery Line', the proposed method outperforms other approaches.
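
To make the role of depth and common ancestors concrete, the sketch below computes the Wu-Palmer score on a tiny hand-built taxonomy. This is a swapped-in, well-known measure used purely for illustration; it is not the measure proposed in the paper, which also weighs link type and density, and the toy hierarchy is invented rather than taken from WordNet.

```python
# Toy is-a hierarchy: child -> parent (the root has parent None).
PARENT = {
    "entity": None,
    "animal": "entity",
    "dog": "animal",
    "cat": "animal",
    "vehicle": "entity",
    "car": "vehicle",
}

def path_to_root(concept):
    """Return the list [concept, parent, ..., root]."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path

def depth(concept):
    """Depth counted in nodes, so the root has depth 1."""
    return len(path_to_root(concept))

def lowest_common_ancestor(a, b):
    """Deepest concept that subsumes both a and b."""
    ancestors_of_b = set(path_to_root(b))
    for node in path_to_root(a):  # walks upward from a, so the first hit is the deepest
        if node in ancestors_of_b:
            return node
    raise ValueError("concepts do not share a root")

def wu_palmer(a, b):
    """2 * depth(lca) / (depth(a) + depth(b)), in the range (0, 1]."""
    lca = lowest_common_ancestor(a, b)
    return 2 * depth(lca) / (depth(a) + depth(b))

print(wu_palmer("dog", "cat"))  # siblings under "animal"
print(wu_palmer("dog", "car"))  # related only through the root
```

Unlike string matching, this kind of score rates "dog" and "cat" as closer than "dog" and "car" even though none of the labels share characters.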

Ontology Semantic Mapping based Data Integration of CAD and PDM System (온톨로지 의미 매핑 기반 CAD 및 PDM 시스템 정보 통합)

  • Lee Min-Jung;Jung Won-Cheol;Lee Jae-Hyun;Suh Hyo-Won
    • Proceedings of the Korean Society of Precision Engineering Conference / 2005.06a / pp.181-186 / 2005
  • In a collaborative environment, the participants need to share the same understanding of the semantics of terms. For example, they should know that 'Part' and 'Item' are different word expressions for the same meaning. In this paper, we consider sharing between CAD and PDM data. To handle such problems in information sharing, an information system needs to recognize automatically that the terms have the same semantics. For this purpose, this paper describes a semantic mapping logic and an ontology-based mapper system. For the semantic mapping logic, we introduce our logic, which consists of four modules: Character Matching, Instance Reasoning, Definition Comparing, and Similarity Checking. For the ontology-based mapper, we introduce the system architecture and the mapping procedure.

Web Site Keyword Selection Method by Considering Semantic Similarity Based on Word2Vec (Word2Vec 기반의 의미적 유사도를 고려한 웹사이트 키워드 선택 기법)

  • Lee, Donghun;Kim, Kwanho
    • The Journal of Society for e-Business Studies / v.23 no.2 / pp.83-96 / 2018
  • Extracting keywords that represent documents is very important because they can be used for automated services such as document search, classification, and recommendation, as well as for quickly conveying document information. However, extracting keywords based on the frequency of words appearing in a web site's documents, or with graph algorithms based on word co-occurrence, has two problems: the web page structure may contain various words that are unrelated to the topic, and semantic keywords are difficult to extract because of the limited performance of the Korean tokenizer. In this paper, we propose a method that selects candidate keywords based on semantic similarity, addressing both the failure to extract semantic keywords and the poor accuracy of Korean tokenizer analysis. Finally, we extract the final semantic keywords through a filtering process that removes inconsistent keywords. Experimental results on real web pages of small businesses show that the proposed method improves performance by 34.52% over the statistical-similarity-based keyword selection technique. This confirms that keyword extraction from documents improves when the semantic similarity between words is considered and inconsistent keywords are removed.
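
One plausible reading of the filtering step is to drop candidates whose embeddings sit far from the rest of the candidate set. The sketch below keeps a candidate only if its average cosine similarity to the other candidates reaches a threshold. The two-dimensional vectors are invented stand-ins for Word2Vec embeddings, and the threshold and function names are hypothetical; the paper's actual filtering rule is not given in the abstract.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def filter_inconsistent(embeddings, threshold=0.5):
    """Keep words whose mean similarity to the other candidates is >= threshold."""
    kept = []
    for word, vec in embeddings.items():
        others = [v for w, v in embeddings.items() if w != word]
        if not others:
            kept.append(word)  # a lone candidate has nothing to be inconsistent with
            continue
        mean_sim = sum(cosine(vec, o) for o in others) / len(others)
        if mean_sim >= threshold:
            kept.append(word)
    return kept

# Toy embeddings: two topical words and one page-structure word.
candidates = {
    "coffee": [1.0, 0.1],
    "latte": [0.9, 0.2],
    "login": [0.0, 1.0],
}
print(filter_inconsistent(candidates))
```

In this toy set, "login" is dropped because it is dissimilar to both topical words, which mirrors the abstract's concern about off-topic words picked up from the page structure.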