• Title/Abstract/Keywords: Semantic entity

67 search results (processing time: 0.045 s)

An Effect of Semantic Relatedness on Entity Disambiguation: Using Korean Wikipedia

  • 강인수
    • 한국지능시스템학회논문지 / Vol. 25, No. 2 / pp. 111-118 / 2015
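  • Entity linking is the task of connecting entity mentions that appear in text to entries in a knowledge base such as Wikipedia. Because different entities can share the same mention, entity linking must resolve the ambiguity of mentions. Recent work on entity disambiguation has mainly combined the semantic relatedness of co-occurring entities with entity prior probabilities and co-occurring term information. Despite this heavy use of semantic relatedness, however, few studies have analyzed the pure effect of relatedness-based methods on entity disambiguation. This study presents an experimental evaluation, using Korean Wikipedia data, of semantic-relatedness-based entity disambiguation methods from three perspectives: differences among relatedness measures such as NGD, PMI, Jaccard, Dice, and Simpson; differences in the degree of ambiguity within the co-occurring entity set; and differences between individual and collective disambiguation.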
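
The five relatedness measures named in this abstract can be illustrated with a minimal sketch. It assumes each entity is represented by the set of pages that link to it and that `n` is the total number of pages; the abstract does not state how the paper computes the measures, so the set representation, the function names, and the natural-log base are all assumptions made for illustration.

```python
import math

# Assumption: a and b are sets of page IDs linking to two entities,
# and n is the total number of pages (n must exceed both set sizes).

def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def dice(a, b):
    """Twice the intersection size divided by the sum of set sizes."""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0

def simpson(a, b):
    """Overlap coefficient: intersection size over the smaller set size."""
    smaller = min(len(a), len(b))
    return len(a & b) / smaller if smaller else 0.0

def pmi(a, b, n):
    """Pointwise mutual information: log of P(a, b) / (P(a) * P(b))."""
    common = len(a & b)
    if common == 0:
        return float("-inf")  # the two entities never co-occur
    return math.log(n * common / (len(a) * len(b)))

def ngd(a, b, n):
    """Normalized Google Distance; smaller values mean more related."""
    common = len(a & b)
    if common == 0:
        return float("inf")  # the two entities never co-occur
    la, lb = math.log(len(a)), math.log(len(b))
    return (max(la, lb) - math.log(common)) / (math.log(n) - min(la, lb))
```

Note that NGD is a distance while the other four are similarities, so a system that mixes them would need to convert one direction into the other (for example, by using `1 - ngd(...)`), which is again an assumption rather than something the abstract specifies.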

Semantic Object Modeling for Shopping Mall Database Design

  • 전태보;김기동;오준형
    • 산업기술연구 / Vol. 25, No. A / pp. 123-131 / 2005
  • The semantic object model has been widely recognized as an alternative to the entity-relationship model for data modeling in database system design. In this study, we present a semantic object model for an intermediary-type shopping mall consisting of multiple buyers and sellers. We considered the essential processes and information involved in customer management, product management, price estimation, product ordering, and so on. After careful examination and analysis of these, detailed semantic objects and attributes were derived and structured into semantic object diagrams. The final objects were converted into an entity-relationship diagram so that an intuitive comparison could be made for relational database design. The results of this study may form a conceptual framework for both academic work and more complicated system applications.

MSFM: Multi-view Semantic Feature Fusion Model for Chinese Named Entity Recognition

  • Liu, Jingxin;Cheng, Jieren;Peng, Xin;Zhao, Zeli;Tang, Xiangyan;Sheng, Victor S.
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 16, No. 6 / pp. 1833-1848 / 2022
  • Named entity recognition (NER) is an important basic task in the field of Natural Language Processing (NLP). Recently, deep learning approaches that extract word segmentation or character features have proved effective for Chinese Named Entity Recognition (CNER). However, because such methods extract only some of the available features, they fail to mine textual information from multiple perspectives and dimensions, so the model cannot fully capture semantic features. To tackle this problem, we propose a novel Multi-view Semantic Feature Fusion Model (MSFM). The proposed model consists of two core components: the Multi-view Semantic Feature Fusion Embedding Module (MFEM) and the Multi-head Self-Attention Mechanism Module (MSAM). Specifically, the MFEM extracts character features, word boundary features, radical features, and pinyin features of Chinese characters. The acquired font shape, font sound, and font meaning features are fused to enhance the semantic information of Chinese characters at different granularities. Moreover, the MSAM is used to capture the dependencies between characters in a multi-dimensional subspace to better understand the semantic features of the context. Extensive experimental results on four benchmark datasets show that our method improves the overall performance of the CNER model.

Semantic-based Mashup Platform for Contents Convergence

  • Yongju Lee;Hongzhou Duan;Yuxiang Sun
    • International journal of advanced smart convergence / Vol. 12, No. 2 / pp. 34-46 / 2023
  • The growing number of large-scale knowledge graphs raises several issues regarding how knowledge graph data can be organized, discovered, and integrated efficiently. We present a novel semantic-based mashup platform for contents convergence, which consists of acquisition, RDF storage, ontology learning, and mashup subsystems. This platform serves as a basis for developing other, more sophisticated applications required in the area of knowledge big data. Moreover, this paper proposes an entity matching method using graph convolutional network techniques as preliminary work toward automatic classification and discovery on knowledge big data. Using the real DBP15K and SRPRS datasets, the performance of our method is compared with some existing entity matching methods. The experimental results show that the proposed method outperforms existing methods due to its ability to increase accuracy and reduce training time.

A comparative study of Entity-Grid and LSA models on Korean sentence ordering

  • 김영삼;김홍기;신효필
    • 인지과학 / Vol. 24, No. 4 / pp. 301-321 / 2013
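  • This paper studies sentence ordering, one of the foundational techniques for measuring text coherence and for automatic text generation systems. It applies both the Entity-Grid model, a type of entity-based approach, and LSA (Latent Semantic Analysis), which is based on the vector space model, and compares their results. We tested how the syntactic role information of nouns, discussed in earlier work on the Entity-Grid model, affects the Korean text ordering task, and unlike earlier applied research on German, we obtained positive results. In doing so we adopted a strategy that exploits Korean case particles, which shows that Korean case-marking information can be useful for measuring the coherence of Korean text. We also compare the Entity-Grid results with those of the LSA-based model and discuss the strengths and weaknesses of both models and directions for future improvement.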
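
As a rough illustration of the entity-grid idea mentioned in this abstract, the sketch below builds a grid of syntactic roles per entity per sentence and turns it into role-transition probabilities, which is the usual feature representation for such models. The role labels (`S`, `O`, `X`, `-`), the function names, and the input format are assumptions for illustration; deriving roles from Korean case particles, as the abstract describes, is assumed to have happened upstream and is not shown.

```python
from collections import Counter
from itertools import product

# Assumed role inventory: subject, object, other, absent.
ROLES = ("S", "O", "X", "-")

def build_grid(sentences):
    """Map each entity to its role in every sentence ('-' when absent).

    `sentences` is a list of dicts, one per sentence, from entity to role.
    """
    entities = sorted({entity for sent in sentences for entity in sent})
    return {e: [sent.get(e, "-") for sent in sentences] for e in entities}

def transition_probs(grid, length=2):
    """Relative frequency of each role transition of the given length."""
    counts = Counter()
    for column in grid.values():
        for i in range(len(column) - length + 1):
            counts[tuple(column[i:i + length])] += 1
    total = sum(counts.values())
    return {
        t: (counts[t] / total if total else 0.0)
        for t in product(ROLES, repeat=length)
    }

# Hypothetical three-sentence text with two entities.
text = [{"Kim": "S", "book": "O"}, {"Kim": "S"}, {"book": "X"}]
features = transition_probs(build_grid(text))
```

A sentence-ordering system would typically compute such a feature vector for each candidate ordering and rank the orderings with a learned model; that ranking step is outside this sketch.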

A Multi-Resolution Approach to Restaurant Named Entity Recognition in Korean Web

  • Kang, Bo-Yeong;Kim, Dae-Won
    • International Journal of Fuzzy Logic and Intelligent Systems / Vol. 12, No. 4 / pp. 277-284 / 2012
  • Named entity recognition (NER) techniques can play a crucial role in extracting information from the web. While NER systems with relatively high performance have been developed based on careful manipulation of terms with a statistical model, term mismatches often degrade the performance of such systems because the strings of all the candidate entities are not known a priori. Despite the importance of lexical-level term mismatches for NER systems, most NER approaches developed to date use only the term string itself and simple term-level features, and do not exploit the semantic features of terms, which can handle term variations effectively. As a solution to this problem, we propose to match the semantic concepts of term units in restaurant named entities (NEs), where these units are automatically generated from multiple resolutions of a semantic tree. As a test experiment, we applied our restaurant NER scheme to 49,153 nouns in Korean restaurant web pages. Our scheme achieved an average accuracy of 87.89% when applied to test data, which was considerably better than the 78.70% accuracy obtained using the baseline system.

Standard Terminology System Referenced by 3D Human Body Model

  • Choi, Byung-Kwan;Lim, Ji-Hye
    • Journal of information and communication convergence engineering / Vol. 17, No. 2 / pp. 91-96 / 2019
  • In this study, a system to increase the expressiveness of existing standard terminology using three-dimensional (3D) data is designed. We analyze the existing medical terminology system by searching the reference literature and perform an expert group focus survey. A human body image is generated using a 3D modeling tool. Then, the anatomical position of the human body is mapped to the 3D coordinates' identification (ID) and metadata. We define the term to represent the 3D human body position in a total of 12 categories, including semantic terminology entity and semantic disorder. The Blender and 3ds Max programs are used to create the 3D model from medical imaging data. The generated 3D human body model is expressed by the ID of the coordinate type (x, y, and z axes) based on the anatomical position and mapped to the semantic entity including the meaning. We propose a system of standard terminology enabling integration and utilization of the 3D human body model, coordinates (ID), and metadata. In the future, through cooperation with the Electronic Health Record system, we will contribute to clinical research to generate higher-quality big data.

Semantic Modeling for SNPs Associated with Ethnic Disparities in HapMap Samples

  • Kim, HyoYoung;Yoo, Won Gi;Park, Junhyung;Kim, Heebal;Kang, Byeong-Chul
    • Genomics & Informatics / Vol. 12, No. 1 / pp. 35-41 / 2014
  • Single-nucleotide polymorphisms (SNPs) have been emerging out of the efforts to research human diseases and ethnic disparities. A semantic network is needed for in-depth understanding of the impacts of SNPs, because phenotypes are modulated by complex networks, including biochemical and physiological pathways. We identified ethnicity-specific SNPs by eliminating overlapping SNPs from HapMap samples, and the ethnicity-specific SNPs were mapped to the UCSC RefGene lists. Ethnicity-specific genes were identified as follows: 22 genes in the USA (CEU) individuals, 25 genes in the Japanese (JPT) individuals, and 332 genes in the African (YRI) individuals. To analyze the biologically functional implications of ethnicity-specific SNPs, we focused on constructing a semantic network model. The entities of the network were "Gene," "Pathway," "Disease," "Chemical," "Drug," "ClinicalTrials," and "SNP," and the entity-entity relationships were obtained through curation. Our semantic modeling for ethnicity-specific SNPs showed interesting results in three categories, including three diseases ("AIDS-associated nephropathy," "Hypertension," and "Pelvic infection"), one drug ("Methylphenidate"), and five pathways ("Hemostasis," "Systemic lupus erythematosus," "Prostate cancer," "Hepatitis C virus," and "Rheumatoid arthritis"). We found ethnicity-specific genes using the semantic modeling, and the majority of our findings were consistent with previous studies: an understanding of genetic variability explained ethnicity-specific disparities.

Acquisition of Named-Entity-Related Relations for Searching

  • Nguyen, Tri-Thanh;Shimazu, Akira
    • 한국언어정보학회: Conference Proceedings / 한국언어정보학회 2007 Regular Conference / pp. 349-357 / 2007
  • Named entities (NEs) are important in many Natural Language Processing (NLP) applications, and discovering NE-related relations in texts may be beneficial for these applications. This paper proposes a method to extract the ISA relation between a "named entity" and its category, and an IS-RELATED-TO relation between the category and its related object. We extend the pattern extraction algorithm "Person Category Extraction" (PCE) to solve this problem. Our experiments on the Wall Street Journal (WSJ) corpus show promising results. We also demonstrate a possible application of these relations by using them for semantic search.

Analysis and Modeling of Semantic Relationships in e-Catalog Domain

  • 이민정;이현자;심준호
    • 한국전자거래학회지 / Vol. 9, No. 3 / pp. 243-258 / 2004
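  • Building an ontology suited to each domain is one way of realizing the Semantic Web, and it enriches the semantic information available to applications in that domain. In e-commerce, a catalog stores and manages various information about products and services, such as price, characteristics, and conditions; it holds not only information on individual products but also various information implicit in the relationships among products. To apply an ontology to the e-catalog domain, it is therefore worthwhile to first analyze the semantic relationships that exist in the catalog domain. This paper analyzes the semantic relationships of catalogs through a classification scheme and shows how each relationship can be modeled in an ontology. The basic modeling technique is EER (Extended Entity-Relationship), but the model is not limited to it: it is ultimately expressed in Description Logics so that the ontology model can be used for reasoning.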
