• Title/Summary/Keyword: Entity

A Method to Solve the Entity Linking Ambiguity and NIL Entity Recognition for efficient Entity Linking based on Wikipedia (위키피디아 기반의 효과적인 개체 링킹을 위한 NIL 개체 인식과 개체 연결 중의성 해소 방법)

  • Lee, Hokyung;An, Jaehyun;Yoon, Jeongmin;Bae, Kyoungman;Ko, Youngjoong
    • Journal of KIISE
    • /
    • v.44 no.8
    • /
    • pp.813-821
    • /
    • 2017
  • Entity linking finds the meaning of an entity mention in a user's query by linking the mention, which may refer to the entity through different expressions, to the corresponding entity in a knowledge base. The task poses four challenges: the difficulty of constructing a knowledge base, multiple surface forms of an entity mention, ambiguity in entity linking, and NIL entity recognition. In this paper, we first construct an entity name dictionary from Wikipedia to build the knowledge base and to handle multiple surface forms. We then propose several methods for NIL entity recognition and resolve entity linking ambiguity by training a support vector machine on several features: context similarity, semantic relevance, clue word score, named entity type similarity of the mention, entity name matching score, and entity popularity score. We apply the two proposed methods sequentially on the constructed knowledge base to obtain good entity linking performance. In the experiments, our system achieved F1 scores of 83.66% and 90.81% for NIL entity recognition and for resolving entity linking ambiguity.
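The two-stage flow described in this abstract (NIL recognition, then feature-based disambiguation) might be sketched roughly as follows. This is only an illustration under stated assumptions: the paper trains a support vector machine, whereas this sketch substitutes a hand-weighted linear score, and the dictionary contents, feature values, weights, and `nil_threshold` are all hypothetical.

```python
# Feature names follow the abstract; the values and weights below are invented.
FEATURES = (
    "context_similarity", "semantic_relevance", "clue_word",
    "type_similarity", "name_match", "popularity",
)

# Hypothetical entity name dictionary: surface form -> candidate entity ids.
NAME_DICTIONARY = {"apple": ["Apple_Inc.", "Apple_(fruit)"]}


def score(features, weights):
    """Weighted sum of feature values (a stand-in for the paper's SVM)."""
    return sum(weights[name] * features.get(name, 0.0) for name in weights)


def link(mention, feature_table, weights, nil_threshold=0.5):
    """Return the best entity id for a mention, or "NIL"."""
    candidate_ids = NAME_DICTIONARY.get(mention.lower(), [])
    if not candidate_ids:
        # Stage 1: no dictionary entry, so treat the mention as a NIL entity.
        return "NIL"
    # Stage 2: rank the candidates by their feature-based score.
    best_score, best_id = max(
        (score(feature_table[cid], weights), cid) for cid in candidate_ids
    )
    # Assumed extra NIL rule: reject a best candidate that scores too low.
    return best_id if best_score >= nil_threshold else "NIL"


weights = {name: 1.0 for name in FEATURES}
feature_table = {
    "Apple_Inc.": {"context_similarity": 0.9, "popularity": 0.8},
    "Apple_(fruit)": {"context_similarity": 0.1, "popularity": 0.3},
}
print(link("Apple", feature_table, weights))
print(link("Zzyzx", feature_table, weights))
```

In a trained system the weights would come from the classifier rather than being set by hand.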

A Study on the Performance Analysis of Entity Name Recognition Techniques Using Korean Patent Literature

  • Gim, Jangwon
    • Journal of Advanced Information Technology and Convergence
    • /
    • v.10 no.2
    • /
    • pp.139-151
    • /
    • 2020
  • Entity name recognition is an information extraction task that extracts entity names from documents and classifies their types. Entity name recognition technologies are widely used in natural language processing applications such as information retrieval, machine translation, and question answering systems. Various deep learning-based models have been proposed to improve entity name recognition performance, but few studies have compared and analyzed these models on Korean data. In this paper, we compare and analyze the performance of CRF, LSTM-CRF, BiLSTM-CRF, and BERT, which are actively used for entity name recognition, on Korean data. We also evaluate whether the embedding models widely used in recent natural language processing tasks affect the performance of entity name recognition models. Experiments on patent data and a Korean corpus confirmed that BiLSTM-CRF with FastText embeddings showed the highest performance.

An Effect of Semantic Relatedness on Entity Disambiguation: Using Korean Wikipedia (개체중의성해소에서 의미관련도 활용 효과 분석: 한국어 위키피디아를 사용하여)

  • Kang, In-Su
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.25 no.2
    • /
    • pp.111-118
    • /
    • 2015
  • Entity linking links mentions of entity names occurring in text to the corresponding entities in a knowledge base. Since the same entity mention may refer to different entities depending on its context, entity linking must deal with entity disambiguation. Most recent work on entity disambiguation focuses on semantic relatedness between entities and attempts to integrate semantic relatedness with entity prior probabilities and term co-occurrence. To the best of my knowledge, however, few studies analyze and present the pure effect of semantic relatedness on entity disambiguation. Through experiments on a Korean Wikipedia data set, this article empirically evaluates entity disambiguation approaches that use semantic relatedness in terms of the following aspects: (1) the differences among semantic relatedness measures such as NGD, PMI, Jaccard, Dice, and Simpson, (2) the influence of ambiguity in the set of co-occurring entity mentions, and (3) the difference between individual and collective disambiguation approaches.
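For readers unfamiliar with the five measures named in aspect (1), a minimal sketch using their standard set-based definitions is given below. It assumes each entity is represented by the set of articles that link to it, which is a common choice for Wikipedia-based relatedness but is not confirmed by the abstract; the function name and sample data are hypothetical.

```python
import math


def relatedness(a_links, b_links, n_total):
    """Compute five relatedness scores for two entities.

    a_links, b_links: sets of ids of articles linking to each entity
                      (assumed representation).
    n_total: number of articles in the collection; assumed to be larger
             than either set.
    """
    a, b = len(a_links), len(b_links)
    both = len(a_links & b_links)
    if both == 0:
        # No shared evidence: similarities are 0, PMI and NGD diverge.
        return {"jaccard": 0.0, "dice": 0.0, "simpson": 0.0,
                "pmi": float("-inf"), "ngd": float("inf")}
    log_a, log_b = math.log(a), math.log(b)
    return {
        "jaccard": both / len(a_links | b_links),
        "dice": 2 * both / (a + b),
        # Simpson is the overlap coefficient.
        "simpson": both / min(a, b),
        # Pointwise mutual information of the two link events.
        "pmi": math.log(n_total * both / (a * b)),
        # Normalized Google Distance: smaller means more related.
        "ngd": (max(log_a, log_b) - math.log(both))
               / (math.log(n_total) - min(log_a, log_b)),
    }


scores = relatedness({1, 2, 3, 4}, {3, 4, 5}, n_total=100)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

Note that NGD is a distance while the other four are similarities, so a disambiguation system would need to invert or negate it before combining it with the others.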

Classifying Articles in Chinese Wikipedia with Fine-Grained Named Entity Types

  • Zhou, Jie;Li, Bicheng;Tang, Yongwang
    • Journal of Computing Science and Engineering
    • /
    • v.8 no.3
    • /
    • pp.137-148
    • /
    • 2014
  • Named entity classification of Wikipedia articles is a fundamental research area that can be used to automatically build large-scale corpora for named entity recognition or to support other entity processing tasks, such as entity linking, as an auxiliary task. This paper describes a method of classifying named entities in Chinese Wikipedia with fine-grained types. We considered multi-faceted information in Chinese Wikipedia to construct four feature sets, designed different feature selection methods for each feature set, and fused the features in a vector space using different strategies. Experimental results show that the explored feature sets and their combination can effectively improve the performance of named entity classification.

Object-Oriented Programming of Entity-Based Integrated Design Model (개체형 통합설계모델의 객체지향 프로그래밍)

  • 이창호;김진근
    • Proceedings of the Computational Structural Engineering Institute Conference
    • /
    • 2002.10a
    • /
    • pp.211-218
    • /
    • 2002
  • An entity-based integrated design product and process model uses product entities and process entities to describe design information and design activities, respectively. The concepts and notation for product and process entities in the entity-based integrated design model are similar to the concepts of object-oriented programming languages such as C++ and Smalltalk. This paper uses C++ to program an entity-based integrated design model for building frame structures. The design information and activities involving the three-dimensional building space, the locations of frames, and the grouping of frames, which are represented as entities in the entity-based integrated design model, are transformed into C++ code. Each product or process entity can basically be transformed into a class. The attributes of an entity can be defined as variables and member functions of a class.
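The entity-to-class mapping described here can be illustrated with a small sketch. The paper itself uses C++; Python is used below only for illustration, and every class name, attribute, and the spacing rule is a hypothetical example rather than part of the authors' model.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """Product entity: one structural frame (attributes are assumed)."""
    name: str
    x_location: float  # position along the building, in meters


@dataclass
class FrameGroup:
    """Product entity: a named group of frames."""
    name: str
    frames: list = field(default_factory=list)

    def add(self, frame):
        # Entity behavior becomes a member function of the class.
        self.frames.append(frame)


class LocateFrames:
    """Process entity: a design activity that places frames at even spacing."""

    def __init__(self, spacing):
        self.spacing = spacing

    def run(self, count):
        # Produce product entities as the result of the design activity.
        return [Frame(f"F{i + 1}", i * self.spacing) for i in range(count)]


group = FrameGroup("transverse")
for frame in LocateFrames(spacing=6.0).run(3):
    group.add(frame)
print([(f.name, f.x_location) for f in group.frames])
```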

An Entity-Aspect Model for Statistical and Scientific Databases (통계(統計)/과학(科學) 데이타 베이스를 위한 개체(個體)-측면(側面) 모형(模型))

  • Yoo, Cheol-Jung
    • Proceedings of the KIEE Conference
    • /
    • 1987.07b
    • /
    • pp.1148-1152
    • /
    • 1987
  • This paper analyzes the statistical and scientific entity-aspect model (SEAM) for statistical and scientific databases (SSDBs). The SEAM is defined, and an example of its application is presented. Finally, the SEAM is evaluated as a design tool for SSDBs, and areas for further research are suggested.

ERX : A Generation Tool of XML Schema based on Entity-Relationship Model (ERX : 개체 관계 모델로부터 XML 스키마 생성 도구)

  • Kim, Young-Ung
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.13 no.2
    • /
    • pp.149-155
    • /
    • 2013
  • The Entity-Relationship Model is currently the most popular modeling tool for designing databases, and XML is a de facto standard language for representing and exchanging data. However, many commercial products that support the Entity-Relationship Model use their own representation formats, which makes interoperability between these products difficult. In this paper, we propose ERX, a tool that generates an XML Schema from an Entity-Relationship Model. ERX receives an Entity-Relationship Diagram as input, transforms it according to transformation rules, and generates an XML Schema Definition as output. The transformation rules cover entity sets, relationship sets, mapping cardinalities, and generalization.
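To give a feel for what one such transformation rule could look like, the sketch below maps a single entity set to an XML Schema element whose attributes become child elements. The rule, the function name, and the sample entity are assumptions made for illustration; the abstract does not specify ERX's actual rules.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
# Serialize the schema namespace with the conventional "xs" prefix.
ET.register_namespace("xs", XS)


def entity_set_to_xsd(name, attributes):
    """Map one entity set to an XSD element (hypothetical rule).

    attributes: dict of attribute name -> XSD built-in type, e.g. "xs:string".
    """
    schema = ET.Element(f"{{{XS}}}schema")
    element = ET.SubElement(schema, f"{{{XS}}}element", name=name)
    complex_type = ET.SubElement(element, f"{{{XS}}}complexType")
    sequence = ET.SubElement(complex_type, f"{{{XS}}}sequence")
    for attr_name, xsd_type in attributes.items():
        # Each entity attribute becomes a typed child element.
        ET.SubElement(sequence, f"{{{XS}}}element", name=attr_name, type=xsd_type)
    return ET.tostring(schema, encoding="unicode")


print(entity_set_to_xsd("Student", {"id": "xs:integer", "name": "xs:string"}))
```

Relationship sets, cardinalities, and generalization would need further rules, for example `minOccurs`/`maxOccurs` for cardinalities and `xs:extension` for generalization.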

Tweet Entity Linking Method based on User Similarity for Entity Disambiguation (개체 중의성 해소를 위한 사용자 유사도 기반의 트윗 개체 링킹 기법)

  • Kim, SeoHyun;Seo, YoungDuk;Baik, Doo-Kwon
    • Journal of KIISE
    • /
    • v.43 no.9
    • /
    • pp.1043-1051
    • /
    • 2016
  • Web-based entity linking cannot be applied to tweet entity linking because Twitter documents are shorter than web documents. Tweet entity linking therefore uses information about users or groups. However, a data sparseness problem occurs for users with too little Twitter activity data; in addition, using information from unrelated groups can reduce the accuracy of the linking result for a user. To solve the data sparseness problem, we consider three features: the meaning of a single tweet, the user's own tweet set, and the tweet sets of other users. Furthermore, we improve the performance and accuracy of tweet entity linking by assigning weights to the information of users with high similarity. Through a comparative experiment on actual Twitter data, we verify that the proposed tweet entity linking method has higher performance and accuracy than existing methods, and that using information from highly similar users is correlated with alleviating the data sparseness problem and improving linking accuracy.

A Global-Interdependence Pairwise Approach to Entity Linking Using RDF Knowledge Graph (개체 링킹을 위한 RDF 지식그래프 기반의 포괄적 상호의존성 짝 연결 접근법)

  • Shim, Yongsun;Yang, Sungkwon;Kim, Hong-Gee
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.8 no.3
    • /
    • pp.129-136
    • /
    • 2019
  • Natural language contains a variety of entities, such as people, organizations, places, and products, and an entity mention can have many different meanings. Resolving this ambiguity is a very challenging task in natural language processing. Entity Linking (EL) is the task of linking an entity mention in text to the appropriate entity in a knowledge base. The pairwise approach, a representative method for EL, uses the association between two entities in a sentence. Because it considers only the interdependence between entities appearing in the same sentence, it is limited in capturing global interdependence. In this paper, we develop an Entity2vec model that applies Word2vec to an RDF knowledge base in order to solve EL, and we apply algorithms that use the generated model to rank each entity. To overcome the limitations of the pairwise approach, we devise a pairwise approach based on comprehensive interdependence and compare the two.
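The idea of scoring candidates by their coherence with all other mentions, rather than only with a sentence neighbor, might be sketched as follows. This is a toy illustration, not the authors' algorithm: the two-dimensional vectors stand in for Entity2vec embeddings, the entity names are invented, and the exhaustive search over assignments is only practical for a handful of mentions.

```python
import itertools
import math


def cosine(u, v):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def link_globally(candidates_per_mention, vectors):
    """Choose one candidate per mention to maximize total pairwise similarity.

    Every pair of chosen entities contributes, regardless of which sentence
    the mentions came from (the assumed reading of "global interdependence").
    """
    best, best_score = None, float("-inf")
    for assignment in itertools.product(*candidates_per_mention):
        total = sum(
            cosine(vectors[a], vectors[b])
            for a, b in itertools.combinations(assignment, 2)
        )
        if total > best_score:
            best, best_score = assignment, total
    return list(best)


# Toy embeddings (hypothetical values).
vectors = {
    "Apple_Inc.": (1.0, 0.0),
    "Apple_(fruit)": (0.0, 1.0),
    "Steve_Jobs": (0.9, 0.1),
    "iPhone": (1.0, 0.1),
}
mentions = [["Apple_Inc.", "Apple_(fruit)"], ["Steve_Jobs"], ["iPhone"]]
print(link_globally(mentions, vectors))
```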

A Study on Elicitation Procedures of the Entity for Data Model (데이터 모델을 위한 엔터티 도출 절차에 관한 연구)

  • Kim, Doyu;Yeo, Jeongmo
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.2 no.7
    • /
    • pp.479-486
    • /
    • 2013
  • The data model, which can be regarded as the skeleton of an information system, forms one of its two main axes together with the process model. The key elements of the data model are entities, attributes, and relationships; the entity is the most fundamental of these, so the whole data model becomes vague if entities are not derived clearly. This study deals only with entity derivation. Existing entity derivation methods depend on the experience and task knowledge of designers and do not offer clear procedures, which makes them difficult for beginners and unskilled persons to apply. To help solve this problem, this study proposes task-based entity derivation procedures that derive entities systematically for a previously identified target business, building on methods suggested in prior research. The proposed procedures were applied to hypothetical tasks by undergraduates with no data modeling experience, and the procedures were verified by checking the similarity between the model answers and the entities derived by the students, after taking into account the points at which existing methods and the proposed procedures cannot be compared. The results confirmed that the students derived entities close to the model answers. This confirms that beginners without data modeling experience can derive entities close to the model answer even when applying the procedures to unfamiliar tasks. Research on deriving attributes and relationships is left for future work.