• Title/Summary/Keyword: entity name

A Study on the Performance Analysis of Entity Name Recognition Techniques Using Korean Patent Literature

  • Gim, Jangwon
    • Journal of Advanced Information Technology and Convergence
    • /
    • v.10 no.2
    • /
    • pp.139-151
    • /
    • 2020
  • Entity name recognition is a part of information extraction that extracts entity names from documents and classifies their types. Entity name recognition technologies are widely used in natural language processing applications such as information retrieval, machine translation, and question answering systems. Various deep learning-based models have been proposed to improve entity name recognition performance, but few studies have compared and analyzed these models on Korean data. In this paper, we compare and analyze the performance of CRF, LSTM-CRF, BiLSTM-CRF, and BERT, which are actively used for entity name recognition, on Korean data. We also evaluate whether the embedding models widely used in recent natural language processing tasks affect the performance of entity name recognition models. Experiments on patent data and a Korean corpus confirmed that BiLSTM-CRF with FastText embeddings achieved the highest performance.
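
The abstract above compares models by recognition performance but does not name the metric. A common choice for this kind of comparison is entity-level F1 over BIO tag sequences, and the following is a minimal sketch under that assumption. The tag names (`PS`, `LC`) and the helper names `bio_to_spans` and `entity_f1` are illustrative, not taken from the paper.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int, str]  # (start index, end index exclusive, entity type)


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from a BIO tag sequence."""
    spans: Set[Span] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes a trailing span
        continues = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues:
            spans.add((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


def entity_f1(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1), counting only exact span+type matches."""
    g, p = bio_to_spans(gold), bio_to_spans(pred)
    tp = len(g & p)
    precision = tp / len(p) if p else 0.0
    recall = tp / len(g) if g else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical example: the model finds the person name but misses the place name.
gold_tags = ["B-PS", "I-PS", "O", "B-LC"]
pred_tags = ["B-PS", "I-PS", "O", "O"]
precision, recall, f1 = entity_f1(gold_tags, pred_tags)
```

Scoring whole spans rather than individual tokens means a partially recognized entity counts as an error, which is usually the stricter and more informative comparison between taggers.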

Topic conversation performance improvement technology through game domain entity name recognition and deep learning intention classification (게임 도메인 개체명인식과 딥러닝 의도분류를 통한 주제대화 성능향상 기술)

  • Yun, Jae-Min;Jee, Min-Seong;Shin, Dong-Chun;Ko, Yeon-Jeong
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2021.01a
    • /
    • pp.241-242
    • /
    • 2021
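  • In dialogue systems, for topic conversations such as requests for a game explanation, accurately classifying the intent of the input sentence is very important because it is directly tied to the performance of the dialogue system. In this paper, we propose a hybrid method that combines an entity name recognition method with a machine learning method, and show that it improves intent classification performance for topic conversations over using the machine learning method alone.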

An Introduction to the Study of the Outlook on Highest Ruling Entity in Daesoonjinrihoe (I) - Focusing on Descriptions for Highest Ruling Entity and Its Meanings - (대순진리회 상제관 연구 서설 (I) - 최고신에 대한 표현들과 그 의미들을 중심으로 -)

  • Cha, Seon-keun
    • Journal of the Daesoon Academy of Sciences
    • /
    • v.21
    • /
    • pp.99-156
    • /
    • 2013
  • This paper surveys research trends on faith in Daesoonjinrihoe and their points of controversy, and examines the outlook on Sangje, defined here as the theological understanding and explanation of Gu-Cheon-Sang-Je (the Highest Ruling Entity, who is the object of devotion in Daesoonjinrihoe). As a first introduction to this work, the various descriptions of Sangje are organized and their meanings analyzed. In brief: first, the name Gu-Cheon-Eung-Won-Nweh-Seong-Bo-Hwa-Cheon-Jon expresses the authority of Sangje (the Supreme Entity) through a spatial concept, namely that Sangje dwells in the Ninth Heaven. This can be compared with the doctrines that Allah in Islam and Jehovah in Christianity each dwell in the Seventh Heaven. The name also shows that Sangje is the ruler who reigns over the universe by means of yin and yang. Second, the name Gu-Cheon-Eung-Won-Nweh-Seong-Bo-Hwa-Cheon-Jon was imported from Chinese Taoism, since it appears in the Ok-Chu-Gyeong (the Gaoshang shenlei yushu). In fact, however, its root is in Korea, because the source of the name lies in Buyeo and Goguryeo, the ancient Korean nations. While the name does not denote the Supreme Entity in Chinese Taoism, it does in Daesoonjinrihoe. This is an important difference. Third, intentionally or not, the name Gu-Cheon-Eung-Won-Nweh-Seong-Bo-Hwa-Cheon-Jon carries the image of 'resolution of grievances', because many people in Korea and China have invoked the name for about 1,000 years to improve their fortunes and escape predicaments. Fourth, not only Gu-Cheon-Eung-Won-Nweh-Seong-Bo-Hwa-Cheon-Jon but also the names Three Pure Ones and Ok-Cheon-Jin-Wang (Yuqingzhenwang) from Chinese Taoism are used for the Highest Ruling Entity in Daesoonjinrihoe. However, the relations among the Three Pure Ones, Ok-Cheon-Jin-Wang, and Gu-Cheon-Eung-Won-Nweh-Seong-Bo-Hwa-Cheon-Jon in Daesoonjinrihoe differ from those in Chinese Taoism. Fifth, Sangje is associated with Tae-Eul, the divinity of Polaris, a view of God in Oriental cosmology. The description Tae-Eul, like Gu-Cheon-Eung-Won-Nweh-Seong-Bo-Hwa-Cheon-Jon, indicates that Sangje is linked to the faith of Buyeo and Goguryeo. Sixth, Sangje is not only Mugeuk-Sin (the God of the Endless), who supervises the Endless, but also Taegeuk-Ji-Cheon-Jon (the God of the Ultimate Reality), who supervises the Ultimate Reality. These descriptions directly show that Sangje is a creator. Seventh, explaining Sangje requires a perspective that encompasses both views: that Sangje 'was' the Hidden God (deus otiosus) and 'is' the Unhidden God after the Incarnation. Eighth, Sangje corresponds to Cheon-Ju in Donghak, yet differs from it. Cheon-Ju in Donghak holds transcendence and immanence in tight tension, whereas Cheon-Ju in Daesoonjinrihoe emphasizes transcendence over immanence. This difference arises because Cheon-Ju in Donghak was a being who revealed himself to a man, whereas Cheon-Ju in Daesoonjinrihoe was a being who incarnated after revealing himself to a man. Ninth, Sangje is both Gae-Byeok-Jang, who manages the transformation and ordering of the Three Realms of the World through the Great Do, which is the mutual beneficence of all life, and Hae-Won-Sin, the God of the resolution of grievances.

A Method to Solve the Entity Linking Ambiguity and NIL Entity Recognition for efficient Entity Linking based on Wikipedia (위키피디아 기반의 효과적인 개체 링킹을 위한 NIL 개체 인식과 개체 연결 중의성 해소 방법)

  • Lee, Hokyung;An, Jaehyun;Yoon, Jeongmin;Bae, Kyoungman;Ko, Youngjoong
    • Journal of KIISE
    • /
    • v.44 no.8
    • /
    • pp.813-821
    • /
    • 2017
  • Entity linking finds the meaning of an entity mention in a user's query, where a mention may refer to an entity using different expressions, by linking the mention to the corresponding entity in a knowledge base. This task has four challenges: the difficulty of knowledge base construction, multiple expressions of an entity mention, ambiguity of entity linking, and NIL entity recognition. In this paper, we first construct an entity name dictionary based on Wikipedia to build a knowledge base and solve the multiple expression problem. We then propose various methods for NIL entity recognition and resolve the ambiguity of entity linking by training a support vector machine on several features, including context similarity, semantic relevance, clue word score, named entity type similarity of the mention, entity name matching score, and object popularity score. We apply the two proposed methods sequentially on the constructed knowledge base to obtain good entity linking performance. In the experiments, our system achieved F1 scores of 83.66% and 90.81% for NIL entity recognition and for resolving entity linking ambiguity.
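
The pipeline described in this abstract (dictionary lookup, NIL recognition, then disambiguation among candidates) can be sketched as follows. This is only an illustration under stated assumptions: the dictionary entries, feature values, weights, and threshold are made up, and a fixed linear score stands in for the trained support vector machine that the paper uses.

```python
from typing import Dict, List, Optional

# Hypothetical entity name dictionary: surface form -> candidate entity IDs.
NAME_DICT: Dict[str, List[str]] = {
    "apple": ["Apple_Inc.", "Apple_(fruit)"],
    "seoul": ["Seoul"],
}

# Hypothetical weights over two of the features named in the abstract.
# The paper trains an SVM; a fixed linear score is used here for brevity.
WEIGHTS: Dict[str, float] = {"context_sim": 0.6, "popularity": 0.4}


def score(features: Dict[str, float]) -> float:
    """Weighted sum of the available feature values (missing ones count as 0)."""
    return sum(w * features.get(name, 0.0) for name, w in WEIGHTS.items())


def link(mention: str,
         features: Dict[str, Dict[str, float]],
         nil_threshold: float = 0.5) -> Optional[str]:
    """Return the best candidate entity ID, or None for a NIL entity."""
    candidates = NAME_DICT.get(mention.lower(), [])
    if not candidates:
        return None  # NIL: the mention is not in the dictionary at all
    best = max(candidates, key=lambda c: score(features.get(c, {})))
    if score(features.get(best, {})) < nil_threshold:
        return None  # NIL: no candidate is convincing enough
    return best


# Hypothetical per-candidate feature values for one query context.
feats = {
    "Apple_Inc.": {"context_sim": 0.9, "popularity": 0.8},
    "Apple_(fruit)": {"context_sim": 0.2, "popularity": 0.5},
}
linked = link("Apple", feats)
```

Returning `None` in two places separates the two NIL cases: a mention with no dictionary entry, and a mention whose candidates all score too low.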

A Study on the Description of Archives Name by Controlled Access Point in Ontology (기록물 생산기관명 접근점 제어 온톨로지 기술에 관한 연구)

  • Kang, Hyen Min
    • Journal of the Korean Society for information Management
    • /
    • v.35 no.3
    • /
    • pp.147-164
    • /
    • 2018
  • This study uses the Standard Administration Code to define the name of a records-producing institution as a unique preferred access point representing a single identity and entity, and describes the institution's variant name forms as formal-name access points that share that identity and entity. This makes it possible to identify and access all records produced by an institution with the same identity and entity. The ontology-based mechanism designed in this study would reinforce 'the principle of provenance' and 'respect for original order', and improve user satisfaction with archive usability and expanded retrieval results.

Bi-directional LSTM-CNN-CRF for Korean Named Entity Recognition System with Feature Augmentation (자질 보강과 양방향 LSTM-CNN-CRF 기반의 한국어 개체명 인식 모델)

  • Lee, DongYub;Yu, Wonhee;Lim, HeuiSeok
    • Journal of the Korea Convergence Society
    • /
    • v.8 no.12
    • /
    • pp.55-62
    • /
    • 2017
  • A named entity recognition system recognizes words or phrases in a document that correspond to entity names, such as person names (PS), place names (LC), and organization names (OG), and labels them with the corresponding entity type. Traditional approaches to named entity recognition include statistical models trained on hand-crafted features. More recently, deep learning models such as recurrent neural networks (RNN) and long short-term memory (LSTM) have been proposed to construct features representing the sentence for the sequence labeling problem. In this research, to improve the performance of the Korean named entity recognition system, we augment the sentence representation with hand-crafted features, part-of-speech tagging information, and pre-built lexicon information. Experimental results show that the proposed method improves the performance of the Korean named entity recognition system. The results of this study are released on GitHub for future collaborative research with researchers studying Korean natural language processing (NLP) and named entity recognition.
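
Feature augmentation of the kind described above can be illustrated by concatenating extra per-token signals onto a word embedding before it reaches the sequence model. The sketch below assumes simple concatenation of one-hot vectors; the abstract does not say how the features are combined, and the POS inventory, lexicon entries, and function names here are hypothetical.

```python
from typing import Dict, List, Optional

POS_TAGS: List[str] = ["NNP", "NNG", "JKS"]  # hypothetical, truncated POS inventory
ENTITY_TYPES: List[str] = ["PS", "LC", "OG"]  # entity types named in the abstract
LEXICON: Dict[str, str] = {"서울": "LC", "삼성": "OG"}  # hypothetical pre-built lexicon


def one_hot(value: Optional[str], inventory: List[str]) -> List[float]:
    """One-hot encode `value`; unknown or missing values give all zeros."""
    return [1.0 if value == item else 0.0 for item in inventory]


def augment(token: str, pos: str, embedding: List[float]) -> List[float]:
    """Concatenate the embedding with POS and lexicon-match features."""
    pos_features = one_hot(pos, POS_TAGS)
    lexicon_features = one_hot(LEXICON.get(token), ENTITY_TYPES)
    return embedding + pos_features + lexicon_features


# A 2-dimensional embedding grows to 2 + 3 + 3 = 8 dimensions.
vector = augment("서울", "NNP", [0.1, 0.2])
```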

KONG-DB: Korean Novel Geo-name DB & Search and Visualization System Using Dictionary from the Web (KONG-DB: 웹 상의 어휘 사전을 활용한 한국 소설 지명 DB, 검색 및 시각화 시스템)

  • Park, Sung Hee
    • Journal of the Korean Society for information Management
    • /
    • v.33 no.3
    • /
    • pp.321-343
    • /
    • 2016
  • This study aimed to design a semi-automatic web-based pilot system 1) to build a database of geo-names from Korean novels, 2) to update the database using automatic geo-name extraction so that it is scalable, and 3) to retrieve and visualize the usage of an old geo-name on a map. In particular, extracting geo-names from novels is difficult when the names are now obsolete, because obtaining a training corpus is burdensome. To build the training corpus, an admin tool consisting of an HTML crawler and parser written in Python crawled geo-names and their usages from a vocabulary dictionary for the Korean New Novel, collecting enough data to train a named entity tagger that can extract even geo-names that do not appear in the training corpus. With the training corpus and the automatic extraction tool, the geo-name database was made scalable. In addition, the system can visualize geo-names on a map. The study also designed and implemented a prototype and empirically verified the validity of the pilot system. Lastly, items for improvement are discussed.

Performance Comparison and Error Analysis of Korean Bio-medical Named Entity Recognition (한국어 생의학 개체명 인식 성능 비교와 오류 분석)

  • Jae-Hong Lee
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.19 no.4
    • /
    • pp.701-708
    • /
    • 2024
  • The advent of transformer architectures in deep learning has been a major breakthrough in natural language processing research. Named entity recognition is a branch of natural language processing and an important research area for tasks such as information retrieval. It is also important in the biomedical field, but the lack of Korean biomedical corpora for training has limited the development of Korean clinical research using AI. In this study, we built a new biomedical corpus for Korean biomedical named entity recognition and selected language models pre-trained on a large Korean corpus for transfer learning. We compared the recognition performance of the selected language models by F1-score and by per-tag recognition rate, and analyzed the errors. In terms of recognition performance, KlueRoBERTa performed relatively well. The error analysis of the tagging process shows that recognition performance for Disease is excellent, while that for Body and Treatment is relatively low. This is due to over-segmentation and under-segmentation, which fail to properly categorize entity names based on context; a more precise morphological analyzer and a richer lexicon will be needed to compensate for the incorrect tagging.
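
The over-segmentation and under-segmentation errors mentioned above can be told apart by comparing span boundaries. The sketch below uses assumed working definitions that the abstract does not spell out: over-segmentation means the predictions cut a gold entity into smaller pieces, and under-segmentation means a predicted span swallows the gold entity together with extra tokens. The function name and category labels are illustrative.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end exclusive) token offsets


def classify(gold: Span, predicted: List[Span]) -> str:
    """Label how one gold entity span was handled by the predicted spans."""
    gs, ge = gold
    if gold in predicted:
        return "correct"
    # Keep only predicted spans that share at least one token with the gold span.
    overlapping = [(s, e) for s, e in predicted if s < ge and e > gs]
    if not overlapping:
        return "missed"
    if any(s <= gs and e >= ge for s, e in overlapping):
        return "under-segmentation"  # one predicted span covers gold plus more
    if all(s >= gs and e <= ge for s, e in overlapping):
        return "over-segmentation"  # gold span was cut into smaller pieces
    return "boundary-shift"  # partial overlap crossing a gold boundary


# Hypothetical gold entity covering tokens 2..4, split in two by the model.
label = classify((2, 5), [(2, 3), (3, 5)])
```

Counting these labels per tag (for example Disease, Body, Treatment) would give the kind of per-tag error breakdown the abstract refers to.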

Korean-Chinese Person Name Translation for Cross Language Information Retrieval

  • Wang, Yu-Chun;Lee, Yi-Hsun;Lin, Chu-Cheng;Tsai, Richard Tzong-Han;Hsu, Wen-Lian
    • Proceedings of the Korean Society for Language and Information Conference
    • /
    • 2007.11a
    • /
    • pp.489-497
    • /
    • 2007
  • Named entity translation plays an important role in many applications, such as information retrieval and machine translation. In this paper, we focus on translating person names, the most common type of named entity in Korean-Chinese cross language information retrieval (KCIR). Unlike other languages, Chinese uses characters (ideographs), which makes person name translation difficult because one syllable may map to several Chinese characters. We propose an effective hybrid person name translation method to improve the performance of KCIR. First, we use Wikipedia as a translation tool based on the inter-language links between the Korean edition and the Chinese or English editions. Second, we adopt the Naver people search engine to find the query name's Chinese or English translation. Third, we extract Korean-English transliteration pairs from Google snippets, and then search for the English-Chinese transliteration in the database of Taiwan's Central News Agency or in Google. The performance of KCIR using our method is over five times better than that of a dictionary-based system. The mean average precision is 0.3490 and the average recall is 0.7534. The method can handle Chinese, Japanese, Korean, and non-CJK person names when translating from Korean to Chinese. Hence, it substantially improves the performance of KCIR.
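
The two retrieval metrics reported above, mean average precision and average recall, follow standard definitions and can be computed as in this minimal sketch. The document IDs and relevance judgments are made up, and the paper's exact evaluation setup (for example the ranking cutoff) is not given in the abstract.

```python
from typing import List, Set, Tuple


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Mean of the precision values at each rank where a relevant item appears."""
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def recall(ranked: List[str], relevant: Set[str]) -> float:
    """Fraction of the relevant items that were retrieved at all."""
    return len(set(ranked) & relevant) / len(relevant) if relevant else 0.0


def mean_average_precision(runs: List[Tuple[List[str], Set[str]]]) -> float:
    """Average the per-query average precision over all queries."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Hypothetical query: two of the three relevant documents are retrieved.
ranked_docs = ["d1", "d2", "d3", "d4"]
relevant_docs = {"d1", "d3", "d9"}
ap = average_precision(ranked_docs, relevant_docs)
rc = recall(ranked_docs, relevant_docs)
```

Dividing by the number of relevant documents, not the number retrieved, means an unretrieved relevant document (here `d9`) lowers average precision as well as recall.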

Development of Tourism Information Named Entity Recognition Datasets for the Fine-tune KoBERT-CRF Model

  • Jwa, Myeong-Cheol;Jwa, Jeong-Woo
    • International Journal of Internet, Broadcasting and Communication
    • /
    • v.14 no.2
    • /
    • pp.55-62
    • /
    • 2022
  • A smart tourism chatbot is needed as a user interface to efficiently provide tourists with smart tourism services such as recommended travel products, tourist information, personal travel itineraries, and tour guide services. We have developed a smart tourism app and a smart tourism information system that provide smart tourism services to tourists. We also developed a smart tourism chatbot service consisting of the khaiii morpheme analyzer, rule-based intention classification, and a tourism information knowledge base built on the Neo4j graph database. In this paper, we develop the Korean and English smart tourism Named Entity (NE) datasets required to develop an NER model using pre-trained language models (PLMs) for the smart tourism chatbot system. We create the tourism information NER datasets by collecting source data through the smart tourism app, the visitJeju website of the Jeju Tourism Organization (JTO), and web search, and preprocessing it using Korean and English tourism information Named Entity dictionaries. We train the KoBERT-CRF NER model on the developed Korean and English tourism information NER datasets. The weighted-average precision, recall, and F1 score are 0.94, 0.92, and 0.94 on the Korean and English tourism information NER datasets.
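
The weighted-average scores reported above are conventionally computed by weighting each entity type's score by its support, that is, the number of gold instances of that type. The sketch below assumes that convention; the abstract does not define it, and the labels and numbers are hypothetical, not taken from the paper.

```python
from typing import Dict


def weighted_average(scores: Dict[str, float], support: Dict[str, int]) -> float:
    """Average per-label scores, weighting each label by its gold instance count."""
    total = sum(support.values())
    if total == 0:
        return 0.0
    return sum(scores[label] * support[label] for label in scores) / total


# Hypothetical per-label F1 scores and supports.
f1_by_label = {"LOC": 0.9, "ORG": 0.6}
support_by_label = {"LOC": 30, "ORG": 10}
weighted_f1 = weighted_average(f1_by_label, support_by_label)
```

Because the averaging is done per metric, a weighted-average F1 is the weighted mean of the per-label F1 scores and need not equal the harmonic mean of the weighted-average precision and recall.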