• Title/Summary/Keyword: Named Entity

Search Results: 219

A Knowledge Graph of the Korean Financial Crisis of 1997: A Relationship-Oriented Approach to Digital Archives (1997 외환위기 지식그래프: 디지털 아카이브의 관계 중심적 접근)

  • Lee, Yu-kyeong;Kim, Haklae
    • Journal of Korean Society of Archives and Records Management
    • /
    • v.20 no.4
    • /
    • pp.1-17
    • /
    • 2020
  • Along with the development of information technology, the digitization of archives has been accelerating. However, digital archives have limitations in effectively searching, interlinking, and understanding records. In response, this study proposes a knowledge graph that represents comprehensive relationships among heterogeneous entities in digital archives. The knowledge graph organizes resources in the archives on the Korean financial crisis of 1997 by transforming them into named entities that machines can discover. In particular, the study surveys the characteristics of the archives on the Korean financial crisis as a digital archive. All resources in the archives are described as entities that have relationships with other entities, using semantic vocabularies such as Records in Contexts-Ontology (RiC-O). Moreover, the knowledge graph of the Korean financial crisis of 1997 is represented with Resource Description Framework (RDF) vocabularies, a machine-readable format. Compared with conventional digital archives, the knowledge graph enables users to retrieve a specific entity with its semantic information and to discover its relationships with other entities. As a result, the knowledge graph can be used for semantic search and various intelligent services.
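The relationship-oriented representation the abstract describes can be sketched with plain RDF-style triples. The URIs, predicate names, and archive entities below are illustrative stand-ins, not the study's actual vocabulary or data:

```python
# Minimal sketch of representing archive resources as RDF-style triples.
# The namespace URIs and predicates are illustrative, not the paper's own.
RICO = "https://www.ica.org/standards/RiC/ontology#"
EX = "http://example.org/archive/"

def triple(s, p, o):
    return (EX + s, RICO + p, o)

# A record entity linked to its creator agent and a related event
triples = [
    triple("record/IMF-agreement", "hasCreator", EX + "agent/MOEF"),
    triple("record/IMF-agreement", "isAssociatedWithEvent", EX + "event/1997-crisis"),
    triple("agent/MOEF", "hasOrMaintainedRelation", EX + "event/1997-crisis"),
]

def related_entities(subject):
    """Return all objects linked to a subject -- the kind of relationship
    traversal a knowledge graph enables over a flat digital archive."""
    return [o for s, p, o in triples if s == EX + subject]

print(related_entities("record/IMF-agreement"))
```

Because every resource is an entity with typed links, a single lookup yields both the record and its semantic neighborhood, which is what enables the semantic search described above.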

Fine-Grained Named Entity Recognition using Conditional Random Fields for Question Answering (Conditional Random Fields를 이용한 세부 분류 개체명 인식)

  • Lee, Chang-Ki;Hwang, Yi-Gyu;Oh, Hyo-Jung;Lim, Soo-Jong;Heo, Jeong;Lee, Chung-Hee;Kim, Hyeon-Jin;Wang, Ji-Hyun;Jang, Myung-Gil
    • Annual Conference on Human and Language Technology
    • /
    • 2006.10e
    • /
    • pp.268-272
    • /
    • 2006
  • Question answering systems use fine-grained named entities to find the answers to user questions. For such fine-grained named entity recognition, most systems first perform coarse-grained named entity recognition and then subdivide the results into fine-grained classes using dictionaries. In this paper, we use Conditional Random Fields for fine-grained named entity recognition for question answering. We divide named entity recognition into two stages, entity boundary detection and classification of the detected entities: Conditional Random Fields are used for boundary detection, and Maximum Entropy is used to classify the detected entities. Experiments on 147 fine-grained entity types achieved 85.8% precision, 81.1% recall, and an F1 of 83.4; training time was reduced to 27% of the baseline model while performance increased. Applying the proposed fine-grained named entity recognizer to a question answering system yielded a 26% performance improvement.
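The two-stage architecture in the abstract can be sketched as a pipeline: stage 1 marks entity boundaries (the paper trains a CRF for this), stage 2 assigns a fine-grained class to each detected span (the paper uses Maximum Entropy). The dictionary lookups below are toy stand-ins for those trained models:

```python
# Illustrative two-stage fine-grained NER pipeline. The lookup tables are
# invented examples; a real system would run trained CRF / MaxEnt models.
FINE_CLASSES = {"Seoul": "LOCATION.CITY", "KAIST": "ORG.UNIVERSITY"}

def detect_boundaries(tokens):
    """Stage 1: tag tokens B/O. A trained CRF would produce these tags."""
    return ["B" if t in FINE_CLASSES else "O" for t in tokens]

def classify_spans(tokens, tags):
    """Stage 2: assign one of the fine-grained classes to each span."""
    return [(t, FINE_CLASSES.get(t, "UNK"))
            for t, tag in zip(tokens, tags) if tag == "B"]

tokens = "He studied at KAIST in Seoul".split()
tags = detect_boundaries(tokens)
print(classify_spans(tokens, tags))
```

Splitting the task this way is what makes 147 classes tractable: the boundary model never sees the class inventory, and the classifier only runs on detected spans.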

A Stochastic Bilevel Scheduling Model for the Determination of the Load Shifting and Curtailment in Demand Response Programs

  • Rad, Ali Shayegan;Zangeneh, Ali
    • Journal of Electrical Engineering and Technology
    • /
    • v.13 no.3
    • /
    • pp.1069-1078
    • /
    • 2018
  • Demand response (DR) programs give consumers the opportunity to manage their electricity bills. Besides, the distribution system operator (DSO) is interested in using DR programs to obtain technical and economic benefits for the distribution network. Since small consumers have difficulty taking part in the electricity market individually, an entity named the demand response provider (DRP) has recently been defined to aggregate the DR of small consumers. However, implementing DR programs faces challenges in fairly allocating benefits and payments between the DRP and the DSO. This paper presents a procedure for modeling the interaction between the DRP and the DSO based on a bilevel programming model. Both the DSO and the DRP act from their own viewpoints with different objective functions. On the one hand, the DRP bids the potential of DR programs, namely load shifting and load curtailment, to maximize its expected profit; on the other hand, the DSO purchases electric power from either the electricity market or the DRP to supply its consumers while minimizing its overall cost. In the proposed bilevel programming approach, the upper-level problem represents the DRP decisions, while the lower-level problem represents the DSO behavior. The resulting bilevel programming problem (BPP) is converted into a single-level optimization problem using the Karush-Kuhn-Tucker (KKT) optimality conditions of the lower level. Furthermore, the point estimate method (PEM) is employed to model the uncertainties of the power demands and the electricity market prices. The efficiency of the presented model is verified through case studies and analysis of the obtained results.
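The KKT-based reduction described above follows a standard pattern; a generic sketch (illustrative symbols, not the paper's exact formulation) is:

```latex
% Bilevel form: the upper level (DRP) anticipates the lower level's (DSO) response
\max_{x}\; f(x, y^{*})
\quad \text{s.t.}\quad
y^{*} \in \arg\min_{y} \{\, g(x, y) : h(x, y) \le 0 \,\}

% Single-level reduction: replace the lower level by its KKT conditions
\max_{x,\, y,\, \mu}\; f(x, y)
\quad \text{s.t.}\quad
\nabla_{y} g(x, y) + \mu^{\top} \nabla_{y} h(x, y) = 0,
\quad h(x, y) \le 0, \quad \mu \ge 0, \quad \mu^{\top} h(x, y) = 0 .
```

The complementarity condition \(\mu^{\top} h(x, y) = 0\) is what makes the reduced problem nonconvex; solvers typically linearize it with binary variables or handle it as a complementarity constraint.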

Development of a CAD Based Tool for the Analysis of Landscape Visibility and Sensitivity (수치지형 해석에 의한 가시성 및 시인성의 경관정보화 연구 - CAD 기반의 분석 도구 개발을 중심으로 -)

  • 조동범
    • Journal of the Korean Institute of Landscape Architecture
    • /
    • v.26 no.3
    • /
    • pp.78-78
    • /
    • 1998
  • The purpose of this research is to develop a CAD-based program for analyzing digital elevation model (DEM) data for landscape assessment. When handling DEM data as a visual simulation of topographic landscape, the basic interest is to analyze visible areas and to visualize visual-sensitivity distributions. For landscape assessment, more intuitive and interactive visualization tools are needed, especially in visual approaches. For adaptability to landscape assessment, the algorithmic approaches to visibility analysis and the concepts for visual-sensitivity calculation in this study were based on processing techniques of the entity data control functions used in the AutoCAD drawing database. For quantitative analysis, grid-type 3DFACE entities were adopted as the mesh unit of the DEM structure. The developed programs consist of a main part named VSI, written in AutoLISP, and two interface modules written in dialog control language (DCL) for user-oriented interactive usage. Camera points (viewpoints) and target points (or observed areas) can be defined alternatively in combined methods of representing scenic, scenery, and sequential landscapes. In the case of a scenic landscape (single camera to a fixed target point), only visibility analysis is available; total visibility, frequency of cumulative visibility, and visual-sensitivity analysis are available in the other cases. Visual sensitivity was treated as view angle (three-dimensional observed visual area), and its strengths were classified into user-defined levels referring to the statistical characteristics of the distribution. The visibility analysis routine of VSI proved more effective in accuracy and speed compared with similar modules of existing AutoCAD third-party utilities.
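The core of a visible-area analysis like VSI's is a line-of-sight test over terrain samples. A minimal sketch on a 1D elevation transect (the tool itself works on a 2D 3DFACE mesh; the heights below are invented):

```python
def visible(profile, cam_idx, cam_h, tgt_idx):
    """True if the cell at tgt_idx is visible from a camera at cam_idx,
    raised cam_h above the terrain. Each intermediate sample is checked
    against the straight sight line from eye to target."""
    eye = profile[cam_idx] + cam_h
    tgt = profile[tgt_idx]
    n = tgt_idx - cam_idx
    for i in range(1, n):
        # height of the sight line above cell cam_idx + i
        line_h = eye + (tgt - eye) * i / n
        if profile[cam_idx + i] > line_h:
            return False   # terrain rises above the sight line
    return True

profile = [10, 12, 30, 14, 11]        # a ridge at index 2 blocks the view
print(visible(profile, 0, 1.7, 4))    # False: the ridge occludes the target
print(visible(profile, 0, 40.0, 4))   # True: an elevated camera sees over it
```

Running this test from one camera to every mesh cell gives the visible area; accumulating hit counts over multiple cameras gives the cumulative-visibility frequency mentioned in the abstract.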

Korean Semantic Role Labeling Using Semantic Frames and Synonym Clusters (의미 프레임과 유의어 클러스터를 이용한 한국어 의미역 인식)

  • Lim, Soojong;Lim, Joon-Ho;Lee, Chung-Hee;Kim, Hyun-Ki
    • Journal of KIISE
    • /
    • v.43 no.7
    • /
    • pp.773-780
    • /
    • 2016
  • Semantic information and features are very important for semantic role labeling (SRL), though many machine-learning SRL systems mainly adopt lexical and syntactic features. Previous SRL research based on semantic information is scarce because the use of semantic information is very restricted. We propose an SRL system that adopts semantic information such as named entities, word sense disambiguation, sense-based filtering of adjunct roles, synonym clusters, frame extension based on a synonym dictionary, joint rules of syntactic-semantic information, and modified verb-specific numbered roles. In our experiments, the proposed method outperforms lexical-syntactic approaches by about 3.77 (Korean PropBank) to 8.05 (Exobrain corpus) F1 points.
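The frame-extension idea can be sketched simply: role frames defined for a head predicate are copied to the other predicates in its synonym cluster, so sparse frame lexicons cover more verbs. The cluster and frame entries below are invented for illustration:

```python
# Toy sketch of synonym-cluster frame extension. The cluster and the
# role frame are hypothetical examples, not the paper's actual lexicon.
SYNONYM_CLUSTERS = {"buy": ["purchase", "acquire"]}
FRAMES = {"buy": ["ARG0:buyer", "ARG1:goods"]}

def extend_frames(frames, clusters):
    """Copy each predicate's role frame to the synonyms in its cluster,
    without overwriting frames that already exist."""
    extended = dict(frames)
    for head, roles in frames.items():
        for syn in clusters.get(head, []):
            extended.setdefault(syn, roles)
    return extended

print(sorted(extend_frames(FRAMES, SYNONYM_CLUSTERS)))
```

After extension, a sentence headed by "purchase" can be labeled even though only "buy" had an explicit frame entry.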

Linguistic Features Discrimination for Social Issue Risk Classification (사회적 이슈 리스크 유형 분류를 위한 어휘 자질 선별)

  • Oh, Hyo-Jung;Yun, Bo-Hyun;Kim, Chan-Young
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.5 no.11
    • /
    • pp.541-548
    • /
    • 2016
  • The use of social media is already essential as a source of information for listening to users' various opinions and for monitoring. We define social 'risks' as issues that exert a negative influence on public opinion in social media. This paper aims to discriminate among various linguistic features and reveal their effects for building an automatic classification model of social risks. In particular, we adopt a word embedding technique to represent linguistic clues in risk sentences. As a preliminary experiment to analyze the characteristics of individual features, we revised errors in the automatic linguistic analysis. The results show that the most important feature is named entity (NE) information, and that the best condition is obtained by combining basic linguistic features, word embeddings, and word clusters within core predicates. Experimental results under a realistic social big-data setting, including linguistic analysis errors, show 92.08% and 85.84% precision for the frequent-risk-category set and the full test set, respectively.
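One common way to turn word embeddings into classifier features, consistent with the abstract's setup, is to average the token vectors of a sentence. The tiny hand-made vectors below are placeholders for trained embeddings (e.g. word2vec):

```python
# Sketch of sentence-level embedding features for risk classification.
# EMB holds toy 2-dimensional vectors; real systems load trained embeddings.
EMB = {"fire": [0.9, 0.1], "safe": [0.1, 0.9], "risk": [0.8, 0.2]}

def sentence_vector(tokens, emb, dim=2):
    """Average the embeddings of the known tokens into one feature vector;
    unknown tokens (here 'spreads') are simply skipped."""
    vecs = [emb[t] for t in tokens if t in emb]
    if not vecs:
        return [0.0] * dim
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

print(sentence_vector(["fire", "risk", "spreads"], EMB))
```

The resulting dense vector is then concatenated with the discrete features (NE tags, word clusters) as input to the risk classifier.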

LVLN : A Landmark-Based Deep Neural Network Model for Vision-and-Language Navigation (LVLN: 시각-언어 이동을 위한 랜드마크 기반의 심층 신경망 모델)

  • Hwang, Jisu;Kim, Incheol
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.8 no.9
    • /
    • pp.379-390
    • /
    • 2019
  • In this paper, we propose a novel deep neural network model for Vision-and-Language Navigation (VLN) named LVLN (Landmark-based VLN). In addition to the visual features extracted from input images and the linguistic features extracted from natural language instructions, this model makes use of information about places and landmark objects detected in the images. The model also applies a context-based attention mechanism in order to associate each entity mentioned in the instruction, the corresponding region of interest (ROI) in the image, and the corresponding place and landmark object detected in the image with one another. Moreover, in order to improve the success rate of arriving at the target goal, the model adopts a progress monitor module that checks substantial progress toward the target goal. Through experiments with the Matterport3D simulator and the Room-to-Room (R2R) benchmark dataset, we demonstrate the high performance of the proposed model.
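The alignment step can be sketched as standard dot-product attention: score each detected landmark against the instruction entity's embedding, then weight the landmark features by the softmax of those scores. The vectors below are toy values, not LVLN's actual representations:

```python
# Minimal sketch of context-based attention over detected landmarks.
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attend(query, keys, values):
    """Weight `values` by softmax(query . key): a mentioned entity attends
    to the landmark whose detection embedding matches it best."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    w = softmax(scores)
    dim = len(values[0])
    return [sum(wi * v[i] for wi, v in zip(w, values)) for i in range(dim)]

query = [1.0, 0.0]                # embedding of the mentioned entity, e.g. "sofa"
keys = [[1.0, 0.0], [0.0, 1.0]]   # detected landmarks: sofa-like, door-like
values = [[5.0, 0.0], [0.0, 5.0]] # their visual features
print(attend(query, keys, values))
```

The output is dominated by the matching landmark's features, which is how the instruction entity, its ROI, and the detected landmark get tied together.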

Development of the Rule-based Smart Tourism Chatbot using Neo4J graph database

  • Kim, Dong-Hyun;Im, Hyeon-Su;Hyeon, Jong-Heon;Jwa, Jeong-Woo
    • International Journal of Internet, Broadcasting and Communication
    • /
    • v.13 no.2
    • /
    • pp.179-186
    • /
    • 2021
  • We have developed a smart tourism app and Instagram and YouTube contents to provide personalized tourism information and travel product information to individual tourists. In this paper, we develop a rule-based smart tourism chatbot with the khaiii (Kakao Hangul Analyzer III) morphological analyzer and the Neo4J graph database. In the proposed chatbot system, we use a morpheme analyzer, a proper noun dictionary including tourist destination names, and a general noun dictionary containing words frequently used in tourist information searches to understand the intention of the user's question. The tourism knowledge base, built using the Neo4J graph database, provides adequate answers to tourists' questions. In this paper, the nodes of Neo4J are Area, based on tourist destination addresses; Contents, with properties of tourist information; and Service, including service attribute data frequently used for searches. A Neo4J query is created based on the result of analyzing the intention of a tourist's question together with the properties of nodes and relationships in the Neo4J database, and an answer to the question is produced by searching the tourism knowledge base. We built the tourism knowledge base using more than 1,300 items of Jeju tourism information used in the smart tourism app. We plan to develop a multilingual smart tour chatbot using named entity recognition (NER), intention classification using conditional random fields (CRF), and transfer learning using pretrained language models.
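The query-construction step can be sketched as building a Cypher query from the parsed intent. The Area / Contents / Service labels come from the abstract, but the relationship names (LOCATED_IN, PROVIDES) and properties below are assumptions for illustration:

```python
# Sketch of turning a parsed tourist question into a parameterized Cypher
# query. Relationship types and property names are hypothetical.
def build_query(area, service):
    """Match tourist contents in an area that offer a given service."""
    cypher = (
        "MATCH (a:Area {name: $area})<-[:LOCATED_IN]-(c:Contents)"
        "-[:PROVIDES]->(s:Service {name: $service}) "
        "RETURN c.name"
    )
    return cypher, {"area": area, "service": service}

query, params = build_query("Jeju City", "parking")
print(query)
print(params)
```

In the full system this string and its parameters would be passed to the Neo4J driver's session, and the returned Contents names would be formatted into the chatbot's answer.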

An Automatically Extracting Formal Information from Unstructured Security Intelligence Report (비정형 Security Intelligence Report의 정형 정보 자동 추출)

  • Hur, Yuna;Lee, Chanhee;Kim, Gyeongmin;Jo, Jaechoon;Lim, Heuiseok
    • Journal of Digital Convergence
    • /
    • v.17 no.11
    • /
    • pp.233-240
    • /
    • 2019
  • In order to predict and respond to cyber attacks, many security companies quickly identify the methods, types, and characteristics of attack techniques and publish Security Intelligence Reports (SIRs) on them. However, the SIRs distributed by each company are huge and unstructured. In this paper, we propose a framework that uses five analysis techniques to structure the reports and extract key information, reducing the time required to process large unstructured SIRs. Since the SIR data have no ground-truth labels, we propose four analysis techniques based on unsupervised learning: keyword extraction, topic modeling, summarization, and document similarity. Finally, we build data for extracting threat information from SIRs and apply named entity recognition (NER) to recognize words belonging to the IP, Domain/URL, Hash, and Malware types and to determine to which type each word belongs. In total, the proposed framework applies five analysis techniques, including NER.
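The typed-entity step can be approximated with surface patterns for the indicator types named above. Real SIR pipelines would combine such patterns with a trained NER model; the regexes below are simplified sketches:

```python
# Sketch of classifying extracted indicator strings by type.
import re

PATTERNS = [
    ("IP", re.compile(r"^(?:\d{1,3}\.){3}\d{1,3}$")),
    ("Hash", re.compile(r"^[0-9a-f]{32}$|^[0-9a-f]{40}$|^[0-9a-f]{64}$")),  # MD5/SHA1/SHA256 lengths
    ("Domain/URL", re.compile(r"^(?:https?://)?[\w.-]+\.[a-z]{2,}(?:/\S*)?$")),
]

def classify(token):
    """Return the first indicator type whose pattern matches the token."""
    for label, pat in PATTERNS:
        if pat.match(token):
            return label
    return "Other"

indicators = ["192.168.0.1", "evil-domain.com",
              "d41d8cd98f00b204e9800998ecf8427e"]
print([classify(t) for t in indicators])
```

Pattern order matters here: the IP and Hash checks run before the looser Domain/URL pattern so that dotted numerics are not misread as domains.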

Considerations for Applying Korean Natural Language Processing Technology in Records Management (기록관리 분야에서 한국어 자연어 처리 기술을 적용하기 위한 고려사항)

  • Haklae, Kim
    • Journal of Korean Society of Archives and Records Management
    • /
    • v.22 no.4
    • /
    • pp.129-149
    • /
    • 2022
  • Records have temporal characteristics spanning past and present, linguistic characteristics not limited to a specific language, and various types categorized in complex ways. Processing records such as text, video, and audio across the life cycle of creation, preservation, and utilization entails exhaustive effort and cost. Primary natural language processing (NLP) technologies, such as machine translation, document summarization, named-entity recognition, and image recognition, can be widely applied to electronic records and analog digitization. In particular, Korean deep-learning-based NLP technologies effectively recognize various record types and generate records-management metadata. This paper provides an overview of Korean NLP technologies and discusses considerations for applying them in records management. The process of using NLP technologies such as machine translation and optical character recognition for the digital conversion of records is introduced with examples implemented in a Python environment. In addition, a plan to improve environmental factors and record-digitization guidelines is proposed for applying NLP technology in the records management field.
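The metadata-generation step the abstract mentions can be sketched as post-processing text produced by OCR or speech recognition. The regex-based date extraction below is a simple stand-in for a trained Korean NER model, and the field names are illustrative:

```python
# Illustrative sketch: derive records-management metadata from recognized
# text. A production system would use NER instead of these regexes.
import re

def extract_metadata(text):
    """Pull simple metadata fields out of OCR/ASR output."""
    meta = {}
    date = re.search(r"\d{4}-\d{2}-\d{2}", text)  # ISO-style date stand-in
    if date:
        meta["date"] = date.group()
    meta["length"] = len(text.split())            # rough size indicator
    return meta

print(extract_metadata("Meeting minutes recorded 2022-03-15 by the records office"))
```

Attaching such automatically derived fields at digitization time is what lets downstream archival systems search and classify records without manual description.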