• Title/Summary/Keyword: Knowledge extraction

Effects of Educational Content for Dental Extraction Using Virtual Reality Technology on Dental Extraction Knowledge, Skill and Class Satisfaction (가상현실 기술을 활용한 치아발치 교육콘텐츠가 치아발치에 관한 지식, 수행능력 및 실습만족도에 미치는 효과)

  • Park, Jong-Tae;Kim, Ji Hyo;Kim, Moon Young;Lee, Jeong Hyun
    • The Journal of the Korea Contents Association / v.19 no.2 / pp.650-660 / 2019
  • The purpose of this study is to verify the effect of tooth extraction education content using virtual reality (VR) on students' tooth extraction knowledge, performance, and satisfaction with the practical class. To this end, we divided 72 dental students into two groups: an experimental group of 30 students who used VR-based tooth extraction training content, and a control group of 42 students who trained with tooth-model-based content. First, the experimental and control groups showed no statistically significant difference in tooth extraction knowledge. Second, tooth extraction performance (before extraction, extraction, after extraction, finishing) was higher in the group trained with VR content than in the control group. Third, satisfaction with the practical class was higher in the VR group than in the control group. Therefore, practical training using VR-based tooth extraction content improves tooth extraction performance and satisfaction with practice more than the existing practice method.

A Study on Conversational AI Agent based on Continual Learning

  • Chae-Lim, Park;So-Yeop, Yoo;Ok-Ran, Jeong
    • Journal of the Korea Society of Computer and Information / v.28 no.1 / pp.27-38 / 2023
  • In this paper, we propose a conversational AI agent based on continual learning that can continuously learn and grow with new data over time. The agent consists of three main components: a task manager, a user attribute extraction model, and an auto-growing knowledge graph. When the task manager finds new data during a conversation with a user, it creates a new task using previously learned knowledge. The user attribute extraction model extracts the user's characteristics from the new task, and the auto-growing knowledge graph continuously learns new external knowledge. Unlike existing conversational AI agents, which learn from a limited dataset, the proposed method enables conversations based on continuous learning of user attributes and knowledge. As conversations with a user accumulate, the agent can respond in a more personalized way and can continue to incorporate new knowledge into its responses. We validate the feasibility of the proposed method through experiments on how the performance of dialogue generation models changes over time.

Development of Information Extraction System from Multi Source Unstructured Documents for Knowledge Base Expansion (지식베이스 확장을 위한 멀티소스 비정형 문서에서의 정보 추출 시스템의 개발)

  • Choi, Hyunseung;Kim, Mintae;Kim, Wooju;Shin, Dongwook;Lee, Yong Hun
    • Journal of Intelligence and Information Systems / v.24 no.4 / pp.111-136 / 2018
  • In this paper, we propose a methodology for extracting answers to queries from various types of unstructured documents collected from multiple sources on the web, in order to expand a knowledge base. The proposed methodology consists of the following steps. 1) Collect relevant documents from Wikipedia, Naver encyclopedia, and Naver news for queries split into "subject-predicate" form, and classify which documents are suitable. 2) Determine whether each sentence is suitable for information extraction and derive a confidence score. 3) Based on the predicate feature, extract the information from suitable sentences and derive the overall confidence of the extraction result. To evaluate the performance of the information extraction system, we selected 400 queries from SK Telecom's artificial intelligence speaker. The proposed system showed higher performance than the baseline model. The contribution of this study is a sequence tagging model based on bi-directional LSTM-CRF that uses the predicate feature of the query, which yields a robust model that maintains high recall even on various types of unstructured documents collected from multiple sources. Information extraction for knowledge base expansion must take into account the heterogeneous characteristics of source-specific document types. Previous research has the limitation that performance is poor when extracting information from document types that differ from the training data. Compared with the baseline model, the proposed methodology extracted information effectively from various types of unstructured documents.
In addition, by predicting the suitability of documents and sentences before the extraction step, this study prevents unnecessary extraction attempts on documents that do not contain the answer. This is meaningful because it provides a way to maintain precision even in an actual web environment. Because information extraction for knowledge base expansion targets unstructured documents on the real web, there is no guarantee that a given document contains the correct answer. When question answering is performed on the real web, previous machine reading comprehension studies show low precision because they frequently attempt to extract an answer even from documents that contain no correct answer. The policy of predicting the suitability of documents and sentences for information extraction therefore helps maintain extraction performance in a real web environment. The limitations of this study and future research directions are as follows. The first limitation concerns data preprocessing. In this study, the unit of knowledge extraction is determined through morphological analysis with the open source KoNLPy Python package, and extraction can fail when the morphological analysis is incorrect. Improving the extraction results requires a more advanced morpheme analyzer. The second limitation is entity ambiguity. The information extraction system in this study cannot distinguish between different entities that share the same name. If several people with the same name appear in the news, the system may not extract information about the person the query intended.
In future research, measures are needed to disambiguate people with the same name. The third limitation concerns the evaluation query data. In this study, we selected 400 user queries collected from SK Telecom's interactive artificial intelligence speaker to evaluate the performance of the information extraction system, and built an evaluation data set of 2,800 documents (400 questions × 7 articles per question: 1 Wikipedia, 3 Naver encyclopedia, 3 Naver news) by judging whether each contains a correct answer. To ensure the external validity of the study, it is desirable to evaluate the system on more queries, but this is a costly activity that must be done manually. It is also necessary to develop a Korean benchmark data set for information extraction from multi-source web documents, so that results can be evaluated more objectively.
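
The suitability check described in the abstract above (score each sentence first, extract only from sentences that pass, then combine the two confidences) can be sketched as follows. This is a minimal illustration, not the authors' system: the names `extract_with_gating`, `toy_suitability`, and `toy_extract`, the 0.5 threshold, and the use of a product to combine confidences are all assumptions, and the toy functions stand in for the learned classifier and the bi-directional LSTM-CRF tagger.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Extraction:
    answer: str
    confidence: float


def extract_with_gating(
    sentences: List[str],
    suitability: Callable[[str], float],
    extract: Callable[[str], Optional[Extraction]],
    threshold: float = 0.5,  # hypothetical cut-off, not from the paper
) -> Optional[Extraction]:
    """Return the highest-confidence extraction, or None if nothing passes the gate."""
    best: Optional[Extraction] = None
    for sentence in sentences:
        s_conf = suitability(sentence)
        if s_conf < threshold:
            continue  # skip sentences unlikely to contain the answer
        result = extract(sentence)
        if result is None:
            continue
        # Assumption: overall confidence is the product of the two scores.
        overall = Extraction(result.answer, s_conf * result.confidence)
        if best is None or overall.confidence > best.confidence:
            best = overall
    return best


# Toy stand-ins for the learned models (hypothetical).
def toy_suitability(sentence: str) -> float:
    return 0.9 if "born in" in sentence else 0.1


def toy_extract(sentence: str) -> Optional[Extraction]:
    marker = "born in "
    if marker not in sentence:
        return None
    answer = sentence.split(marker, 1)[1].rstrip(".")
    return Extraction(answer, 0.8)
```

Returning `None` when no sentence passes the gate is the point of the design: the caller can treat "no answer in this document" as a valid outcome instead of forcing an extraction, which is how the abstract argues precision is preserved on real web documents.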

Linking Korean Predicates to Knowledge Base Properties (한국어 서술어와 지식베이스 프로퍼티 연결)

  • Won, Yousung;Woo, Jongseong;Kim, Jiseong;Hahm, YoungGyun;Choi, Key-Sun
    • Journal of KIISE / v.42 no.12 / pp.1568-1574 / 2015
  • Relation extraction plays a role in transforming a sentence into knowledge base form. In this paper, we focus on the predicates in a sentence and aim to identify the knowledge base properties needed to express the relationship between entities, which enables a computer to understand the meaning of a sentence more clearly. Distant supervision is a well-known approach to relation extraction; here it is used to lexicalize knowledge base properties by automatically generating a large amount of labeled data. In other words, the predicate in a sentence is linked, or mapped, to the candidate properties defined by the ontologies in the knowledge base. This lexical and ontological linking provides a way to generate structured information and a basis for enriching the knowledge base.
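
As a rough illustration of the distant supervision idea in the abstract above, the sketch below labels any sentence that mentions both entities of a knowledge base triple with that triple's property, then counts which predicates co-occur with which properties. It is a simplified sketch under stated assumptions: the function names, the property names in the usage note, and the English whitespace-style tokenization are hypothetical, and Korean text would need morphological analysis instead of the regular expression used here.

```python
import re
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple

Triple = Tuple[str, str, str]  # (subject, property, object)


def distant_label(sentences: Iterable[str], triples: Iterable[Triple]) -> List[Tuple[str, str]]:
    """Label a sentence with a property when it mentions both entities of a triple."""
    triple_list = list(triples)
    labeled = []
    for sentence in sentences:
        for subj, prop, obj in triple_list:
            if subj in sentence and obj in sentence:
                labeled.append((sentence, prop))
    return labeled


def predicate_counts(
    labeled: Iterable[Tuple[str, str]], predicates: Iterable[str]
) -> Dict[str, Counter]:
    """Count how often each candidate predicate co-occurs with each property."""
    predicate_list = list(predicates)
    counts: Dict[str, Counter] = defaultdict(Counter)
    for sentence, prop in labeled:
        tokens = set(re.findall(r"\w+", sentence.lower()))
        for pred in predicate_list:
            if pred in tokens:
                counts[pred][prop] += 1
    return counts
```

For example, with the hypothetical triple `("Sejong", "birthPlace", "Hanseong")` and the sentence "Sejong was born in Hanseong.", the sentence is labeled `birthPlace`, and the predicate "born" accumulates one count toward that property. The most frequent property per predicate then serves as a candidate link.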

The Component Extraction Using Knowledge-Base from Name-Card (명함에서 지식베이스를 이용한 구성요소의 추출)

  • 이성범;남궁재찬
    • The Journal of Korean Institute of Communications and Information Sciences / v.18 no.8 / pp.1201-1212 / 1993
  • This paper presents a method for automatically extracting data items from name cards using a knowledge base. Our approach uses knowledge of the structural and relational information between data items and elements in name cards. To describe hierarchical knowledge, we use a frame structure, and we propose a domain classification algorithm that extracts items and groups candidate domains from name cards. In experiments on 100 samples, we obtained an extraction rate of 95%.

Environment for Translation Domain Adaptation and Continuous Improvement of English-Korean Machine Translation System

  • Kim, Sung-Dong;Kim, Namyun
    • International Journal of Internet, Broadcasting and Communication / v.12 no.2 / pp.127-136 / 2020
  • This paper presents an environment for a rule-based English-Korean machine translation system that supports translation domain adaptation and continuous improvement of translation quality. Both purposes require a corpus from which the information needed for translation is acquired. The environment consists of a corpus construction part and a translation knowledge extraction part. The corpus construction part crawls news articles from newspaper sites. The extraction part builds translation knowledge such as newly created words, compound words, collocation information, and distributional word representations. For translation domain adaptation, a corpus for the domain should be built and the translation knowledge constructed from it. For continuous improvement, the corpus needs to be continuously expanded and the translation knowledge enhanced from the expanded corpus. The proposed web-based environment is expected to facilitate domain adaptation and translation system improvement.

A Study on the Extraction of Knowledge for Image Understanding (영상이해를 위한 지식유출에 관한 연구)

  • 곽윤식;이대영
    • The Journal of Korean Institute of Communications and Information Sciences / v.18 no.5 / pp.757-772 / 1993
  • This paper describes knowledge extraction for image understanding in a knowledge-based system. The current set of low-level processes operates on numerical pixel arrays to segment the image into regions, convert the image into a directional image, and calculate features for these regions. The current set of intermediate-level processes operates on the results of earlier knowledge sources to build more complex representations of the data. We group these processes into three categories: feature-based classification, geometric token relation, and perceptual organization and grouping.

Hybrid Intelligent Web Recommendation Systems Based on Web Data Mining and Case-Based Reasoning

  • Kim, Jin-Sung
    • Journal of the Korean Institute of Intelligent Systems / v.13 no.3 / pp.366-370 / 2003
  • In this research, we propose a hybrid intelligent Web recommendation system based on Web data mining and case-based reasoning (CBR). One of the important research topics in the field of Internet business is blending artificial intelligence (AI) techniques with knowledge discovery in databases (KDD) or data mining (DM). Data mining is used as an efficient mechanism for reasoning about association knowledge between goods and customers' preferences. In data mining, features, called attributes, are often selected primarily for mining the association knowledge between related products. Therefore, most research on Web data mining has used association rule extraction. However, association rule extraction has a potential limitation in the flexibility of its reasoning: if some goods are not retrieved by association-rule-based reasoning, no further information can be presented to the customer. To overcome this limitation, we combined CBR with Web data mining. CBR is an AI technique used for problems that are difficult to solve with logical (association) rules. Web-log data gathered from a real-world Web shopping mall was used to illustrate the quality of the proposed hybrid recommendation mechanism. This Web shopping mall sells remote-controlled plastic models such as cars, yachts, airplanes, and helicopters. The experimental results showed that the hybrid recommendation mechanism could reflect both association knowledge and implicit human knowledge extracted from cases in Web databases.
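
One way to picture the hybrid mechanism in the abstract above is a recommender that tries association rules first and falls back to the most similar past case when no rule fires. The sketch below is an assumption-laden illustration, not the paper's implementation: the function names, the rule and case representations, and the use of Jaccard similarity for case retrieval are all hypothetical choices.

```python
from typing import Dict, FrozenSet, List, Set

Rules = Dict[FrozenSet[str], Set[str]]  # antecedent items -> recommended items


def recommend_by_rules(basket: Set[str], rules: Rules) -> Set[str]:
    """Fire every rule whose antecedent is contained in the basket."""
    recs: Set[str] = set()
    for antecedent, consequent in rules.items():
        if antecedent <= basket:
            recs |= consequent - basket
    return recs


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Set overlap in [0, 1]; an assumed similarity measure for case retrieval."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_by_cases(basket: Set[str], cases: List[Set[str]]) -> Set[str]:
    """CBR fallback: reuse the items of the most similar past session."""
    if not cases:
        return set()
    nearest = max(cases, key=lambda case: jaccard(basket, case))
    return nearest - basket


def recommend(basket: Set[str], rules: Rules, cases: List[Set[str]]) -> Set[str]:
    """Use association rules when they apply, otherwise fall back to cases."""
    recs = recommend_by_rules(basket, rules)
    return recs if recs else recommend_by_cases(basket, cases)
```

With a single rule `{"rc_car"} -> {"battery"}` and a past session `{"rc_yacht", "glue"}`, a customer viewing only `rc_yacht` matches no rule, so the fallback returns `{"glue"}` from the nearest case. This is the flexibility gap the abstract says pure rule-based reasoning leaves open.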

BIOLOGY ORIENTED TARGET SPECIFIC LITERATURE MINING FOR GPCR PATHWAY EXTRACTION (GPCR 경로 추출을 위한 생물학 기반의 목적지향 텍스트 마이닝 시스템)

  • Kim, Eun-Ju;Jung, Seol-Kyoung;Yi, Eun-Ji;Lee, Gary-Geunbae;Park, Soo-Jun
    • Proceedings of the Korean Society for Bioinformatics Conference / 2003.10a / pp.86-94 / 2003
  • Electronically available biological literature has accumulated exponentially over time, and research on automatically acquiring knowledge from this data with text mining technology has become increasingly active. However, most previous research is technology oriented and not well focused on a practical extraction target, which results in low performance and makes the systems inconvenient for bio-researchers to actually use. In this paper, we propose a more biology-oriented, target-domain-specific text mining system, the POSTECH bio-text mining system (POSBIOTM), for signal transduction pathway extraction, especially for the G protein-coupled receptor (GPCR) pathway. To reflect more domain knowledge, we specify a concrete target for pathway extraction and define a minimal pathway domain ontology. Under this conceptual model, POSBIOTM extracts the interactions and entities of pathways from full biological articles using a machine-learning-oriented extraction method and visualizes the pathways using the JDesigner module provided in the Systems Biology Workbench (SBW) [14].

Improving accessibility and distinction between negative results in biomedical relation extraction

  • Sousa, Diana;Lamurias, Andre;Couto, Francisco M.
    • Genomics & Informatics / v.18 no.2 / pp.20.1-20.4 / 2020
  • Accessible negative results are relevant for researchers and clinicians, not only to limit their search space but also to prevent the costly re-exploration of research hypotheses. However, most biomedical relation extraction datasets do not seek to distinguish between a false and a negative relation between two biomedical entities. Furthermore, datasets created using distant supervision techniques also contain some false negative relations that are in fact undocumented/unknown relations (missing from a knowledge base). We propose to improve the distinction between these concepts by revising a subset of the relations marked as false in the phenotype-gene relations corpus, and we take the first steps toward automatically distinguishing between false (F), negative (N), and unknown (U) results. Our work resulted in a sample of 127 manually annotated FNU relations and a weighted F1 of 0.5609 for their automatic distinction. This work was developed during the 6th Biomedical Linked Annotation Hackathon (BLAH6).
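
For readers unfamiliar with the weighted F1 metric reported in the abstract above, the sketch below computes it for a three-class F/N/U labeling using the standard definition: per-class F1 averaged with weights proportional to each class's share of the true labels. The function name and the toy labels in the usage note are illustrative assumptions and are unrelated to the paper's 127 annotated relations.

```python
from collections import Counter
from typing import Sequence


def weighted_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Per-class F1 averaged with weights proportional to true-label support."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp  # true instances of this class that were missed
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n / total) * f1
    return score
```

With true labels `["F", "F", "N", "U"]` and predictions `["F", "N", "N", "U"]`, the per-class F1 values are 2/3 (F), 2/3 (N), and 1 (U), so the weighted score is 0.5 × 2/3 + 0.25 × 2/3 + 0.25 × 1 = 0.75. Weighting by support keeps a rare class from dominating the average, which matters when one relation type is far more common than the others.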