• Title/Summary/Keyword: semantic relation

Design of WWW IR System Based on Keyword Clustering Architecture (색인어 말뭉치 처리를 기반으로 한 웹 정보검색 시스템의 설계)

  • 송점동;이정현;최준혁
    • The Journal of Information Technology / v.1 no.1 / pp.13-26 / 1998
  • In general information retrieval systems, improper keywords are often extracted and the search results differ from the user's intent, because the systems select keywords using only term frequency information and do not consider their meanings. This implies that precision can be improved only to a limited extent without considering keyword semantics, because recall and precision are inversely related. In this paper, a system that improves precision without decreasing recall is designed and implemented by introducing a client user module that sends feedback reflecting the user's intention to the server. For this purpose, keywords are selected using relative term frequency and inverse document frequency, and co-occurring words are extracted from the original documents. The keywords are then clustered by their semantics using the calculated mutual information. The system can reject inappropriate documents using the segmented semantic information according to feedback from the client user module. Consequently, the precision of the system is improved without decreasing recall.
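The keyword selection and clustering steps named in this abstract can be illustrated with a minimal sketch. It assumes documents are already tokenized, uses the textbook relative TF x IDF weighting, and uses document-level pointwise mutual information as the "mutual information" score; the function names `tf_idf` and `pmi` and the toy corpus are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter
from itertools import combinations


def tf_idf(docs):
    """Return one {term: weight} dict per document (relative TF x IDF)."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return weights


def pmi(docs):
    """Pointwise mutual information (base 2) for term pairs sharing a document."""
    n_docs = len(docs)
    term_df = Counter(t for doc in docs for t in set(doc))
    pair_df = Counter(
        pair for doc in docs for pair in combinations(sorted(set(doc)), 2)
    )
    return {
        (a, b): math.log2(
            (n_ab / n_docs) / ((term_df[a] / n_docs) * (term_df[b] / n_docs))
        )
        for (a, b), n_ab in pair_df.items()
    }


# Toy corpus (hypothetical): each document is a list of index terms.
docs = [
    ["web", "search", "keyword"],
    ["web", "search", "cluster"],
    ["music", "cluster", "keyword"],
]
weights = tf_idf(docs)
scores = pmi(docs)
```

Pairs with a high positive score, such as ("search", "web") here, would be candidates for the same cluster; pairs that never share a document simply have no entry.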

TAKES: Two-step Approach for Knowledge Extraction in Biomedical Digital Libraries

  • Song, Min
    • Journal of Information Science Theory and Practice / v.2 no.1 / pp.6-21 / 2014
  • This paper proposes a novel knowledge extraction system, TAKES (Two-step Approach for Knowledge Extraction System), which integrates advanced techniques from Information Retrieval (IR), Information Extraction (IE), and Natural Language Processing (NLP). In particular, TAKES adopts a novel keyphrase extraction-based query expansion technique to collect promising documents. It also uses a Conditional Random Field-based machine learning technique to extract important biological entities and relations. TAKES is applied to biological knowledge extraction, particularly retrieving promising documents that contain Protein-Protein Interaction (PPI) and extracting PPI pairs. TAKES consists of two major components: DocSpotter, which is used to query and retrieve promising documents for extraction, and a Conditional Random Field (CRF)-based entity extraction component known as FCRF. The present paper investigated research problems addressing the issues with a knowledge extraction system and conducted a series of experiments to test our hypotheses. The findings from the experiments are as follows: First, the author verified, using three different test collections to measure the performance of our query expansion technique, that DocSpotter is robust and highly accurate when compared to Okapi BM25 and SLIPPER. Second, the author verified that our relation extraction algorithm, FCRF, is highly accurate in terms of F-Measure compared to four other competitive extraction algorithms: Support Vector Machine, Maximum Entropy, Single POS HMM, and Rapier.
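The abstract reports extraction accuracy in terms of F-Measure. As a reminder of how that score is computed for extracted relation pairs, here is a minimal sketch; it assumes interaction pairs are unordered, and the helper names `pair` and `f_measure` and the protein identifiers are hypothetical, not data from the paper.

```python
def pair(a, b):
    """Represent an unordered interaction pair so that (A, B) equals (B, A)."""
    return frozenset((a, b))


def f_measure(predicted, gold, beta=1.0):
    """F-beta score of a predicted set of pairs against a gold-standard set."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical identifiers, for illustration only.
gold = {pair("P1", "P2"), pair("P3", "P4"), pair("P5", "P6")}
predicted = {pair("P2", "P1"), pair("P3", "P4"), pair("P7", "P8")}
score = f_measure(predicted, gold)
```

With two of three predictions correct and two of three gold pairs recovered, precision and recall are both 2/3, so the F1 score is also 2/3.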

A Study on Legal Ontology Construction (법령 온톨로지 구축에 관한 연구)

  • Jo, Dae Woong;Kim, Myung Ho
    • Journal of the Korea Society of Computer and Information / v.19 no.11 / pp.105-113 / 2014
  • In this paper, we propose OWL DL mapping rules for constructing a legal ontology, based on an analysis of the relationships between the structural features and the elements of statutes. The proposed mapping rules represent the structure of domestic statutes, the unique attributes of a statute, and the reference relations between laws in the TBox; legal sentences are analyzed, their sentence pattern types are selected, and they are expressed in the ABox. The proposed mapping rules transform domestic legal documents into information that a computer can process, which can be used for a legal knowledge base.

The Effect of Word Frequency on Noun Definitions (단어빈도가 명사정의하기에 미치는 효과)

  • Lee, Chan-Jong
    • The Journal of the Acoustical Society of Korea / v.27 no.6 / pp.303-308 / 2008
  • The purpose of the present study is to investigate whether word frequency has a significant influence on noun definitions in Korean. The experimental group consisted of 80 students from elementary school, middle school, high school, and university. They rated familiarity and wrote definitions for nouns. Noun definitions were analyzed with semantic categories such as "use/purpose," "description," "association/relation," "partial explanation," "explanation," "error," "partial explanation-attribute," "partial explanation-specific class," "partial explanation-nonspecific class," "explanation-specific class," and "explanation-nonspecific class." The results showed that the participants were familiar with high-frequency nouns. "EXPL" categories that use class terms or critical attributes were used more frequently in definitions of high-frequency nouns than in definitions of low-frequency nouns. Use of these categories increased with age, and errors decreased with age. Word frequency had a significant influence on noun definitions.

Analysis of News Agenda Using Text mining and Semantic Network Analysis: Focused on COVID-19 Emotions (텍스트 마이닝과 의미 네트워크 분석을 활용한 뉴스 의제 분석: 코로나 19 관련 감정을 중심으로)

  • Yoo, So-yeon;Lim, Gyoo-gun
    • Journal of Intelligence and Information Systems / v.27 no.1 / pp.47-64 / 2021
  • The global spread of COVID-19 has not only affected many parts of our daily life but has also had a huge impact on many areas, including the economy and society. As the number of confirmed cases and deaths increases, medical staff and the public are said to be experiencing psychological problems such as anxiety, depression, and stress. The collective tragedy that accompanies the epidemic raises fear and anxiety, which is known to cause enormous disruptions to the behavior and psychological well-being of many. Long-term negative emotions can reduce people's immunity and destroy their physical balance, so it is essential to understand people's psychological state during COVID-19. This study suggests a method of monitoring media news that reflects the present situation, in which efforts toward not only physical but also psychological quarantine are required during the prolonged COVID-19 situation. Moreover, it presents how a simpler method of analyzing social media networks can be applied to such cases. The aim of this study is to assist health policymakers in fast and complex decision-making processes. News plays a major role in setting the policy agenda. Among various major media, news headlines are considered important in the field of communication science as a summary of the core content that the media wants to convey to its audience. The news data used in this study were easily collected using "Bigkinds", which was created by integrating big data technology. With the collected news data, keywords were classified through text mining, and the relationships between words were visualized through semantic network analysis between keywords. Using the KrKwic program, a Korean semantic network analysis tool, text mining was performed and the frequency of words was calculated to easily identify keywords.
The frequency of words appearing in keywords of articles related to COVID-19 emotions was checked and visualized in a word cloud. 'China', 'anxiety', 'situation', 'mind', 'social', and 'health' appeared with high frequency in relation to the emotions of COVID-19. In addition, UCINET, a specialized social network analysis program, was used to analyze connection centrality and perform cluster analysis, and the graph was visualized using Net Draw. The analysis of connection centrality showed that the most central keywords in the keyword-centric network were 'psychology', 'COVID-19', 'blue', and 'anxiety'. The network of co-occurrence frequencies among the keywords appearing in the news headlines was visualized as a graph. The thickness of a line on the graph is proportional to the frequency of co-occurrence, so two words that frequently appear together are joined by a thick line. The 'COVID-blue' pair is displayed with the boldest line, and the 'COVID-emotion' and 'COVID-anxiety' pairs are displayed with relatively thick lines. 'Blue' in relation to COVID-19 is a word that means depression, and it was confirmed that COVID-19 and depression are keywords that should be of interest now. The research methodology used in this study has the advantage of measuring social phenomena and changes quickly while reducing costs. In this study, by analyzing news headlines, we were able to identify people's feelings and perceptions on issues related to COVID-19 depression, and to identify the main agendas to be analyzed by deriving important keywords. By presenting and visualizing the subjects and important keywords related to COVID-19 emotions at a glance, medical policy managers can be given a variety of perspectives when identifying and researching the phenomenon.
The results are expected to serve as basic data for support, treatment, and service development for psychological quarantine issues related to COVID-19.
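The co-occurrence network and centrality analysis described in this abstract can be sketched in a few lines of plain Python. This is an illustration only: the study used KrKwic, UCINET, and Net Draw, whereas the sketch assumes that "connection centrality" corresponds to normalized degree centrality, and the function names and sample headlines are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cooccurrence_edges(headlines):
    """Count how often each pair of keywords appears in the same headline."""
    edges = Counter()
    for words in headlines:
        for a, b in combinations(sorted(set(words)), 2):
            edges[(a, b)] += 1
    return edges


def degree_centrality(edges):
    """Share of the other nodes that each node is directly connected to."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


# Hypothetical keyword lists, one per headline.
headlines = [
    ["COVID", "blue", "anxiety"],
    ["COVID", "blue"],
    ["COVID", "emotion"],
]
edges = cooccurrence_edges(headlines)
centrality = degree_centrality(edges)
```

The edge weight plays the role of line thickness in the graph: here the ("COVID", "blue") pair has weight 2, so it would be drawn as the thickest line, and "COVID" has the highest centrality because it is linked to every other keyword.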

A Comparison of the Freshmen's Cognitive Frame about the 'Crisis of Earth' ('위기의 지구'에 대한 인지프레임 비교: 대학교 신입생들 대상으로)

  • Chung, Duk Ho;Choi, Hyeon A;Park, Seon Ok
    • Journal of the Korean earth science society / v.37 no.2 / pp.117-131 / 2016
  • The purpose of this study was to compare freshmen's cognitive frames about the 'Crisis of the Earth' according to whether they took the Earth Science I course in high school, in order to confirm whether the frames reasonably reflect the goal of the curriculum. Data were collected from 67 freshmen who had graduated from high school. All participants were asked to express the 'Crisis of the Earth' in a painting with an explanation, and we then picked meaningful units from the paintings. We analyzed the words and frames presented in the paintings using Semantic Network Analysis. The results are as follows. First, when both groups (one that took the course and one that did not) built their cognitive frames for the 'crisis of the Earth', they reasonably connected the areas that make up the global environment and understood that the relations among them change constantly through interaction with each other. Second, when configuring a cognitive frame about the 'crisis of the Earth', both groups reflected the characteristics of interrelationship with human activities. In particular, the group that took Earth Science I fully reflected the goal of the curriculum. It is suggested that students recognize the 'crisis of the Earth' not only from a cosmic perspective but also from the Earth's interior, since most students strongly connected it to phenomena of the Earth's interior rather than to the Earth's outward symptoms. In addition, it is recommended that the Earth science curriculum put more emphasis on understanding the importance of solving the problems of the Earth's crisis.

Perception of the Gifted Science Students' Mothers on Giftedness (과학영재를 둔 어머니들의 영재성에 대한 인식)

  • Chung, Duk-Ho;Park, Seon-Ok;Yoo, Hyo-Hyun;Park, Jeong-Ju
    • Journal of Gifted/Talented Education / v.24 no.4 / pp.561-576 / 2014
  • The purpose of this study is to investigate how mothers of gifted science students perceive giftedness, compared with the "Scale for Rating the Behavioral Characteristics of Superior Students-R (SRBCSS-R)". To this end, a survey on giftedness was conducted with 18 mothers of gifted elementary school science students and 32 mothers of gifted middle school science students. The words and frames of the survey responses were analyzed using Semantic Network Analysis. The results are as follows. In the word analysis, the mothers of gifted elementary school science students connected giftedness with reading, science, making something, etc., whereas the mothers of gifted middle school science students connected giftedness with problem, solving problem, mathematics, etc. In the frame analysis, the mothers of gifted elementary school science students showed a strong connection with categories such as creativity and motivation, whereas the mothers of gifted middle school science students were more inclined towards categories such as learning and motivation. That is to say, the mothers of gifted science students perceive giftedness with respect to only some of the elements of the "Scale for Rating the Behavioral Characteristics of Superior Students-R". Therefore, mothers of gifted science students need a correct understanding of giftedness, and parent education is needed for appropriate gifted science education.

An Algorithm for Translation from RDB Schema Model to XML Schema Model Considering Implicit Referential Integrity (묵시적 참조 무결성을 고려한 관계형 스키마 모델의 XML 스키마 모델 변환 알고리즘)

  • Kim, Jin-Hyung;Jeong, Dong-Won;Baik, Doo-Kwon
    • Journal of KIISE:Databases / v.33 no.5 / pp.526-537 / 2006
  • The most representative approach to storing XML data efficiently is to store it in relational databases. The merit of this approach is that it readily accommodates the reality that most data are still stored in relational databases. This approach requires converting XML data into relational data or relational data into XML data. The most important issue in the translation is to reflect the structural and semantic relations of the RDB exactly in the XML schema model. Many studies have addressed this issue, but those methods have several problems: they do not cover structural semantics, or they support only explicit referential integrity relations. In this paper, we propose an algorithm for extracting implicit referential integrities automatically. We also design and implement the suggested algorithm and carry out comparative evaluations using translated XML documents. The proposed algorithm offers several advantages, such as improved extraction and conversion of semantic information and sufficient referential integrity of the target databases. Using the suggested algorithm, we can fully guarantee not only the explicit but also the implicit referential integrities of the initial relational schema model. That is, we can create a more exact XML schema model through the suggested algorithm.
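The abstract does not say how implicit referential integrities are extracted, so the following is only one plausible heuristic and not the paper's algorithm: a column is flagged as a candidate implicit foreign key when all of its non-null values appear in another table's primary key (an inclusion dependency). The function name, the input layout, and the sample tables are all hypothetical.

```python
def candidate_implicit_fks(tables, primary_keys):
    """Find columns whose values are all contained in another table's key.

    tables:       {table_name: {column_name: [values]}}
    primary_keys: {table_name: key_column_name}
    Returns a list of (child_table, child_column, parent_table, parent_key).
    """
    candidates = []
    for child, columns in tables.items():
        for col, values in columns.items():
            child_vals = {v for v in values if v is not None}
            if not child_vals:
                continue  # an empty column tells us nothing
            for parent, pk in primary_keys.items():
                if parent == child:
                    continue  # self-references are ignored in this sketch
                if child_vals <= set(tables[parent][pk]):
                    candidates.append((child, col, parent, pk))
    return candidates


# Hypothetical schema with no declared foreign key from emp.dept_no to dept.id.
tables = {
    "dept": {"id": [1, 2, 3], "name": ["R&D", "Sales", "HR"]},
    "emp": {"id": [10, 11, 12], "dept_no": [1, 1, 3], "name": ["Kim", "Lee", "Park"]},
}
primary_keys = {"dept": "id", "emp": "id"}
found = candidate_implicit_fks(tables, primary_keys)
```

A value-inclusion check like this can produce false positives on small tables, so in practice each candidate would need confirmation (for example by matching column names or data types) before being written into the generated XML schema as a key reference.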

A 3-Layered Information Integration System based on MDRs and Ontology (MDR과 온톨로지를 결합한 3계층 정보 통합 시스템)

  • Baik, Doo-Kwon;Choi, Yo-Han;Park, Sung-Kong;Lee, Jeong-Oog;Jeong, Dong-Won
    • The KIPS Transactions:PartD / v.10D no.2 / pp.247-260 / 2003
  • To share and standardize information, especially in database environments, an MDR (Metadata Registry) can be used to integrate various heterogeneous databases within a particular domain. However, because of discrepancies in data element representation between organizations, global information integration is not easy. In addition, users searching for integrated information on the Web have limited access to schema information for the underlying source databases. To solve these problems, we present a 3-layered Information Integration System (LI2S) based on MDRs and Ontology. The purpose of the proposed architecture is to define an information integration model that combines the MDR standard specification with the functionality of an ontology for concepts and relations. Adopting agent technology in the proposed model plays a key role in supporting the hierarchical and independent information integration architecture. The ontology serves as a semantic network, from which concepts are extracted from the user query and relationships between MDRs are established for data elements. The MDR and Knowledge Base are used to resolve discrepancies in data element representation between MDRs. Based on this architectural concept, LI2S was designed and implemented.

Semantic Topic Selection Method of Document for Classification (문서분류를 위한 의미적 주제선정방법)

  • Ko, Kwang-Sup;Kim, Pan-Koo;Lee, Chang-Hoon;Hwang, Myung-Gwon
    • Journal of the Korea Institute of Information and Communication Engineering / v.11 no.1 / pp.163-172 / 2007
  • The web, as a global network, includes text documents, video, sound, etc., and connects distributed information using links. Through the development of the web, abundant information has accumulated, most of it in text-based documents. Most users use the web to retrieve the information they want. Accordingly, numerous studies have sought to retrieve text documents using many methods, such as probability, statistics, vector similarity, and Bayesian approaches. These studies, however, could not consider both the subject and the semantics of documents. As a result, users have to search again by hand. It is especially hard to find Korean documents because research on Korean document classification is insufficient. To overcome these problems, we propose a Korean document classification method for semantic retrieval. The method first extracts the TF value and RV value of the concepts included in a document and maps them onto U-WIN, a Korean vocabulary dictionary, to select the topic of the document. The method can classify documents semantically, and its efficiency was shown through experiments.