• Title/Summary/Keyword: Language and Knowledge Engineering

A Study on Knowledge Entity Extraction Method for Individual Stocks Based on Neural Tensor Network (뉴럴 텐서 네트워크 기반 주식 개별종목 지식개체명 추출 방법에 관한 연구)

  • Yang, Yunseok;Lee, Hyun Jun;Oh, Kyong Joo
    • Journal of Intelligence and Information Systems / v.25 no.2 / pp.25-38 / 2019
  • Selecting high-quality information that matches users' interests and needs from an overflow of content is becoming increasingly important. Amid this flood of information, search is moving toward reflecting the user's intent in the results rather than treating an information request as a simple string. Large IT companies such as Google and Microsoft are also focusing on knowledge-based technologies, including search engines, that provide users with satisfaction and convenience. Finance in particular is a field where text data analysis is expected to be useful, because new information is generated constantly and the earlier information is obtained, the more valuable it is. Automatic knowledge extraction can be effective in areas such as the financial sector, where the flow of information is vast and new information keeps emerging. However, automatic knowledge extraction faces several practical difficulties. First, it is difficult to build corpora for different fields with the same algorithm, and it is difficult to extract good-quality triples. Second, producing human-labeled text data becomes harder as the extent and scope of knowledge grow and patterns are constantly updated. Third, performance evaluation is difficult because of the characteristics of unsupervised learning. Finally, defining the problem for automatic knowledge extraction is not easy because the concept of knowledge is itself ambiguous. To overcome these limits and improve the semantic performance of stock-related information search, this study extracts knowledge entities using a neural tensor network and evaluates the resulting performance. Unlike previous work, this study aims to extract knowledge entities related to individual stock items.
Various but relatively simple data processing methods are applied in the presented model to solve the problems of previous research and to enhance the model's effectiveness. These processes give the study three points of significance. First, it presents a practical and simple automatic knowledge extraction method that can be applied. Second, it shows that performance evaluation is possible through a simple problem definition. Finally, the expressiveness of the knowledge is increased by generating input data on a sentence basis without complex morphological analysis. The results of the empirical analysis and an objective performance evaluation method are also presented. For the empirical study confirming the usefulness of the presented model, experts' reports on 30 individual stocks, the top 30 items by publication frequency from May 30, 2017 to May 21, 2018, are used. The total number of reports is 5,600; 3,074 reports, about 55% of the total, are designated as the training set, and the remaining 45% are designated as the testing set. Before the model is constructed, all reports in the training set are classified by stock, and their entities are extracted using KKMA, a named entity recognition tool. For each stock, the top 100 entities by appearance frequency are selected and vectorized using one-hot encoding. Then, using the neural tensor network, one score function is trained per stock. When a new entity from the testing set appears, its score is calculated with every score function, and the stock whose function gives the highest score is predicted as the item related to that entity. To evaluate the presented model, we assess its predictive power and whether the score functions are well constructed by calculating the hit ratio over all reports in the testing set.
As a result of the empirical study, the presented model shows a 69.3% hit ratio on the testing set, which consists of 2,526 reports. This hit ratio is meaningfully high despite some constraints on conducting the research. Looking at the prediction performance for each stock, only three stocks, LG ELECTRONICS, KiaMtr, and Mando, show performance far below the average. This result may be due to interference from other similar items and to the generation of new knowledge. In this paper, we propose a methodology for finding the key entities, or combinations of entities, needed to search for related information in accordance with the user's investment intention. Graph data is generated using only the named entity recognition tool and applied to the neural tensor network without a learning corpus or word vectors for the field. The empirical test confirms the effectiveness of the presented model as described above. However, some limits and points to complement remain. Most notably, the fact that model performance is especially poor for only a few stocks shows the need for further research. Finally, the empirical study confirms that the learning method presented here can be used to match new text information semantically with the related stocks.
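The prediction and evaluation steps described in this abstract can be sketched as follows. This is a minimal illustration in Python, not the authors' implementation: `ntn_score` uses the generic neural tensor network scoring form, and the abstract does not say how entity and stock vectors are paired or trained, so the function names (`one_hot`, `ntn_score`, `predict_stock`, `hit_ratio`), the two toy stocks, and the hand-written score functions are all assumptions made for illustration.

```python
import math


def one_hot(index, size):
    """Return a one-hot list of length `size` with 1.0 at position `index`."""
    vec = [0.0] * size
    vec[index] = 1.0
    return vec


def ntn_score(e1, e2, W, V, b, u):
    """Generic NTN score: sum_k u[k] * tanh(e1^T W[k] e2 + V[k] . [e1; e2] + b[k])."""
    concat = e1 + e2  # list concatenation, i.e. the stacked vector [e1; e2]
    score = 0.0
    for k in range(len(u)):
        bilinear = sum(
            e1[i] * W[k][i][j] * e2[j]
            for i in range(len(e1))
            for j in range(len(e2))
        )
        linear = sum(V[k][m] * concat[m] for m in range(len(concat)))
        score += u[k] * math.tanh(bilinear + linear + b[k])
    return score


def predict_stock(entity_vec, score_fns):
    """Return the stock whose score function gives the entity the highest score."""
    return max(score_fns, key=lambda stock: score_fns[stock](entity_vec))


def hit_ratio(predicted, actual):
    """Fraction of test items whose predicted stock equals the true stock."""
    if not actual:
        raise ValueError("empty test set")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Toy usage with two hypothetical stocks and hand-written score functions
# standing in for trained per-stock NTN score functions.
toy_score_fns = {
    "STOCK_A": lambda e: e[0] - e[1],
    "STOCK_B": lambda e: e[1] - e[0],
}
toy_entities = [one_hot(0, 2), one_hot(1, 2), one_hot(0, 2)]
toy_labels = ["STOCK_A", "STOCK_B", "STOCK_B"]
toy_predictions = [predict_stock(e, toy_score_fns) for e in toy_entities]
print(toy_predictions, hit_ratio(toy_predictions, toy_labels))
```

In a setting like the one described, each entry of `score_fns` would presumably wrap a trained `ntn_score` with that stock's own parameters; the toy lambdas only make the argmax and hit-ratio logic easy to follow.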

Disaster Health Literacy of Middle-aged Women

  • Seifi, Bahar;Ghanizadeh, Ghader;Seyedin, Hesam
    • Journal of Menopausal Medicine / v.24 no.3 / pp.150-154 / 2018
  • As disasters have increased in recent years, disaster health literacy is becoming more important for populations such as middle-aged women, who face both a developmental crisis (menopause) and situational crises (disasters). Given the growing elderly population, it is imperative to seriously consider aging women's healthcare and their educational needs relating to emergencies and disasters. The purpose of this study was to clarify the importance of disaster health literacy for middle-aged women. This study is a review of the literature using PubMed, ScienceDirect, Web of Science, Google Scholar, SCOPUS, OVID, ProQuest, Springer, and Wiley. Data were collected with keywords related to the research topic, combined with the Boolean operators OR and AND: ("Women's health" OR "Geriatric health") AND ("Health literacy" OR "Disaster health literacy" OR "Disaster prevention literacy" OR "Risk knowledge" OR "Knowledge management") AND ("Disasters" OR "Risk" OR "Crises"). We reviewed full-text English-language articles published from November 2011 to November 2017. Additional references were identified from the reference lists of targeted publications, review articles, and books. This review demonstrated that disaster health literacy is critical for elderly women, because they may suffer from physical and psychological problems triggered by developmental crises such as menopause and situational crises such as disasters. Disaster literacy could enable them to improve their resilience and reduce disaster risk. Education has a vital role in the health promotion of middle-aged women. Policymakers and health managers should be aware of the challenges facing elderly women as a vulnerable group in disasters and should develop plans to incorporate disaster health literacy for preparedness and prevention into the education of this group.
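The search string quoted in this abstract follows a common pattern: synonyms within a concept group are joined with OR, and the groups are joined with AND. A minimal Python sketch of that assembly is shown below; the helper name `build_query` is hypothetical, and the exact syntax each database accepts may differ.

```python
def build_query(groups):
    """Join quoted terms with OR inside each group, then join groups with AND."""
    clauses = []
    for terms in groups:
        quoted = " OR ".join('"' + term + '"' for term in terms)
        clauses.append("(" + quoted + ")")
    return " AND ".join(clauses)


# The three keyword groups listed in the abstract.
keyword_groups = [
    ["Women's health", "Geriatric health"],
    ["Health literacy", "Disaster health literacy", "Disaster prevention literacy",
     "Risk knowledge", "Knowledge management"],
    ["Disasters", "Risk", "Crises"],
]
query = build_query(keyword_groups)
print(query)
```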

Construction of Korean Wordnet "KorLex 1.5" (한국어 어휘의미망 "KorLex 1.5"의 구축)

  • Yoon, Ae-Sun;Hwang, Soon-Hee;Lee, Eun-Ryoung;Kwon, Hyuk-Chul
    • Journal of KIISE:Software and Applications / v.36 no.1 / pp.92-108 / 2009
  • The Princeton WordNet (PWN), developed over some 20 years since the mid-1980s, aimed to represent the mental lexicon inside the human mind. Its potential, applicability, and portability have been appreciated more in the fields of NLP and KE than in cognitive psychology. Semantic and knowledge processing is indispensable for obtaining useful information from human language in CMC and HCI environments. The PWN can provide such NLP-based systems with 'concrete' semantic units and their network. With the PWN as a reference, about 50 wordnets for different languages were developed during the last 10 years, and they enable a variety of multilingual processing applications. This paper describes the PWN-referenced Korean wordnet, KorLex 1.5, which was developed from 2004 to 2007 and currently contains about 130,000 synsets and 150,000 word senses for nouns, verbs, adjectives, adverbs, and classifiers.

Toward Generic, Immersive, and Collaborative Solutions to the Data Interoperability Problem which Target End-Users

  • Sanchez-Ruiz, Arturo;Umapathy, Karthikeyan;Hayes, Pat
    • Journal of Computing Science and Engineering / v.3 no.2 / pp.127-141 / 2009
  • In this paper, we describe our vision of a "Just-in-time" initiative to solve the Data Interoperability Problem (a.k.a. INTEROP). We provide an architectural overview of the initiative, which draws upon existing technologies to develop an immersive and collaborative approach. The approach aims to empower data stakeholders (e.g., data producers and data consumers) with integrated tools to interact and collaborate with each other while directly manipulating visual representations of their data in an immersive environment (e.g., implemented via Second Life). The semantics of these visual representations and the operations associated with the data are supported by ontologies defined using the Common Logic Framework (CL). Data operations gestured by the stakeholders, through their avatars, are translated to a variety of generated resources such as multi-language source code, visualizations, web pages, and web services. The generality of the approach is supported by a plug-in architecture which allows expert users to customize tasks such as data admission, data manipulation in the immersive world, and automatic generation of resources. The approach is designed to enable stakeholders from diverse domains to exchange data and generate new knowledge.

Development and Implementation of Training Program for Information System Design Using Material Requirements Planning

  • Yamazaki, Tomoaki;Yin, Rui;Kawaguchi, Seisuke;Hayasaka, Hirotatsu;Matsumoto, Toshiyuki;Ichikizaki, Osamu;Kanazawa, Takashi
    • Industrial Engineering and Management Systems / v.11 no.3 / pp.255-265 / 2012
  • Environments surrounding production sites have changed greatly in recent years. Accommodating environmental changes calls for the design and development of information systems that center on production lines. There is a need for a training program that teaches learners to understand the particulars of an operation and apply that knowledge to an information system. In this research, we used material requirements planning (MRP) as the subject for which basic skills are to be taught and developed an MRP exercise-based training program. The program is designed for 13 lectures of 90 minutes each, and it consists of MRP exercises, modeling methods to represent them, the use of a programming language for system development, and finally, evaluation of the exercises. Lecture materials are described in 505 lecture slides using Microsoft PowerPoint to allow visualization of topics through graphs and models. The developed training program was then delivered to 86 college students, and its results were measured through quizzes to verify educational effectiveness.

Frequency and Social Network Analysis of the Bible Data using Big Data Analytics Tools R (R을 이용한 성경 데이터의 빈도와 소셜 네트워크 분석)

  • Ban, ChaeHoon;Ha, JongSoo
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2018.10a / pp.93-96 / 2018
  • Big data analytics technology, which can store and analyze data and obtain new knowledge, has gained importance in many fields of society. Big data is emerging as an important issue in information and communication technology, and interest in the technology continues to rise. R, a tool that can analyze big data, is a language and environment for statistical information analysis. In this paper, we use R to analyze Bible data: we investigate the frequency distribution of the text and analyze the Bible through social network analysis.
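The two analyses named in this abstract, word frequency and a social network built from the text, can be sketched briefly. The paper itself uses R; the sketch below is written in Python instead, and its approach (counting lowercase word tokens, and linking two names whenever they appear in the same verse) is an assumption about how such an analysis might be set up, not the authors' procedure. The function names, the toy sentences, and the name list are hypothetical placeholders rather than Bible text.

```python
import re
from collections import Counter
from itertools import combinations


def word_frequencies(text):
    """Count lowercase word tokens in `text`."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cooccurrence_edges(verses, names):
    """Count, for each pair of names, the verses in which both appear."""
    edges = Counter()
    for verse in verses:
        tokens = set(re.findall(r"[a-z']+", verse.lower()))
        present = sorted(name for name in names if name.lower() in tokens)
        for a, b in combinations(present, 2):
            edges[(a, b)] += 1
    return edges


# Hypothetical toy sentences standing in for verses.
toy_verses = [
    "Abraham spoke with Sarah",
    "Sarah and Isaac went with Abraham",
    "Isaac met Rebekah",
]
toy_names = ["Abraham", "Sarah", "Isaac", "Rebekah"]

freqs = word_frequencies(" ".join(toy_verses))
edges = cooccurrence_edges(toy_verses, toy_names)
print(freqs.most_common(3))
print(edges.most_common())
```

The weighted edge list is the usual input for a graph library, where measures such as degree or centrality could then be computed.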

uCDSS: Development of an Intelligent System for Ubiquitous Healthcare

  • An, Hyeon-Sun;Kim, Gwan-Yu;Lee, Seung-Han;Choe, Si-Myeong;Jo, Man-Jae;Lee, Sang-Gyeong;Kim, Jin-Tae
    • Proceedings of the Korea Intelligent Information System Society Conference / 2005.11a / pp.425-428 / 2005
  • Healthcare is a research field well suited to applying recent ubiquitous computing techniques. As a test system, we developed a CDSS (Clinical Decision Support System) that runs in a ubiquitous environment, called 'uCDSS'. The uCDSS is a core system of ubiquitous healthcare and is composed of several 'uMLMs' (Ubiquitous Medical Logic Modules). The uMLMs, which are based on classes in the C# programming language, can be reused in the development of a CDSS or of another EHR system running in the .NET environment. As a test system, we developed a DM (Diabetes Mellitus) knowledge system using ASP.NET. This system shows the potential of C# class-based uMLMs and their extensibility to any .NET development project.

A Computation Study of Prosodic Structures of Korean for Speech Recognition and Synthesis:Predicting Phonological Boundaries (음성인식.합성을 위한 한국어 운율단위 음운론의 계산적 연구:음운단위에 따른 경계의 발견)

  • Lee, Chan-Do
    • The Transactions of the Korea Information Processing Society / v.4 no.1 / pp.280-287 / 1997
  • Introducing phonological knowledge and prosodic information into speech recognition and synthesis systems is very important for building successful spoken language systems. First, related work in computational phonology is reviewed, and theoretical and experimental studies of prosodic structures and boundaries in Korean are summarized. The main focus of this study is to decide which prosodic phrasing trained on a simple recurrent network. The results show information other than phonetic features. This method can be combined with other useful information to predict the boundaries more accurately and to help segmentation, both of which are vital for successful speech recognition and synthesis systems.

A Study on the Types of the Associative Relationship in Thesauri (시소러스의 연관관계 유형에 관한 연구)

  • Jun, Mal-Suk
    • Journal of Information Management / v.29 no.1 / pp.20-39 / 1998
  • A thesaurus, which consists of terms and the relationships between them, is used to index documents. When an index term is selected, retrieval performance in an information retrieval system can be improved by using the relationships between terms in the thesaurus. Recently, the use of thesauri has been extended from information retrieval to language and knowledge engineering, but term relationships in a thesaurus are still represented simply as equivalence, hierarchy, and association. The associative relationship in particular is vague in its definition and range compared with the other two relationships, equivalence and hierarchy, so terms selected through the associative relationship are not well controlled. This study examines the relationships in existing thesauri, especially the types and ranges of the associative relationship, and suggests adequate types of associative relationship.

Citing Behavior of Korean Scientists on Foreign Journals in KSCD (KSCD를 활용한 국내 과학기술자의 해외 학술지 인용행태 연구)

  • Kim, Byung-Kyu;Kang, Mu-Yeong;Choi, Seon-Heui;Kim, Soon-Young;You, Beom-Jong;Shin, Jae-Do
    • Journal of the Korean Society for Information Management / v.28 no.2 / pp.117-133 / 2011
  • There has been little comprehensive research on the impact of foreign journals on Korean scientists, mainly because no extensive citation index database of domestic journals was available for analysis. The Korea Institute of Science and Technology Information (KISTI) built the Korea Science Citation Database (KSCD) and has provided the Korea Science Citation Index (KSCI) and Korea Journal Citation Reports (KJCR) services. In this article, the citing behavior of Korean scientists with respect to foreign journals is examined using KSCD, which covers Korean core journals. The research covers (1) the types of foreign documents cited, (2) citation counts of foreign journals by subject and the ratio of citations to other disciplines, (3) the language and country of the foreign documents cited, (4) the publishers of the journals and whether the journals are listed in global citation index services, and (5) the current state of subscriptions to foreign electronic journals in Korea. The results should be useful for establishing strategies for licensing foreign electronic journals and for information services. The immediacy citation rate (average 1.46%), peak time (average 3.9 years), and half-life (average 8 years) of cited foreign journals were identified. It was also found that Korean scientists tend to cite journals covered in SCI(E) or SCOPUS, and that 90% of cited foreign journals have been licensed by institutions in Korea.
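The three citation-age indicators reported in this abstract can be illustrated with a short Python sketch. The definitions used here are common bibliometric conventions and are assumptions, since the abstract does not define them: the immediacy citation rate is taken as the share of citations to items published in the citing year (age 0), the peak time as the age with the most citations, and the half-life as the first age at which the cumulative count reaches half of all citations, in whole years. The function name and the toy counts are hypothetical and are not data from the study.

```python
def citation_age_metrics(counts_by_age):
    """Return (immediacy_rate, peak_age, half_life) from citation counts by age.

    counts_by_age[a] is the number of citations to items that were `a` years
    old when cited, with age 0 meaning the citing year itself.
    """
    total = sum(counts_by_age)
    if total == 0:
        raise ValueError("no citations to analyze")

    immediacy_rate = counts_by_age[0] / total
    peak_age = max(range(len(counts_by_age)), key=lambda a: counts_by_age[a])

    # Half-life: first age at which cumulative citations reach half the total.
    cumulative = 0
    half_life = None
    for age, count in enumerate(counts_by_age):
        cumulative += count
        if cumulative * 2 >= total:
            half_life = age
            break
    return immediacy_rate, peak_age, half_life


# Hypothetical toy distribution of citations by cited-item age (years 0 to 7).
toy_counts = [2, 10, 20, 30, 18, 10, 6, 4]
rate, peak, half = citation_age_metrics(toy_counts)
print(rate, peak, half)
```

Published half-life figures are often interpolated to fractional years; the whole-year version above keeps the logic easy to follow.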