Search | Korea Science

Development of Information Extraction System from Multi Source Unstructured Documents for Knowledge Base Expansion (지식베이스 확장을 위한 멀티소스 비정형 문서에서의 정보 추출 시스템의 개발)

Choi, Hyunseung;Kim, Mintae;Kim, Wooju;Shin, Dongwook;Lee, Yong Hun
- Journal of Intelligence and Information Systems / v.24 no.4 / pp.111-136 / 2018
In this paper, we propose a methodology for extracting answer information for queries from various types of unstructured documents collected from multiple sources on the web, in order to expand a knowledge base. The proposed methodology consists of the following steps. 1) Collect relevant documents from Wikipedia, Naver encyclopedia, and Naver news for queries separated into "subject-predicate" form, and classify the suitable documents. 2) Determine whether each sentence is suitable for information extraction and derive a confidence score. 3) Based on the predicate feature, extract the information from the suitable sentences and derive the overall confidence of the extraction result. To evaluate the performance of the information extraction system, we selected 400 queries from SK Telecom's artificial intelligence speaker. The proposed system showed higher performance indices than the baseline model. The contribution of this study is a sequence tagging model based on bi-directional LSTM-CRF that uses the predicate feature of the query; with it, we developed a robust model that maintains high recall even on various types of unstructured documents collected from multiple sources. Information extraction for knowledge base expansion must take into account the heterogeneous characteristics of source-specific document types. Compared with the baseline model, the proposed methodology extracted information effectively from various types of unstructured documents. Previous research has the limitation that performance is poor when extracting information from document types that differ from the training data.
In addition, by predicting the suitability of documents and sentences for information extraction before the extraction step, this study prevents unnecessary extraction attempts on documents that do not contain the answer. It is meaningful that we provide a method by which precision can be maintained even in an actual web environment. Information extraction for knowledge base expansion targets unstructured documents on the real web, so it cannot be guaranteed that a document contains the correct answer. When question answering is performed on the real web, previous machine reading comprehension studies show low precision because they frequently attempt to extract an answer even from documents that contain no correct answer. The policy of predicting the suitability of documents and sentences for information extraction is meaningful in that it helps maintain extraction performance even in a real web environment. The limitations of this study and future research directions are as follows. The first concerns data preprocessing. In this study, the unit of knowledge extraction is determined through morphological analysis based on the open-source KoNLPy Python package, and extraction can fail when the morphological analysis is incorrect. To improve information extraction results, a more advanced morpheme analyzer needs to be developed. The second is entity ambiguity. The information extraction system in this study cannot distinguish between different entities that share the same name. If several people with the same name appear in the news, the system may not extract information about the person intended by the query.
Future research needs measures to distinguish people who share the same name. The third concerns the evaluation query data. In this study, we selected 400 user queries collected from SK Telecom's interactive artificial intelligence speaker to evaluate the performance of the information extraction system. We built the evaluation data set from 2,800 documents (400 questions * 7 articles per question: 1 Wikipedia, 3 Naver encyclopedia, 3 Naver news) by judging whether each contains a correct answer. To ensure the external validity of the study, it is desirable to evaluate the system on more queries, but this is a costly activity that must be done manually. Future research needs to evaluate the system on more queries. It is also necessary to develop a Korean benchmark data set for information extraction from multi-source web documents, so that results can be evaluated more objectively.
https://doi.org/10.13088/jiis.2018.24.4.111
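The three-step flow described in the abstract above (filter documents, filter sentences, then extract with an overall confidence) can be sketched roughly as follows. This is only an illustration, not the authors' system: the scoring functions `doc_score`, `sent_score`, and `tagger` are hypothetical stand-ins for the paper's classifiers and its bi-directional LSTM-CRF tagger, the thresholds are arbitrary, and combining the confidences by multiplication is an assumption, since the abstract does not say how the overall confidence is derived.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Extraction:
    answer: str
    confidence: float


def extract_answer(
    documents: List[List[str]],                  # each document is a list of sentences
    doc_score: Callable[[List[str]], float],     # hypothetical document suitability scorer
    sent_score: Callable[[str], float],          # hypothetical sentence suitability scorer
    tagger: Callable[[str], Optional[Tuple[str, float]]],  # hypothetical extractor
    doc_threshold: float = 0.5,                  # assumed cut-off values
    sent_threshold: float = 0.5,
) -> Optional[Extraction]:
    """Return the highest-confidence extraction, or None if nothing is suitable."""
    best: Optional[Extraction] = None
    for doc in documents:
        d = doc_score(doc)
        if d < doc_threshold:
            continue  # skip documents unlikely to contain the answer
        for sentence in doc:
            s = sent_score(sentence)
            if s < sent_threshold:
                continue  # skip sentences unsuitable for extraction
            result = tagger(sentence)
            if result is None:
                continue
            answer, t = result
            overall = d * s * t  # assumption: product of the three confidences
            if best is None or overall > best.confidence:
                best = Extraction(answer, overall)
    return best
```

Returning `None` when no document or sentence passes the suitability checks mirrors the abstract's point that skipping unsuitable inputs avoids extraction attempts where no answer exists.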

The Effect of Stress Among Middle School Students and the Effect of Motive on Their Addiction to the Internet (중학생의 스트레스와 인터넷 이용동기가 인터넷 중독에 미치는 영향)

Park, Hea-Young;Lee, Eun-Hee;Park, Sang-Mi
- Journal of Families and Better Life / v.27 no.6 / pp.65-82 / 2009
This research aimed to determine the effect of stress among middle school students on their addiction to the Internet. The target was a group of male students who had a high probability of becoming addicted to the Internet while playing c/t games. The study distributed 357 questionnaires and used 340 of them, discarding 17 that were considered inadequate. The results are as follows. First, the stress suffered by the students comprised several sub-factors: stress from their families, from conflicts with their teachers, from the living environment, from current schoolwork and future course in college, from insecurity over their physical appearance, from bullying by other students, and from relationships with friends. Among these factors, stress caused by conflicts with teachers and family was cited most frequently, while stress from friends was cited least. The motives for using the Internet also comprised several factors: a form of diversion, a way to communicate with others, a means of coping with loneliness, a source of news and information, a way of passing time, a kind of habit, and others. Among these motives, passing time and indulging a habit were cited most, followed by news and information search and diversion. Second, the sub-factors of Internet addiction were: formation of tolerance, health issues, occurrence of problems related to daily life, satisfaction or a pleasant sensation, withdrawal, cover-up of Internet use, formation of virtual interpersonal relationships, and others. Among these, the formation of tolerance scored highest, followed by health issues, daily life, and problems related to daily life.
Third, in terms of the effects of stress on the motives for using the Internet, the research found that the more the students felt stressed by conflicts with their teachers and family, the more they tended to use the Internet to communicate with others, to cope with loneliness, to obtain news and information, to pass time, and to indulge a habit. Also, the more they felt stressed by the living environment, the more they tended to use the Internet to communicate with others, to cope with loneliness, and to obtain news and information. The more they felt stressed by their schoolwork and future course in college, the more they tended to use the Internet as a form of diversion and to obtain news and information. The more they felt stressed by insecurity over their physical appearance and by being bullied, the more they tended to use the Internet to cope with loneliness. Fourth, as for the effect of several variables on student addiction to the Internet, the study found that the more students felt stressed by their living environment, by schoolwork and future course in college, by their physical appearance, and by bullying from other students, the more they used the Internet as a form of diversion, a communication tool, and a means of passing time or indulging a habit. The study also found that the more the students used the computer and the Internet, the higher their probability of becoming addicted to the Internet.

A User-centered Classification Framework for Digital Service Innovation : Case for Elderly Care Service

Lim, Hong-Tak;Han, Jeong-Won
- International Journal of Contents / v.14 no.1 / pp.7-11 / 2018
Digital technology has been changing the everyday life of ordinary people as well as the structure of world industry. Elderly care service is also changing under the unavoidable impact of the torrent of digital technologies. There are numerous reports and news items about digital technologies increasing the efficiency and effectiveness of care service, yet a systematic understanding of the sources of such improvement is lacking. This study aims to present a new classification framework for digital elderly care service innovation that fully utilizes the power of digital technologies, drawing on insights from innovation studies and service studies. First, four features of digital technologies are identified as sources of new value in service innovation. The co-creation of value by users and producers in service and technology development is discussed to illuminate users' contributions to service innovation. Communicating needs and ideas to producers and applying new technologies in everyday life are identified as the sources of new value that can be attributed to the elderly. Customization, along with efficiency gains, is the key to digital elderly care service innovation. The classification framework thus incorporates the needs of the elderly as one axis of criteria in the conventional technology-centered framework. The new classification framework would help give due weight to user-driven or demand-driven innovation in elderly care service R&D activities.
https://doi.org/10.5392/IJoC.2018.14.1.007

A Study on the Characteristics of Fashion in Wanna-Be Phenomenon (워너비 현상 (Wanna-Be Phenomenon)에 나타난 패션의 특정 연구)

Yum Hae-Jung;Kim Ji-Seon;Kim Eun-Jung;Park So-Hyun
- Korean Journal of Human Ecology / v.9 no.2 / pp.53-63 / 2006
The purpose of this study is to analyze the background and characteristics of fashion in the wanna-be phenomenon. The primary sources of data were recent books, news reports, articles from various mass media, and fashion internet sites. The results of this study can be summarized as follows. First, the background of the wanna-be phenomenon can be divided into three parts: the change to an entertainment society, the increase of mass consumption, and the increase of mutual communication between stars and fans. Second, the function of fashion in the wanna-be phenomenon can be divided into the following: self-expression, a guidebook to trendy lifestyle, and play for pleasure. Third, fashion style in the wanna-be phenomenon can be divided into the following: chic & gorgeous style, sexy casual & chav style, and bohemian mix & match style.

Understanding recurrent neural network for texts using English-Korean corpora

Lee, Hagyeong;Song, Jongwoo
- Communications for Statistical Applications and Methods / v.27 no.3 / pp.313-326 / 2020
Deep Learning is the most important key to the development of Artificial Intelligence (AI). There are several distinct neural network architectures, such as MLP, CNN, and RNN. Among them, we try to understand one of the main architectures, the Recurrent Neural Network (RNN), which differs from other networks in handling sequential data, including time series and texts. As one of the main recent tasks in Natural Language Processing (NLP), we consider Neural Machine Translation (NMT) using RNNs. We also summarize the fundamental structures of recurrent networks and some topics on representing natural words as reasonable numeric vectors. We organize the topics so as to explain the estimation procedure, from representing input source sequences to predicting target translated sequences. In addition, we apply multiple translation models with Gated Recurrent Units (GRUs) in Keras to English-Korean sentences comprising about 26,000 sentence pairs in total from two different corpora, colloquialism and news. We verified some crucial factors that influence the quality of training. We found that loss decreases with more recurrent dimensions and with a bidirectional RNN in the encoder when dealing with short sequences. We also computed BLEU scores, the main measure of translation performance, and compared them with the scores of Google Translate on the same test sentences. We summarize some difficulties in training a proper translation model and in dealing with the Korean language. The use of Keras in Python for all tasks, from processing raw texts to evaluating the translation model, also allows us to include some useful functions and vocabulary libraries.
https://doi.org/10.29220/CSAM.2020.27.3.313
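Since the abstract above relies on BLEU to measure translation quality, a simplified sentence-level version may help show what the score captures. The sketch below is not the paper's evaluation code: it assumes whitespace-tokenized input, a single reference, no smoothing (any n-gram order with zero matches yields a score of 0), and the function names are chosen only for illustration.

```python
import math
from collections import Counter
from typing import List


def ngram_counts(tokens: List[str], n: int) -> Counter:
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate: List[str], reference: List[str], max_n: int = 4) -> float:
    """Unsmoothed single-reference BLEU with uniform n-gram weights."""
    if not candidate:
        return 0.0
    log_precision = 0.0
    for n in range(1, max_n + 1):
        cand = ngram_counts(candidate, n)
        ref = ngram_counts(reference, n)
        total = sum(cand.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
        if total == 0 or clipped == 0:
            return 0.0  # no smoothing: a zero precision zeroes the score
        log_precision += math.log(clipped / total) / max_n
    # Brevity penalty punishes candidates shorter than the reference.
    if len(candidate) > len(reference):
        brevity = 1.0
    else:
        brevity = math.exp(1 - len(reference) / len(candidate))
    return brevity * math.exp(log_precision)
```

A candidate identical to its reference scores 1, while a candidate that is a correct but truncated prefix is reduced only by the brevity penalty.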

Trend Analysis of the Agricultural Industry Based on Text Analytics

Choi, Solsaem;Kim, Junhwan;Nam, Seungju
- Agribusiness and Information Management / v.11 no.1 / pp.1-9 / 2019
This research proposes a methodology for analyzing current trends in agriculture, which is directly connected to the survival of the nation, and uses this methodology to identify the agricultural trends of Korea. Based on the relationship between three types of data (policy reports, academic articles, and news articles), the research derives the major issues contained in each data source through LDA, the representative topic modeling method. By comparing and analyzing the LDA results derived from each data source, this study identifies implications regarding the current agricultural trends of Korea. This methodology can also be used to analyze trends in industries other than agriculture. Furthermore, it can serve as a basic resource for considering potential future areas through insight into the current situation. database of the profitability of a total of 180 crop types by analyzing Rural Development Administration's survey of agricultural products income of 115 crop types, small land profitability index survey of 53 crop types, and Statistics Korea's survey of production costs of 12 crop types. Furthermore, this research presents the result and developmental process of a web-based crop introduction decision support system that provides overseas cases of new crop introduction support programs, as well as databases of outstanding business success cases of each crop type researched by agricultural institutions.
https://doi.org/10.14771/AIM.11.1.1
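To make the LDA step mentioned in the abstract above more concrete, here is a small collapsed Gibbs sampling sketch in plain Python. It is an illustration under stated assumptions, not the study's implementation: the abstract does not say which LDA software or settings were used, and the function name, hyperparameters (`alpha`, `beta`), and iteration count here are hypothetical. In the setting the abstract describes, something like this would be run once per data source and the resulting top words compared.

```python
import random
from collections import Counter
from typing import List, Tuple


def lda_gibbs(
    docs: List[List[str]],
    n_topics: int = 2,
    alpha: float = 0.1,   # assumed document-topic prior
    beta: float = 0.01,   # assumed topic-word prior
    n_iter: int = 200,
    seed: int = 0,
) -> Tuple[List[List[int]], List[List[str]]]:
    """Return per-document topic counts and the top words of each topic."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [Counter() for _ in range(n_topics)]
    topic_total = [0] * n_topics
    assignments = []
    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(n_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(zs)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    top_words = [
        [w for w, c in topic_word[k].most_common(3) if c > 0]
        for k in range(n_topics)
    ]
    return doc_topic, top_words
```

The returned `doc_topic` counts indicate how strongly each document leans toward each topic, and `top_words` gives a rough label for each topic.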

The Influence of Celebrity and Celebrity Fashion on Contemporary Fashion (셀러브리티와 셀러브리티 패션이 현대 패션에 미친 영향)

Kim, So-Ra;Lee, Keum-Hee
- The Research Journal of the Costume Culture / v.19 no.1 / pp.54-70 / 2011
The purpose of this study is to examine the influence of celebrity and celebrity fashion on contemporary fashion. As study methods, a literature review of books and theses concerning fashion, culture, and history was used for the theoretical background, and visual data from magazines, newspapers, and the internet were used for the empirical study. The results of this study are as follows. First, the celebrity is a figure who displays an attractive appearance and status through major cultural contents on the basis of various media and visual culture, and has secured a solid foothold as a source of fashion and an indispensable factor in the fashion industry. Second, celebrity fashion, a creation of mass media, sets powerful fashion trends along with the media and increasingly plays major roles in society and culture. Third, the combination of celebrity and fashion has come into the brightest spotlight as a serious business today, and brings about a tremendous and extensive industrial ripple effect.

eBay: Smart Entry Strategy into the Korean Market Through M&A and its Post-Merger Integration

Park, Young-Eun;Allui, Alawiya
- Journal of Distribution Science / v.17 no.1 / pp.47-56 / 2019
Purpose - This case study illustrates the story of eBay Korea, which owns the two leading Korean open market companies, AUCTION and Gmarket. The main focus is how eBay, an American multinational corporation, took over the top Korean domestic companies one by one and then merged these two giants through well-developed post-merger integration. Research design, data, and methodology - This case draws on various secondary sources such as periodicals, annual reviews, magazines, news articles, commentaries, and some interview materials related to 'eBay Korea', as well as industry sources on condition of anonymity, together with a critical review of existing studies on these topics. Results - The findings of this study show that the merger and acquisition of two market leaders in Korea is the only successful case in Asian markets. eBay's choice of entry mode is appropriate considering the timing, synergy, and efficiency gained by sharing resources. Conclusions - This study examines the successful entry and settlement process of a foreign multinational company through mergers and acquisitions in the Korean market. It would be valuable for studies of international business and of global entry or distribution strategy in e-commerce and open markets dealing with M&A and post-merger integration.
https://doi.org/10.15722/jds.17.1.201901.47

Microblogging Sentiment Investor, Return and Volatility in the COVID-19 Era: Indonesian Stock Exchange

FARISKA, Putri;NUGRAHA, Nugraha;PUTERA, Ika;ROHANDI, Mochamad Malik Akbar
- The Journal of Asian Finance, Economics and Business / v.8 no.3 / pp.61-67 / 2021
The COVID-19 pandemic caused the most extensive economic shocks the world has experienced in decades. Maintaining financial performance and economic stability is essential during the pandemic period. In these conditions, where movement is severely restricted, media consumption is considered to be increasing. Social media is one of the online media used by the public as a source of information and as a place to express sentiment, including by individual investors in the capital market. Twitter is one of the microblogging social media platforms used by individual investors to share their opinions and get information. This study aims to determine whether investor microblogging sentiment can predict the capital market during a pandemic. To analyze investor microblogging sentiment, we classified sentiment into positive, negative, and neutral levels using a Python text mining algorithm and Naïve Bayesian text classification, from November 2019 to November 2020. The study covered 68 companies listed on the Indonesia Stock Exchange. Vector Autoregression and Impulse Response analysis were applied to capture short- and long-term impacts along with causal relationships. We found that investor microblogging sentiment has a significant impact on stock returns and volatility, and vice versa. Also, the response to shocks is convergent, and microblogging investors in Indonesia are categorized as "news-watcher" investors.
https://doi.org/10.13106/jafeb.2021.vol8.no3.0061
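The Naïve Bayesian sentiment classification mentioned in the abstract above can be illustrated with a minimal multinomial Naive Bayes classifier. This is a sketch only, not the study's code: the abstract does not describe the features or library used, so whitespace tokenization, add-one (Laplace) smoothing, and the class name `NaiveBayesSentiment` are all assumptions, and any training text passed to it would be the user's own labeled data.

```python
import math
from collections import Counter, defaultdict
from typing import Iterable, Optional


class NaiveBayesSentiment:
    """Multinomial Naive Bayes over whitespace tokens with add-one smoothing."""

    def fit(self, texts: Iterable[str], labels: Iterable[str]) -> "NaiveBayesSentiment":
        self.class_counts = Counter()
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = text.lower().split()
            self.class_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text: str) -> Optional[str]:
        tokens = text.lower().split()
        n_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(count / n_docs)
            total = sum(self.word_counts[label].values())
            for token in tokens:
                numerator = self.word_counts[label][token] + 1
                score += math.log(numerator / (total + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

Working in log space avoids numerical underflow when many token probabilities are multiplied, and the add-one smoothing keeps unseen words from zeroing out a class.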

Policy Suggestions to Improve Patient Access to New Drugs in Korea (환자의 신약 접근성 강화 정책 제안)

Choi, Yoona;Lee, Howard
- Korean Journal of Clinical Pharmacy / v.31 no.1 / pp.1-11 / 2021
Objective: This study aimed to review and assess the effectiveness of the policies and regulations that have governed new drug access in Korea, and to propose policies to enhance patient access to drugs, particularly innovative new medicines. Methods: We approached drug access issues from two perspectives: approval lag (or availability) and reimbursement lag (or affordability). The issues were identified and evaluated through a review of the literature, public documents, reports published by government agencies and private organizations, and news articles. Results: To shorten approval lag, it is recommended to hire and train more reviewers at the Ministry of Food and Drug Safety. Increasing user fees to a realistic level can facilitate this process. To reduce reimbursement lag, a flexible incremental cost-effectiveness ratio threshold, alternative cost-effectiveness evaluation, and the establishment of a funding source other than the national health insurance are identified as areas for improvement. Conclusion: The current policies and regulations should be supplemented by new systems to substantially improve patient access to new drugs and thereby promote national public health.
https://doi.org/10.24304/kjcp.2021.31.1.1
