• 제목/요약/키워드: 선정과 평가

검색결과 7,375건 처리시간 0.037초

Development of Information Extraction System from Multi Source Unstructured Documents for Knowledge Base Expansion (지식베이스 확장을 위한 멀티소스 비정형 문서에서의 정보 추출 시스템의 개발)

  • Choi, Hyunseung;Kim, Mintae;Kim, Wooju;Shin, Dongwook;Lee, Yong Hun
    • Journal of Intelligence and Information Systems
    • /
    • 제24권4호
    • /
    • pp.111-136
    • /
    • 2018
  • In this paper, we propose a methodology to extract answer information about queries from various types of unstructured documents collected from multi-sources existing on web in order to expand knowledge base. The proposed methodology is divided into the following steps. 1) Collect relevant documents from Wikipedia, Naver encyclopedia, and Naver news sources for "subject-predicate" separated queries and classify the proper documents. 2) Determine whether the sentence is suitable for extracting information and derive the confidence. 3) Based on the predicate feature, extract the information in the proper sentence and derive the overall confidence of the information extraction result. In order to evaluate the performance of the information extraction system, we selected 400 queries from the artificial intelligence speaker of SK-Telecom. Compared with the baseline model, it is confirmed that it shows higher performance index than the existing model. The contribution of this study is that we develop a sequence tagging model based on bi-directional LSTM-CRF using the predicate feature of the query, with this we developed a robust model that can maintain high recall performance even in various types of unstructured documents collected from multiple sources. The problem of information extraction for knowledge base extension should take into account heterogeneous characteristics of source-specific document types. The proposed methodology proved to extract information effectively from various types of unstructured documents compared to the baseline model. There is a limitation in previous research that the performance is poor when extracting information about the document type that is different from the training data. In addition, this study can prevent unnecessary information extraction attempts from the documents that do not include the answer information through the process for predicting the suitability of information extraction of documents and sentences before the information extraction step. It is meaningful that we provided a method that precision performance can be maintained even in actual web environment. The information extraction problem for the knowledge base expansion has the characteristic that it can not guarantee whether the document includes the correct answer because it is aimed at the unstructured document existing in the real web. When the question answering is performed on a real web, previous machine reading comprehension studies has a limitation that it shows a low level of precision because it frequently attempts to extract an answer even in a document in which there is no correct answer. The policy that predicts the suitability of document and sentence information extraction is meaningful in that it contributes to maintaining the performance of information extraction even in real web environment. The limitations of this study and future research directions are as follows. First, it is a problem related to data preprocessing. In this study, the unit of knowledge extraction is classified through the morphological analysis based on the open source Konlpy python package, and the information extraction result can be improperly performed because morphological analysis is not performed properly. To enhance the performance of information extraction results, it is necessary to develop an advanced morpheme analyzer. Second, it is a problem of entity ambiguity. The information extraction system of this study can not distinguish the same name that has different intention. If several people with the same name appear in the news, the system may not extract information about the intended query. In future research, it is necessary to take measures to identify the person with the same name. Third, it is a problem of evaluation query data. In this study, we selected 400 of user queries collected from SK Telecom 's interactive artificial intelligent speaker to evaluate the performance of the information extraction system. n this study, we developed evaluation data set using 800 documents (400 questions * 7 articles per question (1 Wikipedia, 3 Naver encyclopedia, 3 Naver news) by judging whether a correct answer is included or not. To ensure the external validity of the study, it is desirable to use more queries to determine the performance of the system. This is a costly activity that must be done manually. Future research needs to evaluate the system for more queries. It is also necessary to develop a Korean benchmark data set of information extraction system for queries from multi-source web documents to build an environment that can evaluate the results more objectively.

A Study on the Psychosocial Characteristics and Quality of Life in Functional Gastrointestinal Disorders (기능성위장질환 환자들의 정신사회적 특성 및 삶의 질의 관계에 관한 연구)

  • Kim, So-Won;Jang, Seung-Ho;Ryu, Han-Seung;Choi, Suck-Chei;Rho, Seung-Ho;Lee, Sang-Yeol
    • Korean Journal of Psychosomatic Medicine
    • /
    • 제27권1호
    • /
    • pp.25-34
    • /
    • 2019
  • Objectives : This study aimed to compare the psychosocial characteristics among patients with functional gastrointestinal disorder (FGID), adults with functional gastrointestinal symptoms, and normal control group and investigate factors related to quality of life (QoL) of FGID patients. Methods : 65 patients diagnosed with FGID were selected. 79 adults were selected as normal control group based on the Rome III diagnostic criteria, and 88 adults who showed functional gastrointestinal symptoms were selected as "FGID positive group". Demographic factors were investigated. Psychosocial factors were evaluated using the Korean-Beck Depression Inventory-II, Korean-Beck Anxiety Inventory, Korean-Childhood Trauma Questionnaire, Multi-dimensional Scale of Perceived Social Support, Connor-Davidson Resilience Scale and WHO Quality of Life Assessment Instrument Brief Form. A one-way ANOVA was used to compare differences among groups. Pearson correlation test was used to analyze correlations between QoL and psychosocial factors in patients with FGID. Results : There were group differences in the education level. Depression (F=29.012, p<0.001), anxiety (F=27.954, p<0.001) and Childhood trauma (F=7.748, p<0.001) were significantly higher in FGID patient group than in both FGID-positive and normal control group. Social support (F=5,123, p<0.001), Resilience (F=9.623, p<0.001) and QoL (F=35.991, p<0.001) were significantly lower in the FGID patient group than in others. QoL of FGID patients showed a positive correlation with resilience (r=0.475, p<0.01), and showed a negative correlation with depression (r=-0.641, p<0.01), anxiety (r=-0.641, p<0.01), and childhood trauma (r=-0.278, p<0.05). Conclusions : FGID patients have distinctive psychosocial factors compared to the both FGID-positive and normal control group. Therefore, the active interventions for psychosocial factors are required in the treatment of patients with FGID.

A Study on Management of Records for Accountability of University student body's autonomy activity - Focused on Myongji University's student body - (대학 총학생회 자치활동의 설명책임성을 위한 기록관리 방안 연구 - 명지대학교 총학생회를 중심으로 -)

  • Lee, Yu Bin;Lee, Seung Hwi
    • The Korean Journal of Archival Studies
    • /
    • 제29호
    • /
    • pp.175-223
    • /
    • 2011
  • A university is an organization charged with publicity and has accountability to the community for the operating process. Students account for a majority of members in a university. In universities, numerous creatures are pouring out every year and university students are major producers of these records. However, roles and functions of university students producing enormous amount of records as main agents of universities and focused concentration on produced records have not been made yet. It is reality that from the archival point of view, the importance of produced records of which main agents are university students has been relatively underestimated. In this background, this study attempted approach in archival point of view on records produced by university students, main agents. There are various types of records that university students produce such as records produced in the process of research and teaching as well as records produced in the process of various autonomy activities like clubs, students' associations. This study especially focused on university student autonomy activity process and placed emphasis on accountability securing measures on autonomy activity process of university students. To secure accountability of activities, records management should be based. Therefore, as a way to ensure accountability of unversity students autonomy activity, we tried to present records management systematization and records utilization measures. For this, a student body, a university student autonomy organization was analyzed and a student body of Myongji University Humanities Campus was selected as a specific target. First, to identify records management status, activities and organization and functions of the student body, we conducted an interview with the president of the student body. Through this, we analyzed the activities of the university student body and examined the necessity of accountability accordingly. Also, we derived the types and characteristics of records to be produced at each stage by analyzing the organization and functions of the student body of Myongji University. Like this, after deriving the types of production records according to the necessity, organization and functions of accountability and activities of the student body, we analyzed records management status of the present student body. First, to identify the general process status of activities of the student body, we analyzed activity process by stage of the student body of Myongji University. And we analyzed records management method of the student body and responsibility principal and conducted real condition analysis. Through this analysis, we presented the measures to ensure accountability of a university student body in three categories such as systematization of records management process, establishment of records management infrastructure, accountability guarantee measures. This study discussed accountability on society by analyzing activities and functions of a student body, targeting a student body, an autonomy organization of university students. And as a measure to secure accountability of a student body, we proposed a model for records management environment settlement. But in terms that a student body is an organization operated in one year basis, there is a limit that records management environment is hard to settle. This study pointed out this limit and was to provide clues when more active researches were carried out in the field of student records management in the future through presentation of student body records management model. Also, it is expected that the analysis results derived from this research will have significance in terms of school history arrangement and conservation.

Analysis of promising countries for export using parametric and non-parametric methods based on ERGM: Focusing on the case of information communication and home appliance industries (ERGM 기반의 모수적 및 비모수적 방법을 활용한 수출 유망국가 분석: 정보통신 및 가전 산업 사례를 중심으로)

  • Jun, Seung-pyo;Seo, Jinny;Yoo, Jae-Young
    • Journal of Intelligence and Information Systems
    • /
    • 제28권1호
    • /
    • pp.175-196
    • /
    • 2022
  • Information and communication and home appliance industries, which were one of South Korea's main industries, are gradually losing their export share as their export competitiveness is weakening. This study objectively analyzed export competitiveness and suggested export-promising countries in order to help South Korea's information communication and home appliance industries improve exports. In this study, network properties, centrality, and structural hole analysis were performed during network analysis to evaluate export competitiveness. In order to select promising export countries, we proposed a new variable that can take into account the characteristics of an already established International Trade Network (ITN), that is, the Global Value Chain (GVC), in addition to the existing economic factors. The conditional log-odds for individual links derived from the Exponential Random Graph Model (ERGM) in the analysis of the cross-border trade network were assumed as a proxy variable that can indicate the export potential. In consideration of the possibility of ERGM linkage, a parametric approach and a non-parametric approach were used to recommend export-promising countries, respectively. In the parametric method, a regression analysis model was developed to predict the export value of the information and communication and home appliance industries in South Korea by additionally considering the link-specific characteristics of the network derived from the ERGM to the existing economic factors. Also, in the non-parametric approach, an abnormality detection algorithm based on the clustering method was used, and a promising export country was proposed as a method of finding outliers that deviate from two peers. According to the research results, the structural characteristic of the export network of the industry was a network with high transferability. Also, according to the centrality analysis result, South Korea's influence on exports was weak compared to its size, and the structural hole analysis result showed that export efficiency was weak. According to the model for recommending promising exporting countries proposed by this study, in parametric analysis, Iran, Ireland, North Macedonia, Angola, and Pakistan were promising exporting countries, and in nonparametric analysis, Qatar, Luxembourg, Ireland, North Macedonia and Pakistan were analyzed as promising exporting countries. There were differences in some countries in the two models. The results of this study revealed that the export competitiveness of South Korea's information and communication and home appliance industries in GVC was not high compared to the size of exports, and thus showed that exports could be further reduced. In addition, this study is meaningful in that it proposed a method to find promising export countries by considering GVC networks with other countries as a way to increase export competitiveness. This study showed that, from a policy point of view, the international trade network of the information communication and home appliance industries has an important mutual relationship, and although transferability is high, it may not be easily expanded to a three-party relationship. In addition, it was confirmed that South Korea's export competitiveness or status was lower than the export size ranking. This paper suggested that in order to improve the low out-degree centrality, it is necessary to increase exports to Italy or Poland, which had significantly higher in-degrees. In addition, we argued that in order to improve the centrality of out-closeness, it is necessary to increase exports to countries with particularly high in-closeness. In particular, it was analyzed that Morocco, UAE, Argentina, Russia, and Canada should pay attention as export countries. This study also provided practical implications for companies expecting to expand exports. The results of this study argue that companies expecting export expansion need to pay attention to countries with a relatively high potential for export expansion compared to the existing export volume by country. In particular, for companies that export daily necessities, countries that should pay attention to the population are presented, and for companies that export high-end or durable products, countries with high GDP, or purchasing power, relatively low exports are presented. Since the process and results of this study can be easily extended and applied to other industries, it is also expected to develop services that utilize the results of this study in the public sector.

Preservation of World Records Heritage in Korea and Further Registry (한국의 세계기록유산 보존 현황 및 과제)

  • Kim, Sung-Soo
    • Journal of Korean Society of Archives and Records Management
    • /
    • 제5권2호
    • /
    • pp.27-48
    • /
    • 2005
  • This study investigates the current preservation and management of four records and documentary heritage in Korea that is in the UNESCO's Memory of the World Register. The study analyzes their problems and corresponding solutions in digitizing those world records heritages. This study also reviews additional four documentary books in Korea that are in the wish list to add to UNESCO's Memory of the World Register. This study is organized as the following: Chapter 2 examines the value and meanings of world records and documentary heritage in Korea. The registry requirements and procedures of UNESCO's Memory of the World Register are examined. The currently registered records of Korea include Hunmin-Chongum, the Annals of the Choson Dynasty, the Diaries of the Royal Secretariat (Seungjeongwon Ilgi), and Buljo- Jikji-Simche-Yojeol (vol. II). These records heritage's worth and significance are carefully analyzed. For example, Hunmin-Chongum("訓民正音") is consisted of unique and systematic letters. Letters were delicately explained with examples in its original manual at the time of letter's creation, which is an unparalleled case in the world documentary history. The Annals of the Choson Dynasty("朝鮮王朝實錄") are the most comprehensive historic documents that contain the longest period of time in history. Their truthfulness and reliability in describing history give credits to the annals. The Royal Secretariat Diary (called Seungjeongwon-Ilgi("承政院日記")) is the most voluminous primary resources in history, superior to the Annals of Choson Dynasty and Twenty Five Histories in China. Jikji("直指") is the oldest existing book published by movable metal print sets in the world. It evidences the beginning of metal printing in the world printing history and is worthy of being as world heritage. The review of the four registered records confirms that they are valuable world documentary heritage that transfers culture of mankind to next generations and should be preserved carefully and safely without deterioration or loss. Chapter 3 investigates the current status of preservation and management of three repositories that store the four registered records in Korea. The repositories include Kyujanggak Archives in Seoul National University, Pusan Records and Information Center of National Records and Archives Service, and Gansong Art Museum. The quality of their preservation and management are excellent in all of three institutions by the following aspects: 1) detailed security measures are close to perfection 2) archiving practices are very careful by using a special stack room in steady temperature and humidity and depositing it in stack or archival box made of paulownia tree and 3) fire prevention, lighting, and fumigation are thoroughly prepared. Chapter 4 summarizes the status quo of digitization projects of records heritage in Korea. The most important issue related to digitization and database construction on Korean records heritage is likely to set up the standardization of digitization processes and facilities. It is urgently necessary to develop comprehensive standard systems for digitization. Two institutions are closely interested in these tasks: 1) the National Records and Archives Service experienced in developing government records management systems; and 2) the Cultural Heritage Administration interested in digitization of Korean old documents. In collaboration of these two institutions, a new standard system will be designed for digitizing records heritage on Korean Studies. Chapter 5 deals with additional Korean records heritage in the wish list for UNESCO's Memory of the World Register, including: 1) Wooden Printing Blocks(經板) of Koryo-Taejangkyong(高麗大藏經) in Haein Temple(海印寺); 2) Dongui-Bogam("東醫寶鑑") 3) Samguk-Yusa("三國遺事") and 4) Mugujeonggwangdaedaranigyeong. Their world value and importance are examined as followings. Wooden Printing Blocks of Koryo-Taejangkyong in Haein Temple is the worldly oldest wooden printing block of cannon of Buddhism that still exist and was created over 750 years ago. It needs a special conservation treatment to disinfect germs residing in surface and inside of wooden plates. Otherwise, it may be damaged seriously. For its effective conservation and preservation, we hope that UNESCO and Government will schedule special care and budget and join the list of Memory of the Word Register. Dongui-Bogam is the most comprehensive and well-written medical book in the Korean history, summarizing all medical books in Korea and China from the Ancient Times through the early 17th century and concentrating on Korean herb medicine and prescriptions. It is proved as the best clinical guidebook in the 17th century for doctors and practitioners to easily use. The book was also published in China and Japan in the 18th century and greatly influenced the development of practical clinic and medical research in Asia at that time. This is why Dongui Bogam is in the wish list to register to the Memory of the World. Samguk-Yusa is evaluated as one of the most comprehensive history books and treasure sources in Korea, which illustrates foundations of Korean people and covers histories and cultures of ancient Korean peninsula and nearby countries. The book contains the oldest fixed form verse, called Hyang-Ka(鄕歌), and became the origin of Korean literature. In particular, the section of Gi-ee(紀異篇) describes the historical processes of dynasty transition from the first dynasty Gochosun(古朝鮮) to Goguryeo(高句麗) and illustrates the identity of Korean people from its historical origin. This book is worthy of adding to the Memory of the World Register. Mugujeonggwangdaedaranigyeong is the oldest book printed by wooden type plates, and it is estimated to print in between 706 and 751. It contains several reasons and evidence to be worthy of adding to the list of the Memory of the World. It is the greatest documentary heritage that represents the first wooden printing book that still exists in the world as well as illustrates the history of wooden printing in Korea.