• Title/Summary/Keyword: Method of characteristics

A Survey on Consumption Behaviors of the Fast-Foods in University Students (대학생의 패스트푸드 소비행태에 관한 연구)

  • Cho, Kyu-Seok;Im, Byoung-Soon;Kim, Seok-Eun;Kim, Gye-Woong
    • Korean Journal of Human Ecology / v.14 no.2 / pp.313-319 / 2005
  • This survey was conducted to obtain basic data for desirable consumption habits by investigating and analyzing university students' fast food consumption behaviors. Questionnaires were collected in August 2004 from a total of 374 male and female students living in large cities or in small and medium-sized cities. The survey covered the use of and spending on fast foods, the choice of fast foods, the relationship between fast foods and diet, and the characteristics of fast food restaurants. The results are summarized as follows: 1. The composition of respondents varied by gender, residence type, and city size. Males made up 48.66% of respondents and females 51.34%. Residents of apartments and stand-alone houses accounted for 54.81% and 45.19%, respectively. 47.33% of respondents lived in large cities and 52.67% in small and medium-sized cities. 2. 70.1% of respondents said they eat fast foods with friends. There was a highly significant difference between males and females in the type of eating companion (p<0.001). The most common frequency of eating fast foods was 1 to 2 times a week, reported by 63.7% of respondents. However, there was no significant difference between the sexes in this respect. 3. 64.2% of respondents spent more than 20,000 won a week on fast foods, with no significant difference between genders. They tended to split the bill rather than have one person pay for everyone. There was a highly significant difference between genders in payment method (p<0.001). 4. 52.1% of respondents chose the menu themselves. The most favored food was chicken (26.5%), with a statistically significant difference between genders (p<0.001). 46.8% preferred coke as a drink, with no significant difference between genders. 
42.2% of respondents ate fast foods between lunch and dinner, again with no significant difference between genders. The most important factor in choosing a menu was taste (62.8%), with a significant difference between males and females (p<0.05). 5. Respondents attributed their preference for fast foods to the influence of Western culture (36.4%) and eating-out habits (29.1%), with a significant difference between genders (p<0.05). Of those who eat fast foods, 49.5% answered that they have normal weight and a normal body type. 24.3% considered themselves relatively fat, with a significant difference between genders (p<0.05). 63.4% of respondents considered themselves not picky about food, with a significant difference between genders (p<0.05). 78.3% mostly preferred franchise restaurants because they are convenient and cheap.

Clinical Features and Treatment Response in 18 Cases with Idiopathic Nonspecific Interstitial Pneumonia (특발성 비특이성 간질성 폐렴 18례의 임상상 및 치료반응)

  • Kang, Eun-Hae;Chung, Man-Pyo;Kang, Soo-Jung;An, Chang-Hyeok;Ahn, Jong-Woon;Han, Joung-Ho;Lee, Kyung-Soo;Lim, Si-Young;Suh, Gee-Young;Kim, Ho-Joong;Kwon, O-Jung;Rhee, Chong-H.
    • Tuberculosis and Respiratory Diseases / v.48 no.4 / pp.530-542 / 2000
  • Background: Nonspecific interstitial pneumonia (NSIP) has recently been reported to show a much better response to medical treatment and a better prognosis than idiopathic usual interstitial pneumonia (UIP). However, the clinical characteristics that distinguish idiopathic NSIP from UIP have not been clearly defined. Method: Among 120 patients with biopsy-proven diffuse interstitial lung diseases admitted to the Samsung Medical Center between July 1996 and March 2000, 18 patients with idiopathic NSIP were included in this study. Retrospective chart review and radiographic analysis were performed. Results: 1) At diagnosis, 17 patients were female and the average age was $55.2{\pm}8.4$ years (range 44-73 years). The average duration from the development of respiratory symptoms to surgical lung biopsy was $9.9{\pm}17.1$ months. An increase in bronchoalveolar lavage fluid lymphocytes ($23.0{\pm}13.1\%$) was noted. On HRCT, ground glass and irregular linear opacities were observed, but honeycombing was absent in all patients. 2) Corticosteroids were initially given to 13 patients, but the medication was stopped in 3 patients because of severe side effects. Further medical therapy was not possible in 1 patient who experienced steroid-induced psychosis. Herpes zoster (n=3), tuberculosis (n=1), avascular necrosis of the hip (n=1), cataract (n=2), and diabetes mellitus (n=1) developed during prolonged corticosteroid administration. Of the 7 patients receiving oral cyclophosphamide therapy, one could not continue the medication because of hemorrhagic cystitis. 3) After medical treatment, 14 of 17 patients improved and 3 remained stable (mean follow-up: $24.1{\pm}11.2$ months). FVC increased by $20.2{\pm}11.2\%$ of the predicted value, and the extent of ground glass opacity on HRCT decreased significantly ($15.7{\pm}14.7\%$). 4) Of the 14 patients who had stopped medication, 5 showed recurrence of NSIP and 2 deteriorated during steroid tapering. 
All patients with recurrence deteriorated within one year after completion of the initial treatment. Conclusion: Since idiopathic NSIP has a unique clinical profile and a good prognosis, it needs to be diagnosed as distinct from UIP and treated with aggressive medical therapy.

Scale and Scope Economies and Prospect for the Korea's Banking Industry (우리나라 은행산업(銀行産業)의 효율성분석(效率性分析)과 제도개선방안(制度改善方案))

  • Jwa, Sung-hee
    • KDI Journal of Economic Policy / v.14 no.2 / pp.109-153 / 1992
  • This paper estimates a translog cost function for Korea's banking industry and, based on the estimated efficiency indicators for the banking sector, derives implications for the future structure of Korean banking. The Korean banking industry is permitted to operate trust business to the full extent and securities business to a limited extent, while it is formally subject to a strict, specialized banking system. Securities underwriting and investment are allowed only to a very limited extent: only for stocks and bonds with a maturity longer than three years, and only up to 100 percent of the bank's paid-in capital. Until the end of 1991, the ceiling was only 25 percent of the total balance of demand deposits. Banks are, however, prohibited from the securities brokerage business. While the in-house integration of securities businesses with the traditional business of deposit taking and commercial lending is restrictively regulated in this way, Korean banks can enter the securities business by establishing subsidiaries in the industry. This paper therefore estimates efficiency indicators as well as cost functions, identifying the in-house integrated trust business and securities investment business as important banking activities. It does so for various cases: the production and intermediation function approaches to modelling financial intermediaries are applied separately, and the banking businesses (deposit, lending, and securities investment) and the trust businesses are analyzed both as separate groups and together. The estimated efficiency indicators for the various cases are summarized in Table 1 and Table 2. First, securities businesses exhibit not only economies of scale but also economies of scope with traditional banking activities, which implies that in-house integration of the banking and securities businesses may not be a nonoptimal banking structure. 
Therefore, this result further implies that transforming Korea's banking system from the current specialized system to a universal banking system will not impede improvement in the banking industry's efficiency. Second, the lending businesses turn out to be subject to diseconomies of scale, while showing unclear evidence of economies of scope. In sum, this implies a potential efficiency gain from the continued in-house integration of the lending activity. Third, the continued integration of the trust businesses seems to contribute to improving the efficiency of the banking businesses, since the trust businesses exhibit economies of scope. Fourth, deposit services and fee-based activities, such as foreign exchange and credit card businesses, exhibit economies of scale but constant returns to scope, which implies the possibility of separating those businesses from other banking and trust activities. The recent trend of the credit card business being operated separately from other banking activities by an independent entity, in Korea as well as in the global banking market, seems consistent with this finding. How, then, can the possibility of separating deposit services from the remaining activities be interpreted? If one insists on a strict definition of commercial banking that is confined to deposit and commercial lending activities, separating the deposit service would suggest the dissolution or disappearance of banking itself. Recently, however, it has been suggested that separating banks' deposit and lending activities, by letting a depository institution that specializes in taking deposits and investing deposit funds only in the safest securities, such as government securities, administer the deposit activity, would alleviate the risk of a bank run. This approach, in turn, would help improve the safety of the payment system (Robert E. Litan, What Should Banks Do? Washington, D.C., The Brookings Institution, 1987). 
In this context, the possibility of separating the deposit activity implies that a new type of depository institution will arise naturally, without contradicting the efficiency of the banking businesses, as the banking market grows in the future. Moreover, it is interesting to see additional evidence confirming this: deposit taking and securities business are cost complements, whereas deposit taking and lending are cost substitutes (see Table 2 for the cost complementarity relationships in Korea's banking industry). Finally, Korea's banking industry has been observed to lack the characteristics of a natural monopoly. Therefore, it may not be optimal to encourage mergers and acquisitions in the banking industry solely for the purpose of improving efficiency.
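
For readers unfamiliar with the functional form named in the abstract above, a generic multi-output translog cost function can be written as below. This is the textbook form and not necessarily the exact specification estimated in the paper; the symbols $C$ (total cost), $y_i$ (outputs such as deposits, loans, or securities investment), $w_k$ (input prices), and all coefficients are notation assumed here for illustration.

$$\ln C = \alpha_0 + \sum_i \alpha_i \ln y_i + \frac{1}{2}\sum_i\sum_j \alpha_{ij}\,\ln y_i \ln y_j + \sum_k \beta_k \ln w_k + \frac{1}{2}\sum_k\sum_l \beta_{kl}\,\ln w_k \ln w_l + \sum_i\sum_k \gamma_{ik}\,\ln y_i \ln w_k$$

Under this form, overall scale economies are commonly summarized by the sum of output cost elasticities, and cost complementarity between two outputs by the sign of the cross derivative of cost:

$$SE = \sum_i \frac{\partial \ln C}{\partial \ln y_i}, \qquad \frac{\partial^2 C}{\partial y_i\,\partial y_j} < 0 \;\; (i \neq j)$$

Here $SE < 1$ indicates economies of scale and $SE > 1$ diseconomies of scale, while a negative cross derivative indicates that outputs $i$ and $j$ are cost complements.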

Wearable Art-Chameleon Dress (웨어러블 아트-카멜레온 드레스)

  • Cho, Kyoung-Hee
    • Journal of the Korean Society of Clothing and Textiles / v.32 no.12 / pp.1837-1847 / 2008
  • The goal of this study is to express in Wearable Art the image of chameleons, which change color with light, temperature, and mood, as sexy styles corresponding to coquettish, temperamental people. The method was to experiment with various production media, including creating a textured stretch fabric, while expressing the conceptual characteristics of the chameleon in Wearable Art. The concept of the work was a blend of 'tempting', 'splendid', 'brilliant', 'fascinating', and similar qualities that highlight the real disposition of the chameleon. The researcher's futuristic preference was also reflected. The concepts of "comfortable" and "enjoyable" in motion were enhanced along with the completeness of the work. The point of the design and production is to express the real-life chameleon symbolically by analyzing its sleek body lines, condition-dependent color changes, outer skin, and three-dimensional textures. The coquettish, temperamental image, which is the conceptual image of the chameleon, was also implied throughout the work. The overall line of the work is a body-conscious silhouette, chosen to symbolize the outline of the chameleon's slim, sleek body. The exposed back is intended to symbolize the chameleon's protruding backbone. The hood, with its gentle triangular line, expresses the smooth-lined head. The irregular hemlines represent the chameleon's elongated tail. The chameleon, with its vivid colors, is characterized by color changes that depend on its condition. This point was treated as important in the working process by trying out effects that make the colors look slightly different depending on light and angle. The material was given this effect through lumpy, protruding wrinkles that make its surface colors look different under different light and angles. 
The various stones in red and blue tones are very similar to the skin tones of a real chameleon, and their gradation creates the effect of colors visibly changing with each movement. The textures of the chameleon were produced through a smoke-shaped wrinkle effect, which results from using elastic threads on the base material, stitched with 50/50 chiffon and polyester along with velvet dot patterns. The fabric, made stretchy by the elastic threads, is well suited to a body-conscious line. The stones are acrylic cabochons and gemstones. They symbolically express the lumpy, bumpy back skin of the chameleon and produce the effect of visibly different colors. The primary technique used in this dress is draping on the bias grain. The front body piece is connected to the hood and joined to the back piece without any seam. For the irregular hemline flares, several rectangular pieces cut on the bias grain were left and connected by interlocking. What defines clothes is the person in action. Therefore, what determines the completeness of clothes may be the comfortable and enjoyable feeling of the people who live and move in them. The chameleon dress could also reach its goal of being comfortable and pleasing Wearable Art through the study of techniques and effects that visibly differentiate the colors. This is considered a main point of Wearable Art: comfortable, enjoyable clothing tempered with artistic beauty.

Community Residents' Knowledge, Attitude, and Needs for Hospice Care (일부 지역주민들의 호스피스에 대한 인지와 태도 및 간호요구 조사)

  • Ro, You-Ja;Han, Sung-Suk;Ahn, Sung-Hee;Yong, Jin-Sun
    • Journal of Hospice and Palliative Care / v.2 no.1 / pp.23-35 / 1999
  • Purpose: The hospice movement began about 30 years ago in Korea. However, few basic studies have been conducted on the general public's knowledge of hospice care and their needs for it. The purpose of this study was to investigate the general public's knowledge of and attitude toward hospice and their needs for hospice care, and to analyze those needs in relation to knowledge and attitude among residents of a specific community. Methods: The survey was conducted with 924 people randomly selected from a district in Seoul. The data were collected through a self-report questionnaire constructed by the authors. The 30-item measure of the level of hospice needs showed a Cronbach's alpha of .89 in a pilot study and .92 in this study, and the items were classified into four areas by factor analysis. The data were analyzed by t-test and ANOVA. Results: 1) The average age of the respondents was 38. The majority of the respondents were well educated. 2) Regarding awareness of hospice care, 54% (501 people) indicated that they had heard of hospice. About 74% thought that people should be able to prepare for death in advance. About 83% wanted to be informed if they had a life-threatening illness such as terminal cancer. Also, about 63% responded that patients with terminal diseases should be provided with physical, spiritual, and psychological care to minimize pain and allow a peaceful death. Regarding attitude toward hospice care, 74% responded that they would use hospice care if needed. Home visitation by the hospice team was the most preferred form of care for the terminally ill, chosen by 34% of respondents. Concerning needs for hospice care: 1) By area of need, physical need showed the highest mean (M=4.37), followed by social need (M=3.96), emotional need (M=3.87), and spiritual need (M=3.79). 
The overall need level showed a mean of 4.00, which reflects a considerable need for hospice care. 2) By demographic characteristics, people aged over 50, the married, and the unemployed indicated a higher level of need for hospice care. Women showed a higher level of need than men, and Catholics showed a higher level of need than believers of other religions (P<0.0001). 3) As for knowledge of and attitude toward hospice care, the level of hospice care needs was significantly higher in the following groups: those who have heard of hospice, those who are aware of death preparation, those who want information on terminal diseases, those who want to use every method to sustain life, and those who are aware of hospice needs (P<0.001). Conclusion: The findings of this study on the public's knowledge of, attitude toward, and needs for hospice care are expected to contribute to planning a successful hospice care program. Furthermore, the findings will serve as useful data for promoting home hospice care to improve the quality of life of community residents, and will contribute to the development of hospice care as a whole.
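
The reliability figure reported in the abstract above (Cronbach's alpha) can be illustrated with a minimal sketch. The function name `cronbach_alpha` and the toy responses are hypothetical and are not the study's data or code; the formula is the standard one, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), using sample variances.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Hypothetical helper for illustration; uses sample variances (n - 1).
    """
    k = len(responses[0])
    if k < 2:
        raise ValueError("need at least two items")
    # Transpose so that each tuple holds all respondents' scores for one item.
    items = list(zip(*responses))
    item_variance_sum = sum(variance(col) for col in items)
    # Variance of each respondent's total score across all items.
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Toy data: 4 respondents answering 3 items (values are made up).
toy = [[4, 5, 4], [3, 3, 2], [5, 5, 5], [2, 3, 3]]
alpha = cronbach_alpha(toy)
print(round(alpha, 3))
```

Values close to 1 indicate that the items move together, which is how figures such as .89 and .92 are usually read.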

Quality Characteristics of Kiwi Wine and Optimum Malolactic Fermentation Conditions (참다래 와인의 최적 malolactic fermentation 조건과 품질 특성)

  • Kang, Sang-Dong;Ko, Yu-Jin;Kim, Eun-Jung;Son, Yong-Hwi;Kim, Jin-Yong;Seol, Hui-Gyeong;Kim, Ig-Jo;Cho, Hyoun-Kook;Ryu, Chung-Ho
    • Journal of Life Science / v.21 no.4 / pp.509-514 / 2011
  • Malolactic fermentation (MLF) occurs after completion of alcoholic fermentation and is mediated by lactic acid bacteria (LAB), mainly Oenococcus oeni. Kiwi wine has a greater problem with high acidity than commercial grape wine. Therefore, we investigated the optimal MLF conditions for reducing strong acidity and improving the quality of wine fermented from Kiwi fruit cultivated in Korea. The industrial wine yeast Saccharomyces cerevisiae KCCM 12650 was used for alcoholic fermentation, and LAB, known as MLF strains, were used to alleviate wine acidity. First, the experimental conditions, namely initial pH (2.5, 3.5, 4.5), fermentation temperature (20, 25, $30^{\circ}C$), and sugar content (24 $^{\circ}$Brix), were adjusted, and after the fermentation period we measured acidity, pH, and the change in organic acid content using the AOAC method and HPLC analysis. The alcohol content of the fermented Kiwi wine was 12.75%. Its total acidity and pH were 0.78% and 3.5, respectively. Its total sugar and total polyphenol contents were 38.72 mg/ml and 60.18 mg/ml, respectively. With regard to organic acid content, the control contained 0.63 mg/ml of oxalic acid, 2.99 mg/ml of malic acid, and 0.71 mg/ml of lactic acid, whereas the MLF wine contained 0.69 mg/ml of oxalic acid, 0.06 mg/ml of malic acid, and 3.12 mg/ml of lactic acid. After MLF processing, Kiwi wine had lower malic acid values and total acidity than the control. In MLF, the optimum initial pH and fermentation temperature were 3.5 and $25^{\circ}C$, respectively. These results suggest that establishing optimal MLF conditions could improve the properties of Kiwi wine manufactured in Korea.

Feasibility of Deep Learning Algorithms for Binary Classification Problems (이진 분류문제에서의 딥러닝 알고리즘의 활용 가능성 평가)

  • Kim, Kitae;Lee, Bomi;Kim, Jong Woo
    • Journal of Intelligence and Information Systems / v.23 no.1 / pp.95-108 / 2017
  • Recently, AlphaGo, a Baduk (Go) artificial intelligence program by Google DeepMind, won a decisive victory against Lee Sedol. Many people thought that machines would not be able to beat a human at Go because, unlike in chess, the number of possible paths for a single move exceeds the number of atoms in the universe, but the result was the opposite of what people predicted. After the match, artificial intelligence came into focus as a core technology of the fourth industrial revolution and attracted attention from various application domains. In particular, deep learning has attracted attention as the core artificial intelligence technology used in the AlphaGo algorithm. Deep learning is already being applied to many problems, and it performs especially well in image recognition. It also performs well on high-dimensional data such as voice, images, and natural language, where existing machine learning techniques struggled to achieve good performance. In contrast, it is difficult to find deep learning research on traditional business data and structured data analysis. In this study, we examined whether the deep learning techniques studied so far can be used not only for recognition of high-dimensional data but also for binary classification problems in traditional business data analysis, such as customer churn analysis, marketing response prediction, and default prediction. We also compare the performance of the deep learning techniques with that of traditional artificial neural network models. The experimental data are the telemarketing response data of a bank in Portugal. The data have input variables such as age, occupation, loan status, and the number of previous telemarketing contacts, and a binary target variable that records whether the customer intends to open an account or not. 
In this study, to evaluate the applicability of deep learning algorithms and techniques to binary classification, we compared the performance of various models using CNN, LSTM, and dropout, which are widely used in deep learning, with that of MLP models, a traditional artificial neural network model. However, since not every network design alternative can be tested, given the nature of artificial neural networks, the experiment was conducted under restricted settings for the number of hidden layers, the number of neurons in each hidden layer, the number of output data (filters), and the conditions for applying dropout. The F1 score, rather than overall accuracy, was used to evaluate the models in order to show how well they classify the class of interest. The detailed methods for applying each deep learning technique in the experiment are as follows. The CNN algorithm reads values adjacent to a specific value and recognizes features, but the distance between business data fields does not matter because each field is usually independent. In this experiment, we set the filter size of the CNN to the number of fields so that it learns the characteristics of the whole record at once, and added a hidden layer to make decisions based on the additional features. For the model with two LSTM layers, the input direction of the second layer is reversed relative to the first layer in order to reduce the influence of each field's position. For dropout, we set the neurons in each hidden layer to be dropped with a probability of 0.5. The experimental results show that the model with the highest F1 score was the CNN model using dropout, and the next best was the MLP model with two hidden layers using dropout. 
In this study, we obtained several findings from the experiments. First, models using dropout make slightly more conservative predictions than those without it and generally show better classification performance. Second, CNN models show better classification performance than MLP models. This is interesting because CNN performed well in binary classification problems, to which it has rarely been applied, as well as in the fields where its effectiveness has been proven. Third, the LSTM algorithm seems unsuitable for binary classification problems because its training time is too long relative to the performance improvement. From these results, we can confirm that some deep learning algorithms can be applied to solve business binary classification problems.
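
The abstract above evaluates models by F1 score rather than overall accuracy. A minimal sketch of why that choice matters on imbalanced data follows; the helper names `f1_score` and `accuracy` and the toy labels are hypothetical and are not the paper's code or data.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 score for the class of interest (harmonic mean of precision and recall)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined), so F1 is 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly, regardless of class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up imbalanced example: only 2 of 10 customers respond (label 1).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
print(acc, round(f1, 3))
```

In this toy case the classifier misses half of the responders, which accuracy barely registers but the F1 score on the positive class makes visible.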

An Essay in a Research on Gwonwu Hong Chan-yu's Poetic Literature - Focussing on Classical Chinese Poems in Gwonwujip (권우(卷宇) 홍찬유(洪贊裕) 시문학(詩文學) 연구(硏究) 시론(試論) - 『권우집(卷宇集)』 소재(所載) 한시(漢詩)를 중심(中心)으로 -)

  • Yoon, Jaehwan
    • (The)Study of the Eastern Classic / no.50 / pp.55-88 / 2013
  • Gwonwu Hong Chan-yu is one of the modern and contemporary Korean scholars of Sino-Korean literature and one of the literati of his era, and is respected as a guiding light by his academic descendants. Gwonwu was a teacher of his era who lived through all the turbulence of Korean society, including the forced Japanese occupation, the Korean War, the military dictatorship, and the struggle for democracy, and who educated and led the young scholars of his time. However, academia has paid no attention to his life and achievements since his death. This paper examines the poetry of Gwonwu Hong Chan-yu, a representative modern and contemporary scholar of Sino-Korean literature, which has not yet been discussed by academia. At a minimum, the significance of this paper is that it is the first work based on his anthology, which has not been discussed by academia, and the first full-scale study of Gwonwu Hong Chan-yu. For this reason, the paper aims at a detailed examination of the poems recorded in his anthology. Despite these intentions, some limitations could not be avoided because of the author's insufficient knowledge and academic capability and the lack of academic sources. Gwonwu's poetry, as examined through his anthology, is characterized by a focus on exposing his own internal emotions. This characteristic indicates that his idea of poetic literature paid more attention to individuality, that is, the expression of private emotions, than to the social utility of poems. This idea of poetic literature can be confirmed throughout his poetry. Accordingly, Gwonwu preferred classical Chinese poems to archaistic poems, and single poems to serial poems, and he avoided writing poems within social relations, such as farewell poems, bestowal poems, and mourning poems. When the characteristics of Gwonwu's poetic literature are summarized in this way, however, some questions remain. 
The first question is whether the poems in his anthology constitute his entire body of poetry. Although all of Gwonwu's poems that the author has identified so far are in his anthology, it is highly questionable whether his poetry can be summed up by these poems alone. The next question is what writing method he used to bring joy (spice), sentiment, and full-heart into his poems, given that his poems focus on exposing his internal emotions and that poems exposing joy and poems exposing sentiment and full-heart appear consistently across various spaces and circumstances of writing. The final question is what Gwonwu's poems mean, if the poetry identified through his anthology directly shows either the reality carried in his poems or the reality of a period of his life. These questions are expected to be resolved through synchronized, multidimensional research on both Gwonwu as an individual and the era in which he lived. In particular, spurring deeper research in a new direction on Gwonwu's poetry is important for constructing a complete history of modern and contemporary Sino-Korean literature and for securing continued research on Sino-Korean literature and its history. For this reason, more effort from researchers is required.

Construction of Event Networks from Large News Data Using Text Mining Techniques (텍스트 마이닝 기법을 적용한 뉴스 데이터에서의 사건 네트워크 구축)

  • Lee, Minchul;Kim, Hea-Jin
    • Journal of Intelligence and Information Systems / v.24 no.1 / pp.183-203 / 2018
  • News articles are the most suitable medium for examining the events occurring at home and abroad. Especially, as the development of information and communication technology has brought various kinds of online news media, the news about the events occurring in society has increased greatly. So automatically summarizing key events from massive amounts of news data will help users to look at many of the events at a glance. In addition, if we build and provide an event network based on the relevance of events, it will be able to greatly help the reader in understanding the current events. In this study, we propose a method for extracting event networks from large news text data. To this end, we first collected Korean political and social articles from March 2016 to March 2017, and integrated the synonyms by leaving only meaningful words through preprocessing using NPMI and Word2Vec. Latent Dirichlet allocation (LDA) topic modeling was used to calculate the subject distribution by date and to find the peak of the subject distribution and to detect the event. A total of 32 topics were extracted from the topic modeling, and the point of occurrence of the event was deduced by looking at the point at which each subject distribution surged. As a result, a total of 85 events were detected, but the final 16 events were filtered and presented using the Gaussian smoothing technique. We also calculated the relevance score between events detected to construct the event network. Using the cosine coefficient between the co-occurred events, we calculated the relevance between the events and connected the events to construct the event network. Finally, we set up the event network by setting each event to each vertex and the relevance score between events to the vertices connecting the vertices. 
The event network constructed in our methods helped us to sort out major events in the political and social fields in Korea that occurred in the last one year in chronological order and at the same time identify which events are related to certain events. Our approach differs from existing event detection methods in that LDA topic modeling makes it possible to easily analyze large amounts of data and to identify the relevance of events that were difficult to detect in existing event detection. We applied various text mining techniques and Word2vec technique in the text preprocessing to improve the accuracy of the extraction of proper nouns and synthetic nouns, which have been difficult in analyzing existing Korean texts, can be found. In this study, the detection and network configuration techniques of the event have the following advantages in practical application. First, LDA topic modeling, which is unsupervised learning, can easily analyze subject and topic words and distribution from huge amount of data. Also, by using the date information of the collected news articles, it is possible to express the distribution by topic in a time series. Second, we can find out the connection of events in the form of present and summarized form by calculating relevance score and constructing event network by using simultaneous occurrence of topics that are difficult to grasp in existing event detection. It can be seen from the fact that the inter-event relevance-based event network proposed in this study was actually constructed in order of occurrence time. It is also possible to identify what happened as a starting point for a series of events through the event network. The limitation of this study is that the characteristics of LDA topic modeling have different results according to the initial parameters and the number of subjects, and the subject and event name of the analysis result should be given by the subjective judgment of the researcher. 
Also, since each topic is assumed to be exclusive and independent, the relevance between topics is not taken into account. Subsequent studies need to calculate the relevance between events that were not covered in this study or between events that belong to the same topic.
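
The detection and network steps described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names, the 3-sigma kernel truncation, the thresholds, and the representation of each event as a numeric vector (for example, its daily topic weights) are all assumptions made for illustration, and the toy series stands in for a real per-day LDA topic distribution.

```python
import math


def gaussian_smooth(series, sigma=1.0):
    """Smooth a 1-D series with a Gaussian kernel truncated at 3*sigma.

    Weights are renormalized near the boundaries so the edges are not damped.
    """
    radius = max(1, int(3 * sigma))
    smoothed = []
    for i in range(len(series)):
        lo, hi = max(0, i - radius), min(len(series), i + radius + 1)
        weights = [math.exp(-((i - j) ** 2) / (2 * sigma ** 2)) for j in range(lo, hi)]
        total = sum(w * series[j] for w, j in zip(weights, range(lo, hi)))
        smoothed.append(total / sum(weights))
    return smoothed


def detect_peaks(series, threshold):
    """Return interior indices that are local maxima above `threshold`."""
    return [
        i
        for i in range(1, len(series) - 1)
        if series[i] > threshold
        and series[i] > series[i - 1]
        and series[i] >= series[i + 1]
    ]


def cosine(u, v):
    """Cosine coefficient between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def build_event_network(event_vectors, min_score=0.1):
    """Build weighted edges between events.

    event_vectors: dict mapping an event name to its vector (hypothetical
    representation). Returns a list of (event_a, event_b, relevance) tuples
    for every pair whose cosine coefficient is at least `min_score`.
    """
    names = sorted(event_vectors)
    edges = []
    for idx, a in enumerate(names):
        for b in names[idx + 1:]:
            score = cosine(event_vectors[a], event_vectors[b])
            if score >= min_score:
                edges.append((a, b, score))
    return edges


if __name__ == "__main__":
    # Toy daily weights for one topic; the two spikes play the role of events.
    topic_by_day = [0.1, 0.1, 0.9, 0.1, 0.1, 0.1, 0.8, 0.1, 0.1]
    smoothed = gaussian_smooth(topic_by_day, sigma=1.0)
    print(detect_peaks(smoothed, threshold=0.3))
    print(build_event_network({"A": [1, 0, 1, 0], "B": [1, 0, 0, 0], "C": [0, 1, 0, 0]}))
```

In this sketch each event is a vertex and the cosine coefficient is the edge weight; how the abstract's 85 raw detections were reduced to 16 depends on smoothing and threshold choices that the abstract does not specify.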

Development of Information Extraction System from Multi Source Unstructured Documents for Knowledge Base Expansion (지식베이스 확장을 위한 멀티소스 비정형 문서에서의 정보 추출 시스템의 개발)

  • Choi, Hyunseung;Kim, Mintae;Kim, Wooju;Shin, Dongwook;Lee, Yong Hun
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.4
    • /
    • pp.111-136
    • /
    • 2018
  • In this paper, we propose a methodology for extracting answer information for queries from various types of unstructured documents collected from multiple sources on the web, in order to expand a knowledge base. The proposed methodology consists of the following steps. 1) Collect relevant documents from Wikipedia, Naver encyclopedia, and Naver news for queries separated into "subject-predicate" form, and classify the suitable documents. 2) Determine whether each sentence is suitable for information extraction and derive a confidence. 3) Based on the predicate feature, extract the information from the suitable sentences and derive the overall confidence of the information extraction result. To evaluate the performance of the information extraction system, we selected 400 queries from SK Telecom's artificial intelligence speaker. The proposed system showed higher performance indices than the baseline model. The contribution of this study is a sequence tagging model based on bi-directional LSTM-CRF that uses the predicate feature of the query; with it, we developed a robust model that maintains high recall even on various types of unstructured documents collected from multiple sources. Information extraction for knowledge base expansion must take into account the heterogeneous characteristics of source-specific document types. The proposed methodology was shown to extract information effectively from various types of unstructured documents compared to the baseline model. Previous research, by contrast, has the limitation that performance is poor when extracting information from document types that differ from the training data. 
In addition, by predicting the suitability of documents and sentences before the information extraction step, this study prevents unnecessary extraction attempts on documents that do not contain the answer. It is meaningful that we provide a method by which precision can be maintained even in a real web environment. Information extraction for knowledge base expansion targets unstructured documents on the real web, so there is no guarantee that a given document contains the correct answer. When question answering is performed on the real web, previous machine reading comprehension studies show low precision because they frequently attempt to extract an answer even from documents that contain no correct answer. The policy of predicting the suitability of documents and sentences for information extraction is meaningful in that it helps maintain extraction performance even in a real web environment. The limitations of this study and future research directions are as follows. The first is a problem related to data preprocessing. In this study, the unit of knowledge extraction is determined through morphological analysis based on the open-source KoNLPy Python package, and information extraction can fail when morphological analysis is not performed properly. To improve information extraction results, a more advanced morphological analyzer needs to be developed. The second is the problem of entity ambiguity. The information extraction system of this study cannot distinguish identical names that refer to different entities. If several people with the same name appear in the news, the system may not extract information about the person the query intends. 
Future research needs to take measures to distinguish people with the same name. The third is a problem of evaluation query data. In this study, we selected 400 user queries collected from SK Telecom's interactive artificial intelligence speaker to evaluate the performance of the information extraction system. We developed an evaluation data set of 2,800 documents (400 questions × 7 articles per question: 1 Wikipedia, 3 Naver encyclopedia, 3 Naver news) by judging whether each document contains a correct answer. To ensure the external validity of the study, it is desirable to use more queries to assess the performance of the system, but this is a costly activity that must be done manually. Future research needs to evaluate the system on more queries. It is also necessary to develop a Korean benchmark data set for information extraction from multi-source web documents, so that results can be evaluated more objectively.
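
The three-step flow in this abstract (document suitability, sentence suitability, predicate-based extraction with an overall confidence) can be sketched as a gated pipeline. This is a minimal illustration under stated assumptions, not the paper's system: the scorers and extractor are hypothetical pluggable callables, the keyword-based stand-ins replace the bi-directional LSTM-CRF tagger, and combining the stage confidences by multiplication is an assumption, since the abstract does not say how the overall confidence is derived.

```python
import re
from typing import Callable, List, Optional, Tuple

# (predicate, text) -> confidence in [0, 1]; hypothetical interface.
Scorer = Callable[[str, str], float]
# (predicate, sentence) -> (answer or None, confidence); hypothetical interface.
Extractor = Callable[[str, str], Tuple[Optional[str], float]]


def split_sentences(text: str) -> List[str]:
    """Naive sentence splitter on ., !, ? followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def extract_answers(
    predicate: str,
    documents: List[str],
    doc_scorer: Scorer,
    sent_scorer: Scorer,
    extractor: Extractor,
    doc_threshold: float = 0.5,
    sent_threshold: float = 0.5,
) -> List[Tuple[str, float]]:
    """Run extraction only on documents and sentences judged suitable."""
    results = []
    for doc in documents:
        doc_conf = doc_scorer(predicate, doc)
        if doc_conf < doc_threshold:
            continue  # skip documents unlikely to contain the answer
        for sentence in split_sentences(doc):
            sent_conf = sent_scorer(predicate, sentence)
            if sent_conf < sent_threshold:
                continue  # skip unsuitable sentences before extraction
            answer, ext_conf = extractor(predicate, sentence)
            if answer is not None:
                # Assumption: overall confidence is the product of the stages.
                results.append((answer, doc_conf * sent_conf * ext_conf))
    return sorted(results, key=lambda item: item[1], reverse=True)


def keyword_scorer(predicate: str, text: str) -> float:
    """Toy stand-in for a learned suitability classifier."""
    return 1.0 if predicate.lower() in text.lower() else 0.0


def last_token_extractor(predicate: str, sentence: str) -> Tuple[Optional[str], float]:
    """Toy stand-in for a sequence tagger: returns the final token."""
    tokens = sentence.rstrip(".!?").split()
    return (tokens[-1], 0.9) if tokens else (None, 0.0)


if __name__ == "__main__":
    docs = [
        "Seoul is a large city. The capital of Korea is Seoul.",
        "Bananas are yellow.",
    ]
    print(extract_answers("capital", docs, keyword_scorer, keyword_scorer, last_token_extractor))
```

Gating before extraction is what keeps the extractor from producing an answer for documents that contain none, which is the precision argument the abstract makes.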