• Title/Summary/Keyword: Label


The Safety and Immunogenicity of a Trivalent, Live, Attenuated MMR Vaccine, Priorix™ (MMR(Measles-Mumps-Rubella) 약독화 생백신인 프리오릭스주를 접종한 후 안전성과 유효성의 평가에 관한 연구)

  • Ahn, Seung-In;Chung, Min-Kook;Yoo, Jung-Suk;Chung, Hye-Jeon;Hur, Jae-Kyun;Shin, Young-Kyu;Chang, Jin-Keun;Cha, Sung-Ho
    • Clinical and Experimental Pediatrics / v.48 no.9 / pp.960-968 / 2005
  • Purpose: This multi-center, open-label clinical study was designed to evaluate the safety and immunogenicity of a trivalent, live, attenuated measles-mumps-rubella (MMR) vaccine, Priorix™, in Korean children. Methods: From July 2002 to February 2003, a total of 252 children, aged 12-15 months or 4-6 years, received Priorix™ at four centers: Han-il General Hospital, Kyunghee University Hospital, St. Paul's Hospital at the Catholic Medical College in Seoul, and Korea University Hospital in Ansan, Korea. Only subjects who fully met protocol requirements were included in the final analysis. The occurrence of local and systemic adverse events after vaccination was evaluated from diary cards and physical examinations for 42 days after vaccination. Serum antibody levels were measured prior to and 42 days after vaccination using IgG ELISA assays at GlaxoSmithKline Biologicals (GSK) in Belgium. Results: Of the 252 enrolled subjects, a total of 199 were included in the safety analysis, including 103 from the 12-15 month age group and 96 from the 4-6 year age group. The occurrence of local reactions related to the study drug was 10.1 percent, and the occurrence of systemic reactions was 6.5 percent. There were no episodes of aseptic meningitis or febrile convulsions, nor any other serious adverse reaction. In the immunogenicity analysis, the seroconversion rate of previously seronegative subjects was 99 percent for measles, 93 percent for mumps and 100 percent for rubella. Both age groups showed similar seroconversion rates. The geometric mean titers achieved 42 days post-vaccination were: for measles, 3,838.6 mIU/mL [3,304.47, 4,458.91] in the 12-15 month age group and 1,886.2 mIU/mL [825.83, 4,308.26] in the 4-6 year age group; for mumps, 956.3 U/mL [821.81, 1,112.71] in the 12-15 month age group and 2,473.8 U/mL [1,518.94, 4,028.92] in the 4-6 year age group; for rubella, 94.5 IU/mL [79.56, 112.28] in the 12-15 month age group and 168.9 IU/mL [108.96, 261.90] in the 4-6 year age group. Conclusion: When Korean children aged 12-15 months or 4-6 years were vaccinated with GlaxoSmithKline Biologicals' live attenuated MMR vaccine (Priorix™), adverse events were limited to those generally expected with any live vaccine. Priorix™ demonstrated excellent immunogenicity in this population.
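
The geometric mean titers (GMTs) reported here follow the standard definition, i.e., the geometric mean of the individual antibody titers:

```latex
\mathrm{GMT} \;=\; \left(\prod_{i=1}^{n} t_i\right)^{1/n}
           \;=\; \exp\!\left(\frac{1}{n}\sum_{i=1}^{n} \ln t_i\right),
\qquad t_i = \text{antibody titer of subject } i .
```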

The Risk Assessment of Butachlor for the Freshwater Aquatic Organisms (Butachlor의 수서생물에 대한 위해성 평가)

  • Park, Yeon-Ki;Bae, Chul-Han;Kim, Byung-Seok;Lee, Jea-Bong;You, Are-Sun;Hong, Soon-Sung;Park, Kyung-Hoon;Shin, Jin-Sup;Hong, Moo-Ki;Lee, Kyu-Seung;Lee, Jung-Ho
    • The Korean Journal of Pesticide Science / v.13 no.1 / pp.1-12 / 2009
  • To assess the effects of butachlor on freshwater aquatic organisms, acute toxicity studies on algae, an invertebrate and fish were conducted. Algal growth inhibition studies were carried out to determine the growth inhibition effects of butachlor (Tech. 93.4%) on Pseudokirchneriella subcapitata (formerly known as Selenastrum capricornutum), Desmodesmus subspicatus (formerly known as Scenedesmus subspicatus), and Chlorella vulgaris over an exposure period of 72 hours. The toxicological responses of P. subcapitata, D. subspicatus, and C. vulgaris to butachlor, expressed as individual ErC50 values, were 0.002, 0.019, and 10.4 mg L⁻¹, respectively, and the NOEC values were 0.0008, 0.0016, and 5.34 mg L⁻¹, respectively. P. subcapitata was more sensitive than the other algal species; butachlor is highly toxic to algae such as P. subcapitata and D. subspicatus. In the acute immobilisation test with Daphnia magna, the 24 h- and 48 h-EC50 values were 2.55 and 1.50 mg L⁻¹, respectively. In the acute toxicity tests on Cyprinus carpio, Oryzias latipes and Misgurnus anguillicaudatus, the 96 h-LC50 values were 0.62, 0.41 and 0.24 mg L⁻¹, respectively. An ecological risk assessment of butachlor was then performed on the basis of the toxicological data for algae, invertebrate and fish and the exposure concentrations in rice paddy, drain and river water. When a butachlor formulation is applied to a rice paddy field according to the label recommendation, the measured concentration of butachlor in paddy water was 0.41 mg L⁻¹ and the predicted environmental concentration (PEC) of butachlor in drain water was 0.03 mg L⁻¹. Residues of butachlor detected in major rivers between 1997 and 1998 ranged from 0.0004 mg L⁻¹ to 0.0029 mg L⁻¹. Toxicity exposure ratios (TERs) for algae in rice paddy, drain and river were 0.004, 0.05 and 0.36, respectively, indicating that butachlor poses a risk to algae in rice paddies, drains and rivers. On the other hand, TERs for the invertebrate in rice paddy, drain and river were 3.6, 50 and 357, respectively, well above 2, indicating no risk to invertebrates. TERs for fish in rice paddy, drain and river were 0.58, 8 and 57, respectively, indicating that butachlor poses a risk to fish in rice paddies but not in agricultural drains and rivers. In conclusion, butachlor poses a minimal risk to algae in agricultural drains and rivers exposed to rice drainage but poses no risk to invertebrates and fish.
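
For reference, the TER values above follow the usual ratio of toxicity endpoint to exposure concentration; for example, for fish (96 h-LC50 = 0.24 mg L⁻¹) in paddy and drain water:

```latex
\mathrm{TER} = \frac{\text{toxicity endpoint } (LC_{50},\, EC_{50},\, ErC_{50})}{\text{exposure concentration (measured or PEC)}},
\qquad
\mathrm{TER}_{\text{fish, paddy}} = \frac{0.24}{0.41} \approx 0.58,
\quad
\mathrm{TER}_{\text{fish, drain}} = \frac{0.24}{0.03} = 8 .
```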

Development of Predictive Models for Rights Issues Using Financial Analysis Indices and Decision Tree Technique (경영분석지표와 의사결정나무기법을 이용한 유상증자 예측모형 개발)

  • Kim, Myeong-Kyun;Cho, Yoonho
    • Journal of Intelligence and Information Systems / v.18 no.4 / pp.59-77 / 2012
  • This study focuses on predicting which firms will increase capital by issuing new stocks in the near future. Many stakeholders, including banks, credit rating agencies and investors, perform a variety of analyses of firms' growth, profitability, stability, activity, productivity, etc., and regularly report the firms' financial analysis indices. In this paper, we develop predictive models for rights issues using these financial analysis indices and data mining techniques. This study approaches the building of the predictive models from two different perspectives. The first is the analysis period: we divide the analysis period into before and after the IMF financial crisis, and examine whether there is a difference between the two periods. The second is the prediction horizon: in order to predict when firms will increase capital by issuing new stocks, the prediction time is categorized as one year, two years or three years later. Therefore, a total of six prediction models are developed and analyzed. In this paper, we employ the decision tree technique to build the prediction models for rights issues. The decision tree is the most widely used prediction method; it builds decision trees to label or categorize cases into a set of known classes. In contrast to neural networks, logistic regression and SVM, decision tree techniques are well suited for high-dimensional applications and have strong explanation capabilities. There are well-known decision tree induction algorithms such as CHAID, CART, QUEST and C5.0. Among them, we use the C5.0 algorithm, which is the most recently developed and yields better performance than the other algorithms. We obtained data on rights issues and financial analysis from TS2000 of the Korea Listed Companies Association. A record of financial analysis data consists of 89 variables, which include 9 growth indices, 30 profitability indices, 23 stability indices, 6 activity indices and 8 productivity indices. For model building and testing, we used 10,925 financial analysis records from a total of 658 listed firms. PASW Modeler 13 was used to build C5.0 decision trees for the six prediction models. A total of 84 variables among the financial analysis data are selected as the input variables of each model, and the rights issue status (issued or not issued) is defined as the output variable. To develop the prediction models using the C5.0 node (Node Options: Output type = Rule set, Use boosting = false, Cross-validate = false, Mode = Simple, Favor = Generality), we used 60% of the data for model building and 40% of the data for model testing. The results of the experimental analysis show that the prediction accuracies for data after the IMF financial crisis (59.04% to 60.43%) are about 10 percent higher than those before the IMF financial crisis (68.78% to 71.41%). These results indicate that since the IMF financial crisis, the reliability of financial analysis indices has increased and firms' intentions regarding rights issues have become more evident. The experimental results also show that the stability-related indices have a major impact on conducting a rights issue in the case of short-term prediction. On the other hand, the long-term prediction of conducting a rights issue is affected by financial analysis indices on profitability, stability, activity and productivity. All the prediction models include the industry code as one of the significant variables, which means that companies in different types of industries show different patterns of rights issues.
We conclude that it is desirable for stakeholders to take into account stability-related indices for short-term prediction and a wider range of financial analysis indices for long-term prediction. The current study has several limitations. First, we need to compare the differences in accuracy obtained by using other data mining techniques such as neural networks, logistic regression and SVM. Second, we need to develop and evaluate new prediction models that include variables which research on capital structure theory has identified as relevant to rights issues.
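
As a rough illustration of the modeling step described above, the sketch below uses scikit-learn's CART-style DecisionTreeClassifier as a stand-in for the C5.0 node in PASW Modeler, with the same 60/40 build/test split; the file name financial_indices.csv and the rights_issue column are hypothetical.

```python
# Hedged sketch: a decision-tree model for rights-issue prediction.
# scikit-learn's CART-style tree stands in for the C5.0 algorithm used in the paper;
# "financial_indices.csv" and the "rights_issue" column name are hypothetical.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

df = pd.read_csv("financial_indices.csv")            # one row per firm-year, 89 indices
X = df.drop(columns=["rights_issue"])                 # growth, profitability, stability, ...
y = df["rights_issue"]                                # output: issued (1) or not issued (0)

# 60% of the data for model building, 40% for testing, as in the study.
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.6, random_state=0)

tree = DecisionTreeClassifier(max_depth=6, random_state=0)  # depth cap keeps rules readable
tree.fit(X_train, y_train)
print("test accuracy:", accuracy_score(y_test, tree.predict(X_test)))
```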

The way to make training data for deep learning model to recognize keywords in product catalog image at E-commerce (온라인 쇼핑몰에서 상품 설명 이미지 내의 키워드 인식을 위한 딥러닝 훈련 데이터 자동 생성 방안)

  • Kim, Kitae;Oh, Wonseok;Lim, Geunwon;Cha, Eunwoo;Shin, Minyoung;Kim, Jongwoo
    • Journal of Intelligence and Information Systems / v.24 no.1 / pp.1-23 / 2018
  • Since the start of the 21st century, various high-quality services have emerged along with the growth of the Internet and information and communication technologies. In particular, the e-commerce industry, in which Amazon and eBay stand out, has been growing explosively. As e-commerce grows, customers can easily find and compare what they want to buy because more products are registered at online shopping malls. However, a problem has arisen with this growth: with so many products registered, it has become difficult for customers to find what they really need in the flood of products. When customers search for a desired product with a general keyword, too many products come up as results; on the contrary, few products are retrieved when customers type in product details, because concrete product attributes are rarely registered. In this situation, automatically recognizing the text in images can be a solution. Because the bulk of product details is written in catalogs in image format, most product information cannot be found with text inputs in the current text-based search systems. If the information in images can be converted to text, customers can search for products by product details, which makes shopping more convenient. There are various existing OCR (Optical Character Recognition) programs that can recognize text in images, but they are hard to apply to catalogs because they fail under certain circumstances, such as when the text is not large enough or the fonts are inconsistent. Therefore, this research suggests a way to recognize keywords in catalogs with deep learning, which has been the state of the art in image recognition since the 2010s. The Single Shot MultiBox Detector (SSD), a well-regarded object detection model, can be used with its structure redesigned to account for the differences between text and ordinary objects. However, the SSD model needs a large amount of labeled training data, because, like other deep learning models, it is trained by supervised learning. To collect such data, one could manually label the location and class of each piece of text in catalogs, but manual collection raises many problems: some keywords would be missed because humans make mistakes while labeling, and collecting the data becomes too time-consuming given the scale required, or too costly if many workers are hired to shorten the time. Furthermore, if specific keywords need to be trained, finding images that contain those words is also difficult. To solve this data issue, this research developed a program that creates training data automatically. The program generates catalog-like images containing various keywords and pictures and saves the location information of the keywords at the same time. With this program, not only can data be collected efficiently, but the performance of the SSD model also improves. The SSD model recorded a recognition rate of 81.99% with 20,000 training samples created by the program. Moreover, this research tested the efficiency of the SSD model under different data settings to analyze which features of the data influence the performance of recognizing text in images.
As a result, it was found that the number of labeled keywords, the addition of overlapping keyword labels, the existence of unlabeled keywords, the spacing between keywords and the variety of background images are all related to the performance of the SSD model. These tests can guide performance improvements of the SSD model, or of other deep-learning-based text recognizers, through higher-quality data. The SSD model redesigned to recognize text in images and the program developed for creating training data are expected to contribute to improving search systems in e-commerce: suppliers can spend less time registering keywords for products, and customers can search for products by the product details written in catalogs.
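
A minimal sketch of the kind of automatic training-data generation described above is shown below; the keyword list, font file and output paths are hypothetical, and a real generator would also vary fonts, sizes and background pictures.

```python
# Hedged sketch: generate catalog-like images with rendered keywords and save
# their bounding boxes as SSD-style location labels. Paths and keywords are
# hypothetical placeholders.
import json
import random
from PIL import Image, ImageDraw, ImageFont

KEYWORDS = ["cotton", "waterproof", "free shipping"]      # hypothetical keyword list
FONT = ImageFont.truetype("NanumGothic.ttf", 28)          # any available .ttf file

def make_sample(idx, size=(600, 800)):
    img = Image.new("RGB", size, "white")                 # plain catalog-like background
    draw = ImageDraw.Draw(img)
    boxes = []
    for word in random.sample(KEYWORDS, k=2):
        x, y = random.randint(20, 350), random.randint(20, 740)
        draw.text((x, y), word, fill="black", font=FONT)
        left, top, right, bottom = draw.textbbox((x, y), word, font=FONT)
        boxes.append({"label": word, "bbox": [left, top, right, bottom]})
    img.save(f"train_{idx}.png")                          # image for the SSD model
    with open(f"train_{idx}.json", "w") as f:             # location info saved alongside
        json.dump(boxes, f)

for i in range(5):
    make_sample(i)
```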

Business Application of Convolutional Neural Networks for Apparel Classification Using Runway Image (합성곱 신경망의 비지니스 응용: 런웨이 이미지를 사용한 의류 분류를 중심으로)

  • Seo, Yian;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems / v.24 no.3 / pp.1-19 / 2018
  • A large amount of data is now available for research and business sectors to extract knowledge from. This data can take the form of unstructured data such as audio, text, and images, and can be analyzed with deep learning methodology. Deep learning is now widely used for various estimation, classification, and prediction problems. In particular, the fashion business adopts deep learning techniques for apparel recognition, apparel search and retrieval engines, and automatic product recommendation. The core model of these applications is image classification using Convolutional Neural Networks (CNN). A CNN is made up of neurons that learn parameters such as weights as inputs pass through the network toward the outputs. A CNN has a layered structure well suited for image classification, comprising convolutional layers for generating feature maps, pooling layers for reducing the dimensionality of feature maps, and fully connected layers for classifying the extracted features. However, most classification models have been trained on online product images, which are taken under controlled conditions, such as the apparel item by itself or a professional model wearing the apparel. Such images may not be effective for training a classification model when one wants to classify street fashion or walking images, which are taken in uncontrolled situations and involve people's movement and unexpected poses. Therefore, we propose to train the model with a runway apparel image dataset that captures mobility. This allows the classification model to be trained with far more variable data and enhances adaptation to diverse query images. To achieve both convergence and generalization of the model, we apply Transfer Learning to our training network. As Transfer Learning in CNNs consists of pre-training and fine-tuning stages, we divide the training step into two. First, we pre-train our architecture with a large-scale dataset, the ImageNet dataset, which consists of 1.2 million images in 1,000 categories including animals, plants, activities, materials, instrumentations, scenes, and foods. We use GoogLeNet as our main architecture as it achieved great accuracy with efficiency in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC). Second, we fine-tune the network with our own runway image dataset. For the runway image dataset, we could not find any previously published public dataset, so we collected 2,426 images of 32 major fashion brands from Google Image Search, including Anna Molinari, Balenciaga, Balmain, Brioni, Burberry, Celine, Chanel, Chloe, Christian Dior, Cividini, Dolce and Gabbana, Emilio Pucci, Ermenegildo, Fendi, Giuliana Teso, Gucci, Issey Miyake, Kenzo, Leonard, Louis Vuitton, Marc Jacobs, Marni, Max Mara, Missoni, Moschino, Ralph Lauren, Roberto Cavalli, Sonia Rykiel, Stella McCartney, Valentino, Versace, and Yves Saint Laurent. We perform 10-fold experiments to account for the random generation of training data, and our proposed model achieved an accuracy of 67.2% on the final test. Our research offers several advantages over previous related studies: to the best of our knowledge, no previous study has trained a network for apparel image classification on a runway image dataset. We suggest the idea of training the model with images capturing all possible postures, which we denote as mobility, by using our own runway apparel image dataset.
Moreover, by applying Transfer Learning and using the checkpoints and parameters provided by TensorFlow-Slim, we could reduce the time spent training the classification model to about 6 minutes per experiment. This model can be used in many business applications where the query image may be a runway image, a product image, or a street fashion image. To be specific, runway query images can be used in mobile application services during fashion week to facilitate brand search, street style query images can be classified during fashion editorial tasks to label the brand or style, and website query images can be processed by e-commerce multi-complex services that provide item information or recommend similar items.
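
The pre-train/fine-tune scheme can be sketched roughly as follows. The paper fine-tunes GoogLeNet with TensorFlow-Slim checkpoints; here torchvision's ImageNet-pretrained GoogLeNet is used as an illustrative stand-in, and the runway_data/ folder layout is a hypothetical assumption.

```python
# Hedged sketch of transfer learning for 32 brand classes. torchvision's GoogLeNet
# stands in for the TF-Slim checkpoint used in the paper; "runway_data/train"
# (one sub-folder per brand) is a hypothetical dataset layout.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

NUM_BRANDS = 32

# Stage 1 (pre-training on ImageNet) is replaced by loading pretrained weights.
model = models.googlenet(weights="IMAGENET1K_V1")
model.fc = nn.Linear(model.fc.in_features, NUM_BRANDS)   # new classification head

# Stage 2: fine-tune on the runway apparel images.
tfm = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])
train_set = datasets.ImageFolder("runway_data/train", transform=tfm)
loader = DataLoader(train_set, batch_size=32, shuffle=True)

optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
criterion = nn.CrossEntropyLoss()

model.train()
for images, labels in loader:
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
```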

The Content of Minerals and Vitamins in Commercial Beverages and Liquid Teas (유통음료 및 액상차 중의 비타민과 미네랄 함량)

  • Shin, Young;Kim, Sung-Dan;Kim, Bog-Soon;Yun, Eun-Sun;Chang, Min-Su;Jung, Sun-Ok;Lee, Yong-Cheol;Kim, Jung-Hun;Chae, Young-Zoo
    • Journal of Food Hygiene and Safety / v.26 no.4 / pp.322-329 / 2011
  • This study was conducted to analyze the contents of minerals and vitamins, to compare the measured values with the values declared on the food labels, and to investigate the ratio of measured to labeled values in 437 specimens of mineral- and vitamin-fortified commercial beverages and liquid teas. The calcium and sodium contents of samples after microwave digestion were analyzed with ICP-OES (inductively coupled plasma optical emission spectrometry), and vitamins were determined by HPLC (high-performance liquid chromatography). The measured values of calcium ranged from 80.3% to 142.6% of the labeled values in 21 calcium-fortified commercial beverages and liquid teas. In the case of sodium, the measured values were 33.9~48.5% of the labeled values in 21 sports beverages. The measured values of vitamin C, vitamin B2 and niacin ranged over 99.7~2003.6%, 81.1~336.7% and 90.7~393.2% of the labeled values in 57, 12 and 11 vitamin-fortified commercial beverages and liquid teas, respectively. To support accurate nutrition labeling, there should be programs and initiatives that give food manufacturers a better understanding of, and guidance on, food labeling and nutrition.
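
The percentages reported above are, presumably, the simple per-sample ratio of measured to labeled content:

```latex
\text{ratio (\%)} = \frac{C_{\text{measured}}}{C_{\text{labeled}}} \times 100 .
```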

Query-based Answer Extraction using Korean Dependency Parsing (의존 구문 분석을 이용한 질의 기반 정답 추출)

  • Lee, Dokyoung;Kim, Mintae;Kim, Wooju
    • Journal of Intelligence and Information Systems / v.25 no.3 / pp.161-177 / 2019
  • In this paper, we study how the answer extraction performance of a question-answering system can be improved by using sentence dependency parsing results. A question-answering (QA) system consists of query analysis, which analyzes the user's query, and answer extraction, which extracts appropriate answers from documents, and various studies have been conducted on both components. In order to improve the performance of answer extraction, the grammatical information of sentences must be reflected accurately. Because Korean has free word order and frequent omission of sentence components, dependency parsing is a good way to analyze Korean syntax. Therefore, in this study, we improved answer extraction performance by adding features generated from dependency parsing to the inputs of the answer extraction model (a bidirectional LSTM-CRF). We compared the performance of the answer extraction model when given only basic word features generated without dependency parsing with its performance when the Eojeol tag feature and the dependency graph embedding feature were added. Since dependency parsing is performed on the Eojeol, the basic unit of a Korean sentence delimited by spaces, the tag information of each Eojeol can be obtained from the parsing result; the Eojeol tag feature is this tag information. Generating the dependency graph embedding consists of building the dependency graph from the parsing result and then learning an embedding of that graph. From the dependency parsing result, a graph is generated by mapping each Eojeol to a node, each dependency between Eojeols to an edge, and each Eojeol tag to a node label. In this process, an undirected or a directed graph is generated depending on whether the direction of the dependency relation is considered. To obtain the embedding of the graph, we used Graph2Vec, a method that finds the embedding of a graph from the subgraphs constituting it. The maximum path length between nodes can be specified when extracting the subgraphs of a graph: if the maximum path length is 1, the graph embedding reflects only direct dependencies between Eojeols, and as the maximum path length grows, the embedding also includes indirect dependencies. In the experiments, the maximum path length between nodes is varied from 1 to 3, with and without considering the direction of the dependency relation, and the answer extraction performance is measured. The experimental results show that both the Eojeol tag feature and the dependency graph embedding feature improve the performance of answer extraction. In particular, the highest answer extraction performance was obtained when the direction of the dependency relation was considered and the dependency graph embedding was generated with a maximum path length of 1 in the subgraph extraction step of Graph2Vec. From these experiments, we concluded that it is better to take the direction of dependency into account and to consider only direct connections rather than indirect dependencies between words.
The significance of this study is as follows. First, we improved the performance of answer extraction by adding features based on dependency parsing results, taking into account the characteristics of Korean, which has free word order and frequent omission of sentence components. Second, we generated features from the dependency parsing result with a learning-based graph embedding method, without manually defining patterns of dependency between Eojeols. Future research directions are as follows. In this study, the features generated from dependency parsing are applied only to the answer extraction model. In the future, if the performance gain is confirmed by applying these features to other natural language processing models such as sentiment analysis or named entity recognition, the validity of the features can be verified more thoroughly.
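
As an illustration of the graph construction step, the sketch below builds a directed dependency graph with networkx from a hypothetical parse of a short Korean sentence; the graph-level embedding step (Graph2Vec over subgraphs up to a chosen maximum path length) is only indicated in a comment.

```python
# Hedged sketch: building a dependency graph from a parse result, as described above.
# The (index, Eojeol, head index, tag) tuples are hypothetical parser output.
import networkx as nx

# Hypothetical dependency parse of a short Korean sentence ("I read a book"):
parse = [
    (0, "나는",   2, "NP_SBJ"),   # subject depends on the verb
    (1, "책을",   2, "NP_OBJ"),   # object depends on the verb
    (2, "읽었다", -1, "VP"),      # root (-1 = no head)
]

g = nx.DiGraph()                            # directed: dependency direction is kept
for idx, eojeol, head, tag in parse:
    g.add_node(idx, label=tag)              # the Eojeol tag becomes the node label
for idx, _, head, _ in parse:
    if head != -1:
        g.add_edge(idx, head)               # edge from dependent Eojeol to its head

# A graph-level embedding (e.g., Graph2Vec over subgraphs with maximum path length 1)
# would then be concatenated to the Bi-LSTM-CRF input features for answer extraction.
print(g.nodes(data=True), list(g.edges()))
```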

A Study of Anomaly Detection for ICT Infrastructure using Conditional Multimodal Autoencoder (ICT 인프라 이상탐지를 위한 조건부 멀티모달 오토인코더에 관한 연구)

  • Shin, Byungjin;Lee, Jonghoon;Han, Sangjin;Park, Choong-Shik
    • Journal of Intelligence and Information Systems / v.27 no.3 / pp.57-73 / 2021
  • Maintaining ICT infrastructure and preventing failures through anomaly detection is becoming increasingly important. System monitoring data is multidimensional time series data, and when dealing with such data it is difficult to consider the characteristics of multidimensional data and of time series data at the same time. When dealing with multidimensional data, the correlation between variables should be considered; existing probability-based, linear and distance-based methods degrade because of the curse of dimensionality. In addition, time series data is usually preprocessed with sliding windows and time series decomposition for autocorrelation analysis, and these techniques further increase the dimensionality of the data, so they need to be supplemented. Anomaly detection is an old research field; statistical methods and regression analysis were used in the early days, and there are now active studies applying machine learning and artificial neural networks. Statistically based methods are difficult to apply when data is non-homogeneous and do not detect local outliers well. Regression-based methods learn a regression model based on parametric statistics and detect anomalies by comparing predicted and actual values; their performance drops when the model is not solid or the data contain noise or outliers, and they are constrained by the need for training data free of noise and outliers. An autoencoder, built on artificial neural networks, is trained to reproduce its input as closely as possible at its output. It has many advantages over existing probability-based and linear models, cluster analysis, and supervised learning: it can be applied to data that does not satisfy a probability distribution or a linearity assumption, and it can be trained in an unsupervised manner, without labeled data. However, it is still limited in identifying local outliers in multidimensional data, and the dimensionality of the data is greatly increased by the characteristics of time series data. In this study, we propose a Conditional Multimodal Autoencoder (CMAE) that enhances anomaly detection performance by considering local outliers and time series characteristics. First, we applied a Multimodal Autoencoder (MAE) to mitigate the limitations of local outlier identification in multidimensional data. Multimodal architectures are commonly used to learn different types of inputs, such as voice and images; the different modalities share the autoencoder's bottleneck and thereby learn correlations across modalities. In addition, a Conditional Autoencoder (CAE) was used to learn the characteristics of time series data effectively without increasing the dimensionality of the data. Conditional inputs usually take categorical variables, but in this study time was used as the condition so that periodicity could be learned. The proposed CMAE model was verified by comparison with a Unimodal Autoencoder (UAE) and a Multimodal Autoencoder (MAE). The reconstruction performance over 41 variables was examined for the proposed model and the comparison models. Reconstruction performance differs by variable; the Memory, Disk, and Network modalities are reconstructed well, with small loss values, in all three autoencoder models.
The Process modality did not show a significant difference across the three models, and the CPU modality showed excellent performance in the CMAE. ROC curves were prepared to evaluate the anomaly detection performance of the proposed and comparison models, and AUC, accuracy, precision, recall, and F1-score were compared. On all indicators, performance ranked in the order of CMAE, MAE, and UAE. In particular, the recall was 0.9828 for the CMAE, confirming that it detects nearly all of the anomalies. The accuracy of the model also improved to 87.12%, and the F1-score was 0.8883, which is considered suitable for anomaly detection. From a practical standpoint, the proposed model has further advantages beyond the performance improvement: techniques such as time series decomposition and sliding windows add procedures that must be managed, and the resulting increase in dimensionality can slow down inference, whereas the proposed model's inference speed and ease of model management make it easy to apply to practical tasks.
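
A minimal sketch of a conditional multimodal autoencoder in the spirit described above is given below, using tf.keras; the modality split of the 41 variables, the layer widths and the hour-of-day condition are illustrative assumptions, not the paper's exact architecture.

```python
# Hedged sketch of a conditional multimodal autoencoder (CMAE): per-modality
# encoders and decoders share one bottleneck, with a time condition appended.
import tensorflow as tf
from tensorflow.keras import layers, Model

# One input branch per modality (e.g., CPU, memory, disk, network metrics).
modal_dims = {"cpu": 10, "memory": 10, "disk": 11, "network": 10}   # 41 variables total
inputs, encoded = [], []
for name, dim in modal_dims.items():
    x_in = layers.Input(shape=(dim,), name=f"{name}_in")
    inputs.append(x_in)
    encoded.append(layers.Dense(8, activation="relu")(x_in))        # per-modal encoder

cond = layers.Input(shape=(24,), name="hour_onehot")                # time as the condition
inputs.append(cond)

# Shared bottleneck: all modalities plus the condition meet here.
z = layers.Dense(8, activation="relu")(layers.Concatenate()(encoded + [cond]))

# Per-modal decoders reconstruct each modality from the shared code and condition.
outputs = []
for name, dim in modal_dims.items():
    h = layers.Dense(8, activation="relu")(layers.Concatenate()([z, cond]))
    outputs.append(layers.Dense(dim, name=f"{name}_out")(h))

cmae = Model(inputs, outputs)
cmae.compile(optimizer="adam", loss="mse")   # anomaly score = reconstruction error
```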

A Proposal of a Keyword Extraction System for Detecting Social Issues (사회문제 해결형 기술수요 발굴을 위한 키워드 추출 시스템 제안)

  • Jeong, Dami;Kim, Jaeseok;Kim, Gi-Nam;Heo, Jong-Uk;On, Byung-Won;Kang, Mijung
    • Journal of Intelligence and Information Systems / v.19 no.3 / pp.1-23 / 2013
  • To discover significant social issues such as unemployment, economic crisis and social welfare, which are urgent problems to be solved in modern society, the existing approach has researchers collect opinions from experts and scholars through online or offline surveys. However, such a method is not always effective. Because of the expense involved, a large number of survey replies is seldom gathered, and in some cases it is hard to find experts dealing with specific social issues. Thus, the sample set is often small and may be biased. Furthermore, regarding a given social issue, several experts may reach totally different conclusions because each expert has a subjective point of view and a different background. In this case, it is considerably hard to figure out what the current social issues are and which of them are really important. To surmount the shortcomings of the current approach, in this paper we develop a prototype system that semi-automatically detects social issue keywords, representing social issues and problems, from about 1.3 million news articles issued by about 10 major domestic presses in Korea from June 2009 to July 2012. Our proposed system consists of (1) collecting and extracting texts from the collected news articles, (2) identifying only the news articles related to social issues, (3) analyzing the lexical items of Korean sentences, (4) finding a set of topics regarding social keywords over time based on probabilistic topic modeling, (5) matching relevant paragraphs to a given topic, and (6) visualizing social keywords for easy understanding. In particular, we propose a novel matching algorithm relying on generative models. The goal of our matching algorithm is to best match paragraphs to each topic. Technically, using a topic model such as Latent Dirichlet Allocation (LDA), we can obtain a set of topics, each of which has relevant terms and their probability values. In our problem, given a set of text documents (e.g., news articles), LDA produces a set of topic clusters, and each topic cluster is then labeled by human annotators, where each topic label stands for a social keyword. For example, suppose there is a topic (e.g., Topic1 = {(unemployment, 0.4), (layoff, 0.3), (business, 0.3)}) and a human annotator labels Topic1 "Unemployment Problem". In this example, it is non-trivial to understand what actually happened regarding the unemployment problem in our society; in other words, looking only at social keywords, we have no idea of the detailed events occurring in society. To tackle this, we develop a matching algorithm that computes the probability value of a paragraph given a topic, relying on (i) the topic terms and (ii) their probability values. For instance, given a set of text documents, we segment each document into paragraphs; meanwhile, using LDA, we extract a set of topics from the documents. Based on our matching process, each paragraph is assigned to the topic it best matches, so that each topic ends up with several best-matched paragraphs. Furthermore, suppose there is a topic (e.g., Unemployment Problem) and its best-matched paragraph (e.g., "Up to 300 workers lost their jobs at XXX company in Seoul"). In this case, we can grasp the detailed information behind the social keyword, such as "300 workers", "unemployment", "XXX company", and "Seoul". In addition, our system visualizes social keywords over time.
Therefore, through our matching process and keyword visualization, researchers will be able to detect social issues easily and quickly. With this prototype system, we have detected various social issues appearing in our society and have shown the effectiveness of our proposed methods through experimental results. Note that our proof-of-concept system is available at http://dslab.snu.ac.kr/demo.html.
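
The topic-and-matching idea can be illustrated with a small gensim sketch; the toy paragraphs are hypothetical, and picking the highest-probability topic here is only a rough stand-in for the paper's generative matching algorithm.

```python
# Hedged sketch: fit LDA topics and assign each paragraph to its best-matching topic.
# The paragraphs below are hypothetical examples, not data from the study.
from gensim import corpora
from gensim.models import LdaModel

paragraphs = [
    "workers lost their jobs as the company announced layoffs",
    "the government expanded welfare spending for elderly care",
    "unemployment benefits were extended amid the economic crisis",
]
tokens = [p.lower().split() for p in paragraphs]

dictionary = corpora.Dictionary(tokens)
corpus = [dictionary.doc2bow(t) for t in tokens]

lda = LdaModel(corpus, num_topics=2, id2word=dictionary, random_state=0, passes=10)

# Assign each paragraph to the topic with the highest probability value.
for text, bow in zip(paragraphs, corpus):
    topic_id, prob = max(lda.get_document_topics(bow), key=lambda tp: tp[1])
    print(f"topic {topic_id} ({prob:.2f}): {text}")
```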

The actual aspects of North Korea's 1950s Changgeuk through the Chunhyangjeon in the film Moranbong(1958) and the album Corée Moranbong(1960) (영화 <모란봉>(1958)과 음반 (1960) 수록 <춘향전>을 통해 본 1950년대 북한 창극의 실제적 양상)

  • Song, Mi-Kyoung
    • (The) Research of the performance art and culture / no.43 / pp.5-46 / 2021
  • The film Moranbong is the product of a 1958 trip to North Korea by Armand Gatti, Chris Marker, Claude Lanzmann, Francis Lemarque and Jean-Claude Bonnardot at the invitation of Joseon Film. For political reasons, however, the film was not released at the time, and it was not until 2010 that it was rediscovered and received attention. The movie consists of the narratives of Young-ran and Dong-il, set during the Korean War, which are folded into the narratives of Chunhyang and Mongryong from the Joseon classic Chunhyangjeon. The Joseon classic is reproduced in the form of the Changgeuk Chunhyangjeon, which shares its time frame with the two main characters, and the two narratives unfold over a total of six scenes. The movie has two layers of story-within-a-story frames: with the same narrative set in North Korea in the 1950s, there is the story of the producers and actors of the Changgeuk Chunhyangjeon, and there is the Changgeuk Chunhyangjeon itself as a complete work. In the outermost frame of the movie, Dong-il is the main character, but in the inner double frame the center is Young-ran, who is both an actress growing with the Changgeuk Chunhyangjeon and a character within it. There are three related OST albums: Corée Moranbong, released in France in 1960; Musique de Corée, released in 1970; and 朝鮮の伝統音樂-唱劇「春香伝」と伝統樂器-, released in Japan in 1968. While Corée Moranbong consists only of the music from the film Moranbong, the two later albums included additional songs collected and recorded by the Pyongyang National Broadcasting System. However, there is no information about the movie Moranbong on the album released in Japan; it is highly likely that the author of the liner notes or music commentary did not confirm the existence of the movie, or intentionally excluded related content because of the ban on the film's release. The results of analyzing the detailed scenes of the Changgeuk Chunhyangjeon (Farewell Song, Sipjang-ga, Chundangsigwa, Bakseokti and Prison Song) in the 1950s film Moranbong and the OST album are as follows. First, the process by which the North Korean Changgeuk Chunhyangjeon was established in the 1950s was confirmed. The play, compiled in 1955 in the Joseon Changgeuk Collection, was settled between 1956 and 1958 into a form of Changgeuk that could be performed in the late 1950s. Since the 1960s, Chunhyangjeon has no longer been performed as a traditional pansori-style Changgeuk, so the film Moranbong and the album Corée Moranbong are almost the last records capturing the Changgeuk Chunhyangjeon and its music.
Second, the Changgeuk actors' responses to the controversy over Takseong (the rough, "turbid" vocal timbre of pansori) in the North Korean Changgeuk community of the 1950s were confirmed. Until 1959, there were both voices of criticism surrounding Takseong and voices of advocacy holding that it was a national characteristic. Shin Woo-sun consistently removed Takseong with a clear, high-pitched vocal sound; Gong Gi-nam chose Takseong and, depending on the situation, did not actively remove it; Cho Sang-sun and Lim So-hyang accepted some of the vocalization required of them while trying to maintain their own tones. Although Cho Sang-sun and Lim So-hyang were guaranteed roles that allowed them to continue their sounds, the selection and exclusion patterns in the movie Moranbong were linked to the Takseong-removal guidelines demanded, in the name of the Party and the People, of the Gugak musicians who had crossed to North Korea in the 1950s.