• Title/Summary/Keyword: Generalized


The way to make training data for deep learning model to recognize keywords in product catalog image at E-commerce (온라인 쇼핑몰에서 상품 설명 이미지 내의 키워드 인식을 위한 딥러닝 훈련 데이터 자동 생성 방안)

  • Kim, Kitae;Oh, Wonseok;Lim, Geunwon;Cha, Eunwoo;Shin, Minyoung;Kim, Jongwoo
    • Journal of Intelligence and Information Systems / v.24 no.1 / pp.1-23 / 2018
  • Since the beginning of the 21st century, various high-quality services have emerged with the growth of the internet and information and communication technologies. In particular, the E-commerce industry, in which Amazon and eBay stand out, has grown explosively. As E-commerce grows, customers can easily find and compare what they want to buy because more products are registered at online shopping malls. However, this growth has also created a problem: with so many products registered, it has become difficult for customers to find what they really need in the flood of products. When customers search for desired products with a generalized keyword, too many products come up as a result; on the contrary, few products are found when customers type in product details, because concrete product attributes are rarely registered as text. In this situation, automatically recognizing the text in images can be a solution. Because the bulk of product details are written in catalogs stored as images, most product information cannot be retrieved by the current text-based search systems. If the information in these images can be converted to text, customers can search for products by product details, which makes shopping more convenient. Various existing OCR (Optical Character Recognition) programs can recognize text in images, but they are hard to apply to catalogs because they fail in certain circumstances, for example when the text is not big enough or the fonts are inconsistent. Therefore, this research suggests a way to recognize keywords in catalogs with deep learning, the state of the art in image recognition since the 2010s. Single Shot Multibox Detector (SSD), a model credited for its object-detection performance, can be used with its structure re-designed to take into account the differences between text and generic objects. However, because deep learning algorithms are trained by supervised learning, the SSD model needs a large amount of labeled training data. One option is to label the location and class of the texts in catalogs manually, but manual collection raises several problems: keywords can be missed because humans make mistakes while labeling, collection is too time-consuming given the scale of data needed, and hiring many workers to shorten the time is costly. Furthermore, if specific keywords need to be trained, it is also difficult to find images that contain those words. To solve this data issue, this research developed a program that creates training data automatically. The program generates catalog-like images containing various keywords and pictures and saves the location information of the keywords at the same time. With this program, not only can data be collected efficiently, but the performance of the SSD model also improves. The SSD model recorded a recognition rate of 81.99% with 20,000 images created by the program. Moreover, this research tested the SSD model under different data conditions to analyze which features of the data influence the performance of recognizing text in images. As a result, it was found that the number of labeled keywords, the addition of overlapping keyword labels, the presence of unlabeled keywords, the spacing among keywords, and the differences in background images are related to the performance of the SSD model. This analysis can lead to performance improvements of the SSD model, or of other deep-learning-based text recognizers, through high-quality data. The SSD model re-designed to recognize text in images and the program developed for creating training data are expected to contribute to improving search systems in E-commerce: suppliers can spend less time registering keywords for products, and customers can search for products using the product details written in catalogs.
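Where the abstract describes automatically generating labeled training data (rendering keywords onto catalog-like background images while saving their locations), a minimal sketch of that idea could look like the following. The keyword list, font paths, background files, and label format are assumptions for illustration, not the authors' actual program.

```python
# Minimal sketch of automatic training-data generation for a text-detection
# model such as SSD: paste keywords onto catalog-like backgrounds and record
# their bounding boxes as labels. File paths, keyword list, and label format
# are illustrative assumptions, not the authors' actual program.
import random
from PIL import Image, ImageDraw, ImageFont

KEYWORDS = ["cotton", "waterproof", "free size"]   # assumed example keywords
FONT_PATHS = ["arial.ttf"]                         # assumed available font files
BACKGROUNDS = ["background1.jpg"]                  # assumed background images

def make_sample(out_img, out_label, canvas_size=(512, 512), n_words=5):
    bg = Image.open(random.choice(BACKGROUNDS)).resize(canvas_size)
    draw = ImageDraw.Draw(bg)
    labels = []
    for _ in range(n_words):
        word = random.choice(KEYWORDS)
        font = ImageFont.truetype(random.choice(FONT_PATHS), random.randint(14, 40))
        x = random.randint(0, canvas_size[0] - 100)
        y = random.randint(0, canvas_size[1] - 50)
        draw.text((x, y), word, font=font, fill=(0, 0, 0))
        # the bounding box of the rendered word becomes its location label
        x0, y0, x1, y1 = draw.textbbox((x, y), word, font=font)
        labels.append(f"{word} {x0} {y0} {x1} {y1}")
    bg.save(out_img)
    with open(out_label, "w") as f:
        f.write("\n".join(labels))

# the paper reports training with 20,000 generated images
for i in range(20000):
    make_sample(f"train_{i}.jpg", f"train_{i}.txt")
```

Labels produced this way (word class plus box coordinates) are exactly the kind of supervision an SSD-style detector needs, which is why synthetic generation removes the manual-labeling bottleneck described above.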

Clinicopathologic features of Acute Interstitial Pneumonia (급성 간질성 폐렴의 임상적 고찰)

  • Shim, Jae-Jeong;Park, Sang-Muyn;Lee, Sang-Hwa;Lee, Jin-Gu;Cho, Jae-Yun;Song, Gwan-Gyu;In, Kwang-Ho;Yoo, Se-Hwa;Kang, Kyung-Ho
    • Tuberculosis and Respiratory Diseases / v.42 no.1 / pp.58-66 / 1995
  • Background: Acute interstitial pneumonia is a relatively rare form of interstitial pneumonia, since the vast majority of interstitial pneumonias have a more chronic course. It corresponds to the lesion described by Hamman and Rich as Hamman-Rich disease in 1944. Other names in the clinical literature are accelerated interstitial pneumonia, idiopathic acute respiratory distress syndrome (idiopathic ARDS), and the organizing stage of diffuse alveolar damage. Acute interstitial pneumonia differs from chronic interstitial pneumonia in its clinical and pathologic features. Clinically, this disease is characterized by a sudden onset, a rapid course, and reversible disease. Method and Purpose: Five cases of pathologically proven acute interstitial pneumonia were retrospectively studied to define the clinical, radiologic, and pathologic features. Results: 1) The five cases ranged in age from 31 to 77 years. The onset of illness was acute in all patients; it began with a viral-like prodrome 6~40 days prior to shortness of breath, and respiratory failure eventually developed in all patients. In 2 cases, a generalized skin rash accompanied the flu-like symptoms. An etiologic agent could not be identified in any case. 2) All patients had leukocytosis and severe hypoxemia. Pulmonary function tests in the 3 available cases showed a restrictive ventilatory defect, and the one surviving patient (case 5) had complete recovery of pulmonary function after discharge. 3) Diffuse bilateral chest infiltrates were present radiologically. These were ground-glass, consolidation, and reticular densities without honeycomb fibrosis in all patients. The pathologic abnormalities were increased numbers of macrophages and the formation of hyaline membranes within alveolar spaces. There was also interstitial thickening with edema, proliferation of immature fibroblasts, and hyperplasia of type II pneumocytes. In the surviving patient (case 5), the pathologic findings represented a relatively early stage of acute interstitial pneumonia, such as hyaline membranes with mild interstitial fibrosis. 4) Of the 5 patients, four died of respiratory failure 14~90 days after onset of the first symptom, and one survived with recovery of symptoms, chest X-ray findings, and pulmonary function. Conclusion: These results emphasize that acute interstitial pneumonia is a clinically, radiologically, and pathologically distinct form of interstitial pneumonia and should be separated from the group of chronic interstitial pneumonias. Further studies will be needed to evaluate the pathogenesis and treatment of acute interstitial pneumonia.

The Relationship Between DEA Model-based Eco-Efficiency and Economic Performance (DEA 모형 기반의 에코효율성과 경제적 성과의 연관성)

  • Kim, Myoung-Jong
    • Journal of Environmental Policy / v.13 no.4 / pp.3-49 / 2014
  • Growing stakeholder interest in corporate environmental responsibility and tightening environmental regulations are highlighting the importance of environmental management more than ever. However, companies' awareness of the importance of the environment still lags behind, and related academic work has not reached consistent conclusions on the relationship between environmental performance and economic performance. One of the reasons is the different ways of measuring these two performances. The evaluation scope of economic performance is relatively narrow and it can be measured in a unified unit such as price, while the scope of environmental performance is diverse and a wide range of units is used to measure it instead of a single unified unit. Therefore, results can differ depending on the performance indicators selected. To resolve this problem, generalized and standardized performance indicators should be developed. In particular, the indicators should cover the concepts of both environmental and economic performance, because the recent idea of environmental management has expanded to encompass sustainability. Another reason is that most current research tends to focus on the motives for environmental investment and on environmental performance, and does not offer a guideline for an effective implementation strategy for environmental management. For example, a process improvement strategy or a market differentiation strategy can be deployed by comparing environmental competitiveness among companies in the same or similar industries, so that a virtuous cyclical relationship between environmental and economic performance can be secured. A novel method for measuring eco-efficiency by utilizing Data Envelopment Analysis (DEA), which is able to combine multiple environmental and economic performances, is proposed in this report. Based on the eco-efficiencies, environmental competitiveness is analyzed and the optimal combination of inputs and outputs is recommended for improving the eco-efficiencies of inefficient firms. Furthermore, panel analysis is applied to the causal relationship between eco-efficiency and economic performance, and a pooled regression model is used to investigate the relationship between eco-efficiency and economic performance. The four-year eco-efficiencies of 23 companies between 2010 and 2013 are obtained from the DEA analysis; a comparison of efficiencies among the 23 companies is carried out in terms of technical efficiency (TE), pure technical efficiency (PTE) and scale efficiency (SE), and a set of recommendations for the optimal combination of inputs and outputs is suggested for the inefficient companies. Furthermore, the experimental results with the panel analysis demonstrate causality from eco-efficiency to economic performance. The results of the pooled regression show that eco-efficiency positively affects the financial performance (ROA and ROS) of the companies, as well as firm values (Tobin's Q, stock price, and stock returns). This report proposes a novel approach for generating standardized performance indicators from multiple environmental and economic performances, so that it can enhance the generality of related research and provide deep insight into the sustainability of environmental management.
Furthermore, using efficiency indicators obtained from the DEA model, the cause of change in eco-efficiency can be investigated and an effective strategy for environmental management can be suggested. Finally, this report can be a motive for environmental management by providing empirical evidence that environmental investments can improve economic performance.
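As a rough illustration of how DEA-based eco-efficiency scores of the kind described above can be computed, the following sketch solves the input-oriented CCR envelopment problem for each decision-making unit with a generic LP solver. The input and output matrices are made-up placeholders; the report's actual inputs, outputs, and firm data are not reproduced here.

```python
# Minimal sketch of input-oriented CCR DEA efficiency scores, assuming made-up
# inputs (e.g., energy use, emissions) and outputs (e.g., sales). This
# illustrates the DEA idea only, not the report's actual model or data.
import numpy as np
from scipy.optimize import linprog

X = np.array([[4.0, 2.0], [6.0, 3.0], [5.0, 5.0]])   # inputs, one row per firm
Y = np.array([[10.0], [12.0], [11.0]])                # outputs, one row per firm

def ccr_efficiency(X, Y, o):
    """Efficiency of firm o: min theta s.t. sum_j lam_j*x_j <= theta*x_o, sum_j lam_j*y_j >= y_o."""
    n, m = X.shape
    s = Y.shape[1]
    # decision variables: [theta, lambda_1, ..., lambda_n]
    c = np.r_[1.0, np.zeros(n)]
    A_ub, b_ub = [], []
    for i in range(m):      # sum_j lambda_j * x_ij - theta * x_io <= 0
        A_ub.append(np.r_[-X[o, i], X[:, i]])
        b_ub.append(0.0)
    for r in range(s):      # -sum_j lambda_j * y_rj <= -y_ro
        A_ub.append(np.r_[0.0, -Y[:, r]])
        b_ub.append(-Y[o, r])
    res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub), bounds=(0, None))
    return res.fun          # theta in (0, 1]; 1 means the firm is efficient

for o in range(len(X)):
    print(f"firm {o}: eco-efficiency = {ccr_efficiency(X, Y, o):.3f}")
```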

An Analysis on Factors Affecting Local Control and Survival in Nasopharyngeal Carcinoma (비인두암의 국소 종양 치유와 생존율에 관한 예후 인자 분석)

  • Chung Woong-Ki;Cho Jae-Shik;Park Seung Jin;Lee Jae-Hong;Ahn Sung Ja;Nam Taek Keun;Choi Chan;Noh Young Hee;Nah Byung Sik
    • Radiation Oncology Journal / v.17 no.2 / pp.91-99 / 1999
  • Purpose: This study was performed to find out the prognostic factors affecting local control, survival and disease-free survival rates in nasopharyngeal carcinoma treated with chemotherapy and radiation therapy. Materials and Methods: We retrospectively analysed 47 patients with histologically confirmed nasopharyngeal carcinoma treated at Chonnam University Hospital between July 1986 and June 1996. Patients' ages ranged from 16 to 80 years (median: 52 years). Thirty-three patients (70%) were male. Histological types were 3 (6%) keratinizing and 30 (64%) nonkeratinizing squamous cell carcinoma and 13 (28%) undifferentiated carcinoma; the histological type was not known in 1 patient (2%). We restaged according to the 1997 American Joint Committee on Cancer staging system. The forty-seven patients were recorded as follows: T1: 11 (23%), T2a: 6 (13%), T2b: 9 (19%), T3: 7 (15%), T4: 14 (30%), and N0: 7 (15%), N1: 14 (30%), N2: 21 (45%), N3: 5 (10%). Clinical stage was grouped as follows: Stage I: 2 (4%), IIA: 2 (4%), IIB: 10 (21%), III: 14 (30%), IVA: 14 (30%) and IVB: 5 (11%). Radiation therapy was done using 6 MV and 10 MV X-rays of a linear accelerator. Electron beams were used for the lymph nodes of the posterior neck after 4500 cGy. The total radiation dose delivered to the primary tumor ranged from 6120 to 7920 cGy (median: 7020 cGy). Neoadjuvant chemotherapy was performed with cisplatin + 5-fluorouracil (25 patients) or cisplatin + pepleomycin (17 patients) for one to three cycles. Five patients did not receive chemotherapy. Local control, survival and disease-free survival rates were calculated by the Kaplan-Meier method. The generalized Wilcoxon test was used to evaluate differences in survival rates between groups, and multivariate analysis using the Cox proportional hazards model was done to find prognostic factors. Results: The local control rate was 81% at 5 years. The five-year survival rate was 60% (median survival: 100 months). We included age, sex, cranial nerve deficit, histologic type, stage group, chemotherapy, elapsed days between chemotherapy and radiotherapy, total radiation dose, and period of radiotherapy as potential prognostic factors in the multivariate analysis. As a result, cranial nerve deficit (P=0.004) had statistical significance for the local control rate. Stage group and total radiation dose were significant prognostic factors for survival (P=0.000, P=0.012) and for disease-free survival (P=0.003, P=0.008), respectively. Common complications were xerostomia and tooth and ear problems. Hypothyroidism developed in 2 patients. Conclusion: In our study, cranial nerve deficit was a significant prognostic factor for local control, and stage group and total radiation dose were significant factors for both survival and disease-free survival of nasopharyngeal carcinoma. We conclude that the chemotherapy and radiotherapy used in our patients were effective without any serious complication.
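For readers unfamiliar with the survival methods named in the abstract (Kaplan-Meier estimation and the Cox proportional hazards model), a minimal sketch with the lifelines library is shown below; the tiny synthetic data frame and covariate names are assumptions, not the study's patient data.

```python
# Minimal sketch of Kaplan-Meier survival estimation and a Cox proportional
# hazards model, the methods used in the abstract above. The toy data
# (follow-up months, event indicator, stage, radiation dose) are invented.
import pandas as pd
from lifelines import KaplanMeierFitter, CoxPHFitter

df = pd.DataFrame({
    "months":   [12, 34, 100, 48, 27, 90, 15, 60],   # follow-up time
    "died":     [1,  1,  0,   1,  1,  0,  1,  0],    # 1 = event observed
    "stage":    [4,  3,  2,   4,  3,  1,  4,  2],    # illustrative stage group
    "dose_cgy": [6120, 7020, 7920, 6480, 7020, 7920, 6120, 7200],
})

# Kaplan-Meier overall survival curve
kmf = KaplanMeierFitter()
kmf.fit(df["months"], event_observed=df["died"])
print(kmf.median_survival_time_)

# Cox proportional hazards model for prognostic factors
cph = CoxPHFitter()
cph.fit(df, duration_col="months", event_col="died")
cph.print_summary()
```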

The Effects of Evaluation Attributes of Cultural Tourism Festivals on Satisfaction and Behavioral Intention (문화관광축제 방문객의 평가속성 만족과 행동의도에 관한 연구 - 2006 광주김치대축제를 중심으로 -)

  • Kim, Jung-Hoon
    • Journal of Global Scholars of Marketing Science / v.17 no.2 / pp.55-73 / 2007
  • Festivals are an indispensable feature of cultural tourism (Formica & Uysal, 1998). Cultural tourism festivals are increasingly being used as instruments for promoting tourism and boosting the regional economy, so much research related to festivals has been undertaken from a variety of perspectives. Plans to revisit a particular festival have been viewed as an important research topic both in academia and in the tourism industry, and festivals have frequently been leveraged as cultural events. Cultural tourism festivals have become a crucial component in constituting the attractiveness of tourism destinations (Prentice, 2001). As a result, a considerable number of tourist studies have been carried out on diverse cultural tourism festivals (Backman et al., 1995; Crompton & Mckay, 1997; Park, 1998; Clawson & Knetch, 1996). Much of the previous literature empirically shows the close linkage between tourist satisfaction and behavioral intention in festivals. The main objective of this study is to investigate the effects of evaluation attributes of cultural tourism festivals on satisfaction and behavioral intention. To accomplish the research objective, evaluation items of cultural tourism festivals were identified through a literature review and an empirical study. Using a varimax rotation with Kaiser normalization, the research obtained four factors from the 18 evaluation attributes of cultural tourism festivals. Some empirical studies have examined the relationship between behavioral intention and actual behavior. To understand the relationship between tourist satisfaction and behavioral intention, this study suggests five hypotheses and a hypothesized model. The analysis is based on primary data collected from visitors who participated in the '2006 Gwangju Kimchi Festival'. In total, 700 self-administered questionnaires were distributed and 561 usable questionnaires were obtained. Respondents were presented with the 18 satisfaction items on a scale from 1 (strongly disagree) to 7 (strongly agree). Dimensionality and stability of the scale were evaluated by a factor analysis with varimax rotation. Four factors emerged with eigenvalues greater than 1, which explained 66.40% of the total variance, with Cronbach's alpha ranging from 0.876 to 0.774. The four factors were named: advertisement and guides, programs, food and souvenirs, and convenient facilities. To test and estimate the hypothesized model, a two-step approach with an initial measurement model and a subsequent structural model for Structural Equation Modeling was used. The AMOS 4.0 analysis package was used to conduct the analysis, and the maximum likelihood procedure was used in estimating the model. The Chi-square test, the most common model goodness-of-fit test, was used. In addition, considering the literature on Structural Equation Modeling, this study used further model fit indexes to assess the fit of the suggested model: goodness-of-fit index (GFI) and root mean square error of approximation (RMSEA) as absolute fit indexes, and normed-fit index (NFI) and non-normed-fit index (NNFI) as incremental fit indexes. The results of t-tests and ANOVAs revealed significant differences (0.05 level); therefore H1 (tourist satisfaction levels differ across demographic traits) is supported. According to the multiple regression analysis and AMOS results, H2 (tourist satisfaction positively influences revisit intention), H3 (tourist satisfaction positively influences word of mouth), H4 (evaluation attributes of cultural tourism festivals influence tourist satisfaction), and H5 (tourist satisfaction positively influences behavioral intention) are also supported. The conclusions of this study are as follows. First, there were differences in satisfaction levels in accordance with the demographic information of visitors. Not all visitors had the same degree of satisfaction with their cultural tourism festival experience. Therefore it is necessary to understand the satisfaction of tourists if the experiences that are provided are to meet their expectations. So, in making festival plans, the organizer should consider demographic variables in explaining and segmenting visitors to cultural tourism festivals. Second, satisfaction with the evaluation attributes of cultural tourism festivals had a significant direct impact on visitors' intention to revisit such festivals and on the word-of-mouth publicity they shared. The results indicated that visitor satisfaction is a significant antecedent of intention to revisit such festivals. Festival organizers should strive to forge long-term relationships with visitors. In addition, it is also necessary to understand how the intention to revisit a festival changes over time and to identify the critical satisfaction factors. Third, it is confirmed that behavioral intention was enhanced by satisfaction. The strong link between satisfaction and behavioral intentions of visitors is ensured by high-quality advertisement and guides, programs, food and souvenirs, and convenient facilities. Thus, examining revisit intention from a time viewpoint may be of great significance for both practical and theoretical reasons. Additionally, festival organizers should give special attention to visitor satisfaction, as satisfied visitors are more likely to return sooner. The findings of this research have several practical implications for festival managers. The promotion of cultural festivals should be based on an understanding of tourist satisfaction for the long-term success of tourism, and this study can help managers carry out this task in a more informed and strategic manner by examining the effects of demographic traits on the level of tourist satisfaction and behavioral intention. In other words, differentiated marketing strategies should be stressed and executed by relevant parties. The limitations of this study are as follows: the results cannot be generalized to other cultural tourism festivals because many different kinds of festivals were not explored. A future study should be a comparative analysis of other festivals with different visitor segments. Also, further efforts should be directed toward developing more comprehensive temporal models that can explain behavioral intentions of tourists.
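Where the abstract mentions varimax-rotated factor analysis and Cronbach's alpha for the 18 satisfaction items, a compact sketch of those two computations might look like the following; the response matrix is random placeholder data, not the festival survey.

```python
# Minimal sketch of the measurement steps named in the abstract: varimax-rotated
# factor analysis of Likert items and Cronbach's alpha for one factor's items.
# Random responses stand in for the real 18-item, 561-respondent survey data.
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
responses = rng.integers(1, 8, size=(561, 18)).astype(float)  # 7-point scale

# Extract four factors with varimax rotation (requires scikit-learn >= 0.24)
fa = FactorAnalysis(n_components=4, rotation="varimax")
fa.fit(responses)
loadings = fa.components_.T          # items x factors loading matrix
print(loadings.shape)                # (18, 4)

def cronbach_alpha(items):
    """Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of total score)."""
    k = items.shape[1]
    item_var = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var / total_var)

print(cronbach_alpha(responses[:, :5]))   # alpha for a hypothetical 5-item factor
```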

A Study on Developing a VKOSPI Forecasting Model via GARCH Class Models for Intelligent Volatility Trading Systems (지능형 변동성트레이딩시스템개발을 위한 GARCH 모형을 통한 VKOSPI 예측모형 개발에 관한 연구)

  • Kim, Sun-Woong
    • Journal of Intelligence and Information Systems / v.16 no.2 / pp.19-32 / 2010
  • Volatility plays a central role in both academic and practical applications, especially in pricing financial derivative products and trading volatility strategies. This study presents a novel mechanism based on generalized autoregressive conditional heteroskedasticity (GARCH) models that is able to enhance the performance of intelligent volatility trading systems by predicting Korean stock market volatility more accurately. In particular, we embedded the concept of volatility asymmetry, documented widely in the literature, into our model. The newly developed Korean stock market volatility index of the KOSPI 200, the VKOSPI, is used as a volatility proxy. It is the price of a linear portfolio of KOSPI 200 index options and measures the effect of the expectations of dealers and option traders on stock market volatility over 30 calendar days. The KOSPI 200 index options market started in 1997 and has become the most actively traded market in the world. Its trading volume is more than 10 million contracts a day, the highest of all stock index option markets. Therefore, analyzing the VKOSPI has great importance in understanding the volatility inherent in option prices and can offer trading ideas for futures and option dealers. Use of the VKOSPI as a volatility proxy avoids statistical estimation problems associated with other measures of volatility, since the VKOSPI is the model-free expected volatility of market participants calculated directly from transacted option prices. This study estimates symmetric and asymmetric GARCH models for the KOSPI 200 index from January 2003 to December 2006 by the maximum likelihood procedure. The asymmetric GARCH models include the GJR-GARCH model of Glosten, Jagannathan and Runkle, the exponential GARCH model of Nelson, and the power autoregressive conditional heteroskedasticity (ARCH) model of Ding, Granger and Engle. The symmetric GARCH model is the basic GARCH(1,1). Tomorrow's forecasted value and change direction of stock market volatility are obtained by recursive GARCH specifications from January 2007 to December 2009 and are compared with the VKOSPI. Empirical results indicate that negative unanticipated returns increase volatility more than positive return shocks of equal magnitude decrease volatility, indicating the existence of volatility asymmetry in the Korean stock market. The point value and change direction of tomorrow's VKOSPI are estimated and forecasted by the GARCH models. A volatility trading system is developed using the forecasted change direction of the VKOSPI: if the VKOSPI is expected to rise tomorrow, a long straddle or strangle position is established; a short straddle or strangle position is taken if the VKOSPI is expected to fall tomorrow. Total profit is calculated as the cumulative sum of the VKOSPI percentage change. If the forecasted direction is correct, the absolute value of the VKOSPI percentage change is added to the trading profit; it is subtracted from the trading profit if the forecasted direction is not correct. For the in-sample period, the power ARCH model fits best on a statistical metric, Mean Squared Prediction Error (MSPE), and the exponential GARCH model shows the highest Mean Correct Prediction (MCP). The power ARCH model also fits best for the out-of-sample period and provides the highest probability for the VKOSPI change direction tomorrow. Generally, the power ARCH model shows the best fit for the VKOSPI.
All the GARCH models provide trading profits for volatility trading system and the exponential GARCH model shows the best performance, annual profit of 197.56%, during the in-sample period. The GARCH models present trading profits during the out-of-sample period except for the exponential GARCH model. During the out-of-sample period, the power ARCH model shows the largest annual trading profit of 38%. The volatility clustering and asymmetry found in this research are the reflection of volatility non-linearity. This further suggests that combining the asymmetric GARCH models and artificial neural networks can significantly enhance the performance of the suggested volatility trading system, since artificial neural networks have been shown to effectively model nonlinear relationships.
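As a sketch of how the symmetric and asymmetric GARCH specifications mentioned above can be estimated, and how the direction-based trading rule can be scored, the following uses the Python arch package on placeholder data; it is not the study's estimation code, and the return and VKOSPI series here are assumed.

```python
# Minimal sketch of fitting symmetric/asymmetric GARCH models and scoring the
# direction-based volatility trading rule described in the abstract. Returns
# and the vkospi series are random placeholders, not KOSPI 200 / VKOSPI data.
import numpy as np
import pandas as pd
from arch import arch_model

rng = np.random.default_rng(1)
returns = pd.Series(rng.normal(0, 1, 1000))             # stand-in for KOSPI 200 returns
vkospi = pd.Series(20 + rng.normal(0, 1, 1000)).abs()   # stand-in for the VKOSPI

specs = {
    "GARCH(1,1)": dict(vol="GARCH", p=1, q=1),
    "GJR-GARCH":  dict(vol="GARCH", p=1, o=1, q=1),     # o=1 adds the asymmetry term
    "EGARCH":     dict(vol="EGARCH", p=1, o=1, q=1),
}

for name, spec in specs.items():
    res = arch_model(returns, **spec).fit(disp="off")
    forecast = res.forecast(horizon=1).variance.iloc[-1, 0] ** 0.5
    print(name, "one-step-ahead volatility forecast:", round(forecast, 4))

# Trading rule from the abstract: if volatility is forecast to rise, go long
# volatility; if the call is correct, add |VKOSPI % change|, otherwise subtract it.
pred_up = rng.random(len(vkospi) - 1) > 0.5             # placeholder for model signals
pct_change = vkospi.pct_change().dropna().to_numpy() * 100
profit = np.where(pred_up == (pct_change > 0), np.abs(pct_change), -np.abs(pct_change))
print("cumulative profit (%):", profit.sum().round(2))
```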

Development of the Accident Prediction Model for Enlisted Men through an Integrated Approach to Datamining and Textmining (데이터 마이닝과 텍스트 마이닝의 통합적 접근을 통한 병사 사고예측 모델 개발)

  • Yoon, Seungjin;Kim, Suhwan;Shin, Kyungshik
    • Journal of Intelligence and Information Systems / v.21 no.3 / pp.1-17 / 2015
  • In this paper, we report what we have observed with regard to a prediction model for the military based on enlisted men's internal data (cumulative records) and external data (SNS data). This work is significant for the military's efforts to supervise enlisted men. One of the important duties of officers is to take care of their subordinates and prevent unexpected accidents, yet in spite of their efforts, many commanders have failed to prevent accidents by their subordinates. Because it is hard to prevent accidents, a proper method must be found. Our motivation for presenting this paper is to make it possible to predict accidents using enlisted men's internal and external data. The biggest issue facing the military is the occurrence of accidents by enlisted men related to maladjustment and the relaxation of military discipline. The core of preventing accidents by soldiers is to identify problems and manage them quickly. Commanders predict accidents by interviewing their soldiers and observing their surroundings, which requires considerable time and effort and produces results that differ significantly depending on the capabilities of the commanders. In this paper, we seek to predict accidents with objective data that can easily be obtained. Recently, records of enlisted men, as well as SNS communication between commanders and soldiers, have made it possible to predict and prevent accidents. This paper concerns the application of data mining to identify soldiers' interests, predict accidents, and make use of internal and external (SNS) data. We propose both a topic analysis and a decision tree method. The study is conducted in two steps. First, topic analysis is conducted on the SNS of enlisted men. Second, the decision tree method is used to analyze the internal data together with the results of the first analysis. The dependent variable for these analyses is the presence of any accident. In order to analyze their SNS, we require tools such as text mining and topic analysis. We used SAS Enterprise Miner 12.1, which provides a text miner module. Our approach for finding their interests is composed of three main phases: collecting data, topic analysis, and converting the topic analysis results into points used as independent variables. In the first phase, we collect enlisted men's SNS data by commander's ID. After gathering the unstructured SNS data, the topic analysis phase extracts issues from them. For simplicity, 5 topics (vacation, friends, stress, training, and sports) are extracted from 20,000 articles. In the third phase, we quantify these 5 topics as personal points and include the results among the independent variables, which are composed of 15 internal data sets. Then, we build two decision trees. The first tree is composed of the internal data only. The second tree is composed of the external data (SNS) as well as the internal data. After that, we compare the misclassification results from SAS E-miner. The first model's misclassification rate is 12.1%, while the second model's misclassification rate is 7.8%. This method predicts accidents with an accuracy of approximately 92%. The gap between the two models is 4.3%. Finally, we test whether the difference between them is meaningful using the McNemar test; the result of the test is significant (p-value: 0.0003). This study has two limitations. First, the results of the experiments cannot be generalized, mainly because the experiment is limited to a small number of enlisted men's data.
Additionally, various independent variables used in the decision tree model are used as categorical variables instead of continuous variables. So it suffers a loss of information. In spite of extensive efforts to provide prediction models for the military, commanders' predictions are accurate only when they have sufficient data about their subordinates. Our proposed methodology can provide support to decision-making in the military. This study is expected to contribute to the prevention of accidents in the military based on scientific analysis of enlisted men and proper management of them.
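A minimal sketch of the final comparison step described above (two decision trees, one with only internal variables and one with internal plus SNS topic scores, compared with a McNemar test) is given below; the feature matrices are random stand-ins, and the code does not reproduce the SAS Enterprise Miner workflow.

```python
# Minimal sketch of comparing a decision tree trained on internal data only with
# one trained on internal + SNS topic scores, using McNemar's test as in the
# abstract. All data here are random placeholders, not soldiers' records.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from statsmodels.stats.contingency_tables import mcnemar

rng = np.random.default_rng(7)
internal = rng.normal(size=(1000, 15))            # 15 internal variables (assumed)
topics = rng.normal(size=(1000, 5))               # 5 SNS topic scores (assumed)
y = rng.integers(0, 2, 1000)                      # 1 = accident occurred

X1_tr, X1_te, X2_tr, X2_te, y_tr, y_te = train_test_split(
    internal, np.hstack([internal, topics]), y, test_size=0.3, random_state=0)

pred1 = DecisionTreeClassifier(random_state=0).fit(X1_tr, y_tr).predict(X1_te)
pred2 = DecisionTreeClassifier(random_state=0).fit(X2_tr, y_tr).predict(X2_te)

# 2x2 table of correct/incorrect predictions for the two models
c1, c2 = pred1 == y_te, pred2 == y_te
table = [[np.sum(c1 & c2), np.sum(c1 & ~c2)],
         [np.sum(~c1 & c2), np.sum(~c1 & ~c2)]]
print("misclassification model 1:", 1 - c1.mean(), "model 2:", 1 - c2.mean())
print(mcnemar(table, exact=False))                # p-value for the difference
```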

Self-optimizing feature selection algorithm for enhancing campaign effectiveness (캠페인 효과 제고를 위한 자기 최적화 변수 선택 알고리즘)

  • Seo, Jeoung-soo;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems / v.26 no.4 / pp.173-198 / 2020
  • For a long time, many academic studies have been conducted on predicting the success of campaigns for customers, and prediction models applying various techniques are still being studied. Recently, as campaign channels have expanded in various ways due to the rapid growth of online channels, companies are carrying out campaigns at a level that cannot be compared to the past. However, customers increasingly tend to perceive campaigns as spam as fatigue from duplicate exposure grows. Also, from a corporate standpoint, the effectiveness of campaigns is decreasing while investment costs increase, which leads to low actual campaign success rates. Accordingly, various studies are ongoing to improve the effectiveness of campaigns in practice. A campaign system ultimately aims to increase the success rate of various campaigns by collecting and analyzing customer-related data and using them for campaigns. In particular, there have been recent attempts to predict campaign responses using machine learning. Selecting appropriate features is very important because of the many features in campaign data. If all of the input data are used in the process of classifying a large amount of data, learning takes a long time as the classification classes expand, so a minimal input data set must be extracted from the entire data and used. In addition, when a trained model is generated using too many features, prediction accuracy may be degraded due to overfitting or correlation between features. Therefore, in order to improve accuracy, a feature selection technique that removes features close to noise should be applied; feature selection is a necessary process for analyzing a high-dimensional data set. Among the greedy algorithms, SFS (Sequential Forward Selection), SBS (Sequential Backward Selection), SFFS (Sequential Floating Forward Selection), etc. are widely used as traditional feature selection techniques. However, when there are many features, these methods have the limitation that classification performance is poor and learning takes a long time. Therefore, in this study, we propose an improved feature selection algorithm to enhance the effectiveness of existing campaigns. The purpose of this study is to improve the existing SFFS sequential method in the process of searching for the feature subsets that are the basis for improving machine learning model performance, using statistical characteristics of the data processed in the campaign system. Through this, features that have a large influence on performance are derived first, features that have a negative effect are removed, and then the sequential method is applied to increase search efficiency and to enable generalized prediction with the improved algorithm. It was confirmed that the proposed model showed better search and prediction performance than the traditional greedy algorithm. Compared with the original data set, the greedy algorithm, the genetic algorithm (GA), and recursive feature elimination (RFE), the campaign success prediction was higher. In addition, when performing campaign success prediction, the improved feature selection algorithm was found to be helpful in analyzing and interpreting the prediction results by providing the importance of the derived features. These include features such as age, customer rating, and sales, which were already known to be statistically important. In addition, features such as the combined product name, the average 3-month data consumption rate, and the last 3 months' wireless data usage, which campaign planners had rarely used to select campaign targets, were unexpectedly selected as important features for the campaign response. It was confirmed that base attributes can also be very important features depending on the type of campaign. Through this, it is possible to analyze and understand the important characteristics of each campaign type.
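To make the sequential search idea above concrete, here is a bare-bones sequential forward selection (SFS) loop using cross-validated accuracy as the criterion; it illustrates the family of greedy methods the paper starts from, not the authors' improved SFFS variant, and the data are synthetic.

```python
# Bare-bones sequential forward selection (SFS): greedily add the feature that
# most improves cross-validated accuracy. This sketches the greedy baseline the
# paper improves on, not the proposed algorithm itself; the data are synthetic.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=20, n_informative=5, random_state=0)

def sfs(X, y, k):
    selected, remaining = [], list(range(X.shape[1]))
    while len(selected) < k:
        scores = []
        for f in remaining:
            cols = selected + [f]
            acc = cross_val_score(LogisticRegression(max_iter=1000), X[:, cols], y, cv=5).mean()
            scores.append((acc, f))
        best_acc, best_f = max(scores)      # feature giving the best CV accuracy
        selected.append(best_f)
        remaining.remove(best_f)
        print(f"added feature {best_f}, CV accuracy {best_acc:.3f}")
    return selected

print("selected subset:", sfs(X, y, k=5))
```

Floating variants such as SFFS add a backward step that can drop a previously chosen feature whenever doing so improves the criterion, which is the behavior the proposed algorithm refines.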

Evaluation of Varietal Difference and Environmental Variation for Some Characters Related to Source and Sink in the Rice Plants (벼의 Source 및 Sink형질의 품종간차이와 환경변이의 평가)

  • Choi, Hae-Chun;Kwon, Yong-Woong
    • KOREAN JOURNAL OF CROP SCIENCE / v.30 no.4 / pp.460-470 / 1985
  • Experiments were carried out to evaluate the standard specific gravity for determining potential kernel size and to determine an effective sampling method by analyzing intra- and inter-plant variations for some source and sink characters, using eleven semi-dwarf indica and three japonica cultivars including four semi-dwarf indica near-isogenic lines. Additional experiments were conducted to understand yearly variation and variety x year interaction effects for ten characters related to source and sink, and to characterize the varietal differences in pre- and post-heading self-competition, employing three parental varieties and their F5 progenies in 1982 and 1983. It is desirable to determine the potential kernel size by the average kernel weight of rice grains with a specific gravity above 1.15. There were significant differences in leaf area per tiller, spikelets and sink capacity per panicle among vigorous, intermediate and inferior tillers classified by differentiation order and vigorousness. Although it was difficult to find any significant difference in grain-fill ratio, ratio of perfectly ripened grain, potential kernel size and sink/source ratio between vigorous and intermediate tillers, there was a big difference between them and the inferior one. The coefficients of variation within each tiller group for some characters related to source and sink increased in the order vigorous tillers < intermediate ones < inferior ones, and the average heritability of all characters, evaluated by the ratio of varietal variance (equation omitted) to total variance (equation omitted), increased in the order inferior tillers < intermediate ones < superior ones. Therefore, it is desirable to sample the vigorous tillers to represent the varietal differences in these traits. The '82-'83 yearly variations of the three parental cultivars were significant for all traits except leaf area/tiller, panicles/hill, leaf area index and rough rice yield. The characters showing highly significant variety x year interaction variance were growth duration from transplanting to heading, leaf area/tiller, sink/source ratio, sink capacity/panicle and grain yield. The generalized yearly response of the three parental varieties (Suweon 264, Raegyeong, IR1317-70-1) and their F5 progenies on the 1st and 2nd principal components extracted from the ten source and sink characters generally exhibited reduction in both source and sink. However, there were diverse variety x year interactions, such as progenies showing a reaction similar to their parents, or intermediate or recombinational yearly responses with little or considerable yearly movement on the planes of the two upper principal components between 1982 and 1983. Sink characters revealing a highly significant border effect were grain-fill ratio, spikelets and sink capacity per panicle; among them the latter two especially showed significant variety x border effect interaction. Self-competition, characterized by the relative weakness of the inside plants' sink characters compared to the border plants', was more severe during the reproductive stage before heading than during the maturing stage. Although a larger sink capacity per panicle generally entailed more severe self-competition, some lines (like Suweon 264) showed severe self-competition despite a small sink capacity, while a few others showed mild self-competition in spite of a big sink capacity per panicle.
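The heritability referred to in the abstract is the ratio of varietal variance to total variance. One common way to estimate those variance components from a replicated variety trial is from the one-way ANOVA mean squares, as sketched below with made-up plot data; this is a generic textbook estimator offered for orientation, not necessarily the exact computation used in the paper.

```python
# Sketch of broad-sense heritability as the ratio of varietal (genetic) variance
# to total variance, estimated from one-way ANOVA mean squares of a replicated
# variety trial. The trait values below are invented for illustration.
import numpy as np

# rows = varieties, columns = replications (e.g., a source/sink trait per plot)
data = np.array([[5.1, 5.3, 5.0],
                 [6.2, 6.0, 6.4],
                 [5.8, 5.7, 5.9],
                 [4.9, 5.2, 5.0]])
v, r = data.shape                       # number of varieties, replications

grand = data.mean()
ms_variety = r * ((data.mean(axis=1) - grand) ** 2).sum() / (v - 1)
ms_error = ((data - data.mean(axis=1, keepdims=True)) ** 2).sum() / (v * (r - 1))

var_g = (ms_variety - ms_error) / r     # varietal variance component
var_e = ms_error                        # residual (error) variance
heritability = var_g / (var_g + var_e)  # ratio of varietal to total variance
print(f"H^2 = {heritability:.2f}")
```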

Estimation of GARCH Models and Performance Analysis of Volatility Trading System using Support Vector Regression (Support Vector Regression을 이용한 GARCH 모형의 추정과 투자전략의 성과분석)

  • Kim, Sun Woong;Choi, Heung Sik
    • Journal of Intelligence and Information Systems / v.23 no.2 / pp.107-122 / 2017
  • Volatility in stock market returns is a measure of investment risk. It plays a central role in portfolio optimization, asset pricing and risk management as well as in most theoretical financial models. Engle (1982) presented a pioneering paper on stock market volatility that explains the time-variant characteristics embedded in stock market return volatility. His model, Autoregressive Conditional Heteroscedasticity (ARCH), was generalized by Bollerslev (1986) as the GARCH models. Empirical studies have shown that GARCH models describe well the fat-tailed return distributions and the volatility clustering phenomenon appearing in stock prices. The parameters of GARCH models are generally estimated by maximum likelihood estimation (MLE) based on the standard normal density. However, since Black Monday in 1987, stock market prices have become very complex and noisy, and recent studies have started to apply artificial intelligence approaches to estimating the GARCH parameters as a substitute for MLE. This paper presents an SVR-based GARCH process and compares it with the MLE-based GARCH process for estimating the parameters of GARCH models, which are known to forecast stock market volatility well. The kernel functions used in the SVR estimation process are linear, polynomial and radial. We analyzed the suggested models with the KOSPI 200 Index, which is composed of 200 blue-chip stocks listed on the Korea Exchange. We sampled KOSPI 200 daily closing values from 2010 to 2015, giving 1487 sample observations. We used 1187 days to train the suggested GARCH models and the remaining 300 days as testing data. First, symmetric and asymmetric GARCH models are estimated by MLE. We forecasted KOSPI 200 Index return volatility, and the statistical metric MSE shows better results for the asymmetric GARCH models such as E-GARCH or GJR-GARCH. This is consistent with the documented non-normal return distribution characteristics of fat tails and leptokurtosis. Compared with the MLE estimation process, the SVR-based GARCH models outperform the MLE methodology in KOSPI 200 Index return volatility forecasting; the polynomial kernel function shows exceptionally lower forecasting accuracy. We suggested an Intelligent Volatility Trading System (IVTS) that utilizes the forecasted volatility results. The IVTS entry rules are as follows: if tomorrow's volatility is forecast to increase, buy volatility today; if it is forecast to decrease, sell volatility today; if the forecasted volatility direction does not change, hold the existing buy or sell positions. IVTS is assumed to buy and sell historical volatility values. This is somewhat unrealistic because we cannot trade historical volatility values themselves, but our simulation results are still meaningful since the Korea Exchange introduced a volatility futures contract in November 2014 that traders can now trade. The trading systems with SVR-based GARCH models show higher returns than the MLE-based GARCH models in the testing period. The profitable trading percentages of MLE-based GARCH IVTS models range from 47.5% to 50.0%, while those of SVR-based GARCH IVTS models range from 51.8% to 59.7%. MLE-based symmetric S-GARCH shows a +150.2% return and SVR-based symmetric S-GARCH shows a +526.4% return. MLE-based asymmetric E-GARCH shows a -72% return and SVR-based asymmetric E-GARCH shows a +245.6% return. MLE-based asymmetric GJR-GARCH shows a -98.7% return and SVR-based asymmetric GJR-GARCH shows a +126.3% return.
Linear kernel function shows higher trading returns than radial kernel function. Best performance of SVR-based IVTS is +526.4% and that of MLE-based IVTS is +150.2%. SVR-based GARCH IVTS shows higher trading frequency. This study has some limitations. Our models are solely based on SVR. Other artificial intelligence models are needed to search for better performance. We do not consider costs incurred in the trading process including brokerage commissions and slippage costs. IVTS trading performance is unreal since we use historical volatility values as trading objects. The exact forecasting of stock market volatility is essential in the real trading as well as asset pricing models. Further studies on other machine learning-based GARCH models can give better information for the stock market investors.
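Where the abstract describes replacing maximum likelihood with SVR in GARCH estimation, a simplified stand-in is to regress tomorrow's volatility proxy (here, the squared return) on lagged squared returns with different kernels. The sketch below shows that reduced form with synthetic data and is not the paper's exact SVR-GARCH formulation.

```python
# Simplified stand-in for the SVR-based GARCH idea: predict the next-day
# volatility proxy (squared return) from two lagged squared returns with
# linear, polynomial and RBF kernels. Synthetic data only; this is not the
# paper's exact SVR-GARCH specification.
import numpy as np
from sklearn.svm import SVR
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(3)
ret = rng.normal(0, 1, 1487) * 0.01        # stand-in for KOSPI 200 daily returns
sq = ret ** 2
X = np.column_stack([sq[1:-1], sq[:-2]])   # two lagged squared returns as features
y = sq[2:]                                 # next-day squared return as target

# roughly mirrors the paper's split of 1187 training days and 300 test days
X_tr, X_te, y_tr, y_te = X[:1185], X[1185:], y[:1185], y[1185:]

for kernel in ["linear", "poly", "rbf"]:   # the three kernels used in the study
    model = SVR(kernel=kernel, C=1.0, epsilon=1e-5).fit(X_tr, y_tr)
    mse = mean_squared_error(y_te, model.predict(X_te))
    print(f"{kernel:6s} kernel  out-of-sample MSE: {mse:.3e}")
```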