
Studies on Lipids in Fresh-Water Fishes 5. Distribution of Lipid Components in Various Tissues of Carp, Cyprinus carpio (담수어의 지질에 관한 연구 5. 잉어(Cyprinus carpio)의 부위별 지질성분의 분포)

  • CHOI Jin-Ho;RO Jae-Il;BYUN Dae-Seok;PYEUN Jae-Hyeung
    • Korean Journal of Fisheries and Aquatic Sciences / v.18 no.2 / pp.149-156 / 1985
  • The distribution of lipid components in the meat, skin and viscera of carp (Cyprinus carpio) was analyzed by column chromatography, thin-layer chromatography and gas-liquid chromatography, following the previous report (Choi et al., 1984). Lipid content varied by portion: $3.88\%$ in meat (free lipid, $2.47\%$; bound lipid, $1.41\%$), $8.02\%$ in skin (free lipid, $5.65\%$; bound lipid, $2.37\%$) and $6.18\%$ in viscera (free lipid, $3.54\%$; bound lipid, $2.64\%$). In all portions of the body, free lipid consisted of $68\%$ to $92\%$ neutral lipid, $3\%$ to $6\%$ glycolipid and $4\%$ to $18\%$ phospholipid, whereas bound lipid consisted of $8\%$ to $20\%$ neutral lipid, $2\%$ to $7\%$ glycolipid and $47\%$ to $62\%$ phospholipid. The free lipids of each portion consisted mostly of triglycerides with some diglycerides, but the free lipids in viscera contained considerable amounts of free fatty acids. The bound lipids, on the other hand, commonly contained appreciable amounts of esterified sterol and hydrocarbon, as well as triglycerides. The phospholipid consisted mainly of phosphatidyl choline, phosphatidyl ethanolamine and phosphatidyl serine in both free and bound lipids, with much more phosphatidyl choline in the bound lipid. The predominant fatty acids of free and bound lipids were $C_{16:0}$, $C_{18:0}$, $C_{20:4}$, $C_{22:6}$ and $C_{18:2}$ acids in polar lipids, and $C_{16:0}$, $C_{16:1}$, $C_{18:0}$, $C_{18:1}$ and $C_{18:2}$ acids in non-polar lipids, whereas those of neutral lipids were $C_{14:0}$ ($2.54\%$ to $6.98\%$), $C_{16:0}$ ($11.20\%$ to $21.13\%$) and $C_{18:0}$ ($1.58\%$ to $12.76\%$) among saturated acids, $C_{16:1}$ ($7.06\%$ to $20.70\%$), $C_{18:1}$ ($21.68\%$ to $30.50\%$) and $C_{20:1}$ ($1.76\%$ to $6.27\%$) among monoenoic acids, and $C_{18:2}$ ($4.50\%$ to $6.89\%$), $C_{20:4}$ ($1.52\%$ to $4.29\%$) and $C_{22:6}$ ($0.73\%$ to $6.62\%$) among polyenoic acids.
In conclusion, the fatty acid compositions revealed apparent differences between the free and bound lipids in the tissues of the body.

A Study of Factors Associated with Software Developers Job Turnover (데이터마이닝을 활용한 소프트웨어 개발인력의 업무 지속수행의도 결정요인 분석)

  • Jeon, In-Ho;Park, Sun W.;Park, Yoon-Joo
    • Journal of Intelligence and Information Systems / v.21 no.2 / pp.191-204 / 2015
  • According to the '2013 Performance Assessment Report on the Financial Program' from the National Assembly Budget Office, the unfilled recruitment ratio of software (SW) developers in South Korea was 25% in the 2012 fiscal year. Moreover, the unfilled recruitment ratio of highly qualified SW developers reached almost 80%. This phenomenon is more pronounced in small and medium enterprises with fewer than 300 employees. Young job-seekers in South Korea increasingly avoid becoming SW developers, and even current SW developers want to change careers, which hinders the national development of IT industries. The Korean government has recently recognized the problem and implemented policies to foster young SW developers. Owing to this effort, it has become easier to find entry-level SW developers. However, many IT companies still find it hard to recruit highly qualified SW developers, because long-term experience is important for becoming an expert SW developer. Thus, improving the job continuity intentions of current SW developers is more important than fostering new SW developers. Therefore, this study surveyed the job continuity intentions of SW developers and analyzed the factors associated with them. We carried out a survey from September 2014 to October 2014, targeting 130 SW developers working in IT industries in South Korea. We gathered the demographic information and characteristics of the respondents, the work environments of the SW industry, and the social positions of SW developers. Afterward, a regression analysis and a decision tree method were applied to analyze the data. These two methods are widely used data mining techniques that offer explanatory power and are mutually complementary. We first performed a linear regression to find the important factors associated with the job continuity intention of SW developers.
The results showed that the 'expected age' to work as a SW developer was the most significant factor associated with the job continuity intention. We supposed that the major cause of this phenomenon is the structural problem of IT industries in South Korea, which requires SW developers to move from development to management as they are promoted. Also, the 'motivation' to become a SW developer and the 'personality (introverted tendency)' of a SW developer are highly important factors associated with the job continuity intention. Next, the decision tree method was applied to extract the characteristics of highly motivated developers and poorly motivated ones. We used the well-known C4.5 algorithm for the decision tree analysis. The results showed that 'motivation', 'personality', and 'expected age' were also important factors influencing the job continuity intentions, which was similar to the results of the regression analysis. In addition, the 'ability to learn' new technology was a crucial factor in the decision rules for job continuity. In other words, a person with a high ability to learn new technology tends to work as a SW developer for a longer period of time. The decision rules also showed that the 'social position' of SW developers and the 'prospect' of the SW industry were minor factors influencing job continuity intentions. On the other hand, 'type of employment (regular position/non-regular position)' and 'type of company (ordering company/service providing company)' did not affect the job continuity intention in either method. In this research, we examined the job continuity intentions of SW developers who were actually working at IT companies in South Korea, and we analyzed the factors associated with them. These results can be used for human resource management in many IT companies when recruiting or fostering highly qualified SW experts.
It can also help to build SW developer fostering policy and to solve the problem of unfilled recruitment of SW Developers in South Korea.
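
As an illustration of the decision tree step described in this abstract, the following minimal Python sketch computes the gain ratio, the split criterion that C4.5 uses to choose attributes. The records, the attribute names (`motivation`, `employment`, `stay`) and the function names are hypothetical and are not taken from the study's survey data.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def gain_ratio(rows, attribute, target):
    """Gain ratio of a categorical attribute with respect to the target class."""
    labels = [row[target] for row in rows]
    # Group the target labels by the value of the candidate attribute.
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    total = len(rows)
    # Weighted entropy that remains after splitting on the attribute.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalizes attributes with many distinct values.
    split_info = entropy([row[attribute] for row in rows])
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy records (assumption: not the study's data).
survey = [
    {"motivation": "high", "employment": "regular", "stay": "yes"},
    {"motivation": "high", "employment": "non-regular", "stay": "yes"},
    {"motivation": "low", "employment": "regular", "stay": "no"},
    {"motivation": "low", "employment": "non-regular", "stay": "no"},
]

for name in ("motivation", "employment"):
    print(name, gain_ratio(survey, name, "stay"))
```

In this toy data, `motivation` separates the two classes perfectly while `employment` carries no information, so a C4.5-style tree would split on `motivation` first.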

Discovering Promising Convergence Technologies Using Network Analysis of Maturity and Dependency of Technology (기술 성숙도 및 의존도의 네트워크 분석을 통한 유망 융합 기술 발굴 방법론)

  • Choi, Hochang;Kwahk, Kee-Young;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.24 no.1 / pp.101-124 / 2018
  • Recently, most technologies have developed in various forms, through the advancement of a single technology or through interaction with other technologies. In particular, these technologies show the characteristic of convergence caused by the interaction between two or more technologies. In addition, efforts to respond to technological change in advance by forecasting promising convergence technologies that will emerge in the near future are continuously increasing. Accordingly, many researchers are attempting various analyses to forecast promising convergence technologies. A convergence technology has the characteristics of several technologies because of the way it is generated. Therefore, forecasting promising convergence technologies is much more difficult than forecasting general technologies with high growth potential. Nevertheless, some achievements have been reported in attempts to forecast promising technologies using big data analysis and social network analysis. Studies of convergence technology through data analysis are actively conducted, with the themes of discovering new convergence technologies and analyzing their trends. As a result, information about new convergence technologies is more abundant than in the past. However, existing methods for analyzing convergence technology have some limitations. Firstly, most studies dealing with convergence technology analyze data through predefined technology classifications. Recently emerging technologies tend to have convergence characteristics and thus consist of technologies from various fields. In other words, new convergence technologies may not belong to any predefined class. Therefore, the existing methods do not properly reflect the dynamic change of the convergence phenomenon.
Secondly, to forecast promising convergence technologies, most existing analysis methods use general-purpose indicators. This approach does not fully exploit the specificity of the convergence phenomenon. A new convergence technology is highly dependent on the existing technologies from which it originates. It can therefore grow into an independent field or disappear rapidly, depending on changes in the technologies it depends on. In existing analyses, the growth potential of a convergence technology is judged through traditional general-purpose indicators. However, these indicators do not reflect the principle of convergence: new technologies emerge from two or more mature technologies, and grown technologies in turn affect the creation of other technologies. Thirdly, previous studies do not provide objective methods for evaluating the accuracy of models that forecast promising convergence technologies. Within studies of convergence technology, forecasting promising technologies has received relatively little attention because of the complexity of the field. Therefore, it is difficult to find a method for evaluating the accuracy of a model that forecasts promising convergence technologies. To stimulate research on forecasting promising convergence technologies, it is important to establish a method for objectively verifying and evaluating the accuracy of the model proposed by each study. To overcome these limitations, we propose a new method for analyzing convergence technologies. First of all, through topic modeling, we derive a new technology classification based on text content. It reflects the dynamic change of the actual technology market rather than an existing fixed classification standard.
In addition, we identify the influence relationships between technologies through the topic correspondence weights of each document, and structure them into a network. Furthermore, we devise a centrality indicator (PGC, potential growth centrality) to forecast the future growth of a technology by using the centrality information of each technology. It reflects the convergence characteristics of each technology according to technology maturity and the interdependence between technologies. Along with this, we propose a method for evaluating the accuracy of a forecasting model by measuring the growth rate of promising technologies, based on the variation of potential growth centrality by period. In this paper, we conduct experiments with 13,477 patent documents dealing with technical contents to evaluate the performance and practical applicability of the proposed method. The results confirm that the forecast model based on the proposed centrality indicator achieves a forecast accuracy up to about 2.88 times higher than that of forecast models based on currently used network indicators.

A Time Series Graph based Convolutional Neural Network Model for Effective Input Variable Pattern Learning : Application to the Prediction of Stock Market (효과적인 입력변수 패턴 학습을 위한 시계열 그래프 기반 합성곱 신경망 모형: 주식시장 예측에의 응용)

  • Lee, Mo-Se;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems / v.24 no.1 / pp.167-181 / 2018
  • Over the past decade, deep learning has been in the spotlight among machine learning algorithms. In particular, the CNN (Convolutional Neural Network), known as an effective solution for recognizing and classifying images or voices, has been widely applied to classification and prediction problems. In this study, we investigate how to apply CNN to business problem solving. Specifically, this study proposes applying CNN to stock market prediction, one of the most challenging tasks in machine learning research. As mentioned, CNN is strong at interpreting images. Thus, the model proposed in this study adopts CNN as a binary classifier that predicts stock market direction (upward or downward) using time series graphs as its inputs. That is, our proposal is to build a machine learning algorithm that mimics the experts called 'technical analysts', who examine graphs of past price movements and predict future financial price movements. Our proposed model, named 'CNN-FG (Convolutional Neural Network using Fluctuation Graph)', consists of five steps. In the first step, it divides the dataset into intervals of 5 days. In step 2, it creates time series graphs for the divided dataset. The size of the image in which each graph is drawn is $40 \times 40$ pixels, and the graph of each independent variable is drawn in a different color. In step 3, the model converts the images into matrices. Each image is converted into a combination of three matrices that express the color values on the R (red), G (green), and B (blue) scales. In the next step, it splits the dataset of graph images into training and validation datasets. We used 80% of the total dataset as the training dataset and the remaining 20% as the validation dataset. In the final step, CNN classifiers are trained using the images of the training dataset.
Regarding the parameters of CNN-FG, we adopted two convolution filters ($5 \times 5 \times 6$ and $5 \times 5 \times 9$) in the convolution layer. In the pooling layer, a $2 \times 2$ max pooling filter was used. The numbers of nodes in the two hidden layers were set to 900 and 32, respectively, and the number of nodes in the output layer was set to 2 (one for the prediction of an upward trend, the other for a downward trend). The activation functions for the convolution layer and the hidden layers were set to ReLU (Rectified Linear Unit), and the one for the output layer was set to the Softmax function. To validate our model, CNN-FG, we applied it to the prediction of KOSPI200 for 2,026 days in eight years (from 2009 to 2016). To match the proportions of the two groups in the dependent variable (i.e. tomorrow's stock market movement), we selected 1,950 samples by random sampling. Finally, we built the training dataset using 80% of the total dataset (1,560 samples) and the validation dataset using 20% (390 samples). The independent variables of the experimental dataset included twelve technical indicators widely used in previous studies. They include Stochastic %K, Stochastic %D, Momentum, ROC (rate of change), LW %R (Larry Williams' %R), A/D oscillator (accumulation/distribution oscillator), OSCP (price oscillator), CCI (commodity channel index), and so on. To confirm the superiority of CNN-FG, we compared its prediction accuracy with that of other classification models. Experimental results showed that CNN-FG outperforms LOGIT (logistic regression), ANN (artificial neural network), and SVM (support vector machine) with statistical significance. These empirical results imply that converting time series business data into graphs and building CNN-based classification models using these graphs can be effective from the perspective of prediction accuracy.
Thus, this paper sheds light on how to apply deep learning techniques to the domain of business problem solving.
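
The data preparation steps of CNN-FG (5-day windows, $40 \times 40$ graph images, three RGB matrices, and an 80/20 split) can be sketched roughly as follows. This is a minimal NumPy illustration under several assumptions that the abstract does not confirm: the windows are non-overlapping, only three variables are drawn (one per color channel, whereas the study uses twelve indicators with distinct colors), each series is min-max scaled within its window, and the split is sequential. The data and the function names `window_to_image` and `make_windows` are hypothetical.

```python
import numpy as np


def window_to_image(window, size=40):
    """Rasterize a (days, n_vars) window into a size x size x 3 RGB array.

    Each variable is min-max scaled within the window and drawn as a
    polyline in its own color channel (at most three variables here).
    """
    days, n_vars = window.shape
    image = np.zeros((size, size, 3), dtype=np.uint8)
    # Sample the polyline densely so that consecutive days are joined.
    x_dense = np.linspace(0, days - 1, size * 4)
    cols = np.round(x_dense / (days - 1) * (size - 1)).astype(int)
    for v in range(min(n_vars, 3)):
        series = window[:, v].astype(float)
        span = series.max() - series.min()
        scaled = (series - series.min()) / span if span > 0 else np.full(days, 0.5)
        y_dense = np.interp(x_dense, np.arange(days), scaled)
        rows = np.round((1.0 - y_dense) * (size - 1)).astype(int)  # row 0 is the top
        image[rows, cols, v] = 255
    return image


def make_windows(data, length=5, step=5):
    """Split a (n_days, n_vars) array into windows of `length` days."""
    return [data[i:i + length] for i in range(0, len(data) - length + 1, step)]


rng = np.random.default_rng(0)
indicators = rng.normal(size=(50, 3)).cumsum(axis=0)  # hypothetical indicator values

images = np.stack([window_to_image(w) for w in make_windows(indicators)])
split = int(0.8 * len(images))  # 80% training, 20% validation
train_images, valid_images = images[:split], images[split:]
print(images.shape, len(train_images), len(valid_images))
```

Each resulting image is already the stack of three matrices (R, G, B) that step 3 describes, so it could be passed to a CNN classifier after scaling the pixel values.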

Business Application of Convolutional Neural Networks for Apparel Classification Using Runway Image (합성곱 신경망의 비지니스 응용: 런웨이 이미지를 사용한 의류 분류를 중심으로)

  • Seo, Yian;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems / v.24 no.3 / pp.1-19 / 2018
  • A large amount of data is now available for the research and business sectors to extract knowledge from. This data can take the form of unstructured data such as audio, text, and image data, and can be analyzed with deep learning methodology. Deep learning is now widely used for various estimation, classification, and prediction problems. In particular, the fashion business adopts deep learning techniques for apparel recognition, apparel search and retrieval engines, and automatic product recommendation. The core model of these applications is image classification using Convolutional Neural Networks (CNN). A CNN is made up of neurons that learn parameters such as weights as inputs pass through the network to the outputs. A CNN has a layer structure that is well suited to image classification, as it comprises convolutional layers for generating feature maps, pooling layers for reducing the dimensionality of feature maps, and fully-connected layers for classifying the extracted features. However, most classification models have been trained using online product images, which are taken under controlled conditions, such as images of the apparel itself or of a professional model wearing the apparel. Such images may not be effective for training a classification model intended to classify street fashion images or walking images, which are taken in uncontrolled conditions and involve people's movement and unexpected poses. Therefore, we propose to train the model with a runway apparel image dataset, which captures mobility. This allows the classification model to be trained with far more variable data and improves its adaptation to diverse query images. To achieve both convergence and generalization of the model, we apply Transfer Learning to our training network. As Transfer Learning in CNN is composed of pre-training and fine-tuning stages, we divide the training step into two.
First, we pre-train our architecture with a large-scale dataset, the ImageNet dataset, which consists of 1.2 million images in 1000 categories including animals, plants, activities, materials, instrumentations, scenes, and foods. We use GoogLeNet as our main architecture, as it has achieved high accuracy with efficiency in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC). Second, we fine-tune the network with our own runway image dataset. We could not find any previously published public runway image dataset, so we collected one from Google Image Search, obtaining 2426 images of 32 major fashion brands including Anna Molinari, Balenciaga, Balmain, Brioni, Burberry, Celine, Chanel, Chloe, Christian Dior, Cividini, Dolce and Gabbana, Emilio Pucci, Ermenegildo, Fendi, Giuliana Teso, Gucci, Issey Miyake, Kenzo, Leonard, Louis Vuitton, Marc Jacobs, Marni, Max Mara, Missoni, Moschino, Ralph Lauren, Roberto Cavalli, Sonia Rykiel, Stella McCartney, Valentino, Versace, and Yves Saint Laurent. We perform 10-fold experiments to account for the random generation of training data, and our proposed model achieved an accuracy of 67.2% on the final test. Our research offers several advantages over previous related studies: to the best of our knowledge, no previous study has trained a network for apparel image classification based on a runway image dataset. We suggest the idea of training the model with images capturing all the possible postures, which we denote as mobility, by using our own runway apparel image dataset. Moreover, by applying Transfer Learning and using the checkpoint and parameters provided by Tensorflow Slim, we could reduce the time spent on training the classification model to 6 minutes per experiment. This model can be used in many business applications where the query image can be a runway image, a product image, or a street fashion image.
To be specific, runway query image can be used for mobile application service during fashion week to facilitate brand search, street style query image can be classified during fashion editorial task to classify and label the brand or style, and website query image can be processed by e-commerce multi-complex service providing item information or recommending similar item.

Postoperative Radiation Therapy for Chest Wall Invading pT3N0 Non-small Cell Lung Cancer: Elective Lymphatic Irradiation May Not Be Necessary (흉벽을 침범한 pT3N0 비소세포폐암 환자에서 수술 후 방사선치료)

  • Park, Young-Je;Ahn, Yong-Chan;Lim, Do-Hoon;Park, Won;Kim, Kwan-Min;Kim, Jhingook;Shim, Young-Mog;Kim, Kyoung-Ju;Lee, Jeung-Eun;Kang, Min-Kyu;Nam, Hee-Rim;Huh, Seung-Jae
    • Radiation Oncology Journal / v.21 no.4 / pp.253-260 / 2003
  • Purpose: No general consensus has been reached regarding the necessity of postoperative radiation therapy (PORT) and the optimal techniques of its application for patients with chest wall invasion (pT3cw) and node negative (N0) non-small cell lung cancer (NSCLC). We retrospectively analyzed the pT3cwN0 NSCLC patients who received PORT because of a presumed inadequate resection margin on surgical findings. Materials and Methods: From Aug. 1994 to June 2000, 21 pT3cwN0 NSCLC patients received PORT at Samsung Medical Center, all of whom underwent curative en bloc resection of the primary tumor plus the chest wall and regional lymph node dissection. PORT was typically started 3 to 4 weeks after the operation using 6 or 10 MV X-rays from a linear accelerator. The radiation target volume was confined to the tumor bed plus the immediately adjacent tissue, and no regional lymphatics were included. The planned radiation dose was 54 Gy on a conventional fractionation schedule. The survival rates were calculated and the failure patterns analyzed. Results: Overall survival, disease-free survival, loco-regional recurrence-free survival, and distant metastases-free survival rates at 5 years were 38.8%, 45.5%, 90.2%, and 48.1%, respectively. Eleven patients experienced treatment failure: six with distant metastases, three with intra-thoracic failures, and two with combined distant and intra-thoracic failures. Among the five patients with intra-thoracic failures, two had pleural seeding, two had in-field local failures, and only one had regional lymphatic failure in the mediastinum. No patients suffered from acute or late radiation side effects of RTOG grade 3 or higher.
Conclusion: The strategy of adding PORT to surgery to improve the probability, not only of local control but also of survival, was justified, considering that local control was the most important component in the successful treatment of pT3cw NSCLC patients, especially when the resection margin was not adequate. By eliminating the regional lymphatics from the radiation target volume, the incidence and the severity of the acute and late side effects of PORT were markedly reduced, which contributed to improving the patients' quality of life both during and after PORT, without increasing the risk of regional failures.

Studies on the Breeding of Cold Hardiness and Technique of Overwintering Cultivation in Citrus (감귤(柑橘)의 내한성(耐寒性) 품종육성(品種育成) 및 내한(耐寒) 재배기술(栽培技術)에 관(關)한 연구(硏究))

  • Kim, Chi-moon;Song, Ho-kyung;Kim, Chung-suk
    • Korean Journal of Agricultural Science / v.4 no.2 / pp.126-140 / 1977
  • The present studies were carried out to breed cold-resistant clones of citrus and to improve overwintering techniques for citrus on Jeju Island and in other southern regions. The results obtained were as follows.
    1. In the vinyl house covered with two sheets of straw mat, 12 individuals showed no injury and 15 were slightly injured in the leaf freezing test at $-9^{\circ}C$ for 2 hours.
    2. In a vinyl house covered with straw mat, mulched with vinyl film and heated by sunlight, the inside temperature did not fall below $-3^{\circ}C$ and the ground temperature stayed above $0^{\circ}C$ during winter, even when the outdoor temperature fell to $-15^{\circ}C$ (Daejeon area).
    3. A vinyl tunnel inside the vinyl house and vinyl film mulching on the ground were more effective in preventing heat loss from the house, but there was no significant difference among the colors of the vinyl film covering the tunnel.
    4. In the vertical distribution of maximum temperature in the vinyl house, the upper space was slightly warmer than the lower position under high-temperature conditions, while under low-temperature conditions the minimum temperature was highest at the middle position, followed by the ground surface and the upper position.
    5. In the horizontal distribution of temperature in the vinyl house, the ground and surface temperatures of the north side were lower than those of the other sides; citrus planted within 30 cm of the north side wall died of cold injury, and citrus near the walls on the other sides showed slight symptoms of cold injury.
    6. An insulating trench ($30 \times 30 \times 30$ cm) packed with straw bundles and installed under the north wall might be effective in preventing loss of ground heat.
    7. On cloudy and snowy days, the temperature difference between indoors and outdoors was smaller, and the indoor temperature remained high during the night because heat loss was prevented.
    8. The highest temperature of the day was observed at about 3 P.M. both inside and outside the vinyl house, and the lowest temperature at about 6 A.M. The difference between the highest and lowest indoor temperatures was $19^{\circ}C$ on a sunny day, compared with $9^{\circ}C$ on a cloudy or snowy day in late November. In particular, the drop in temperature on a snowy day was so small that the temperature curve was comparatively constant.
    9. If effective methods of citrus cultivation in vinyl houses are applied together with improved clones such as hardy, semi-dwarf and spur types, it might be possible to cultivate citrus trees safely on Jeju Island as well as in other southern regions and to expand citrus cultivation.

Professional Speciality of Communication Administration and, Occupational Group and Series Classes of Position in National Public Official Law -for Efficiency of Telecommunication Management- (통신행정의 전문성과 공무원법상 직군렬 - 전기통신의 관리들 중심으로-)

  • 조정현
    • The Journal of Korean Institute of Communications and Information Sciences / v.3 no.1 / pp.26-27 / 1978
  • Intelligence and knowledge can be expected to become the core of the post-industrial society in the near future. Accordingly, the age of intelligence will accelerate, and we will find ourselves in an age of 'Communication' service enterprise. Communication activities will increase in efficiency and multiply in utility thanks to their scientific principles and legal ideas. The two basic elements of communication activity, that is, communication stations and communication personnel, are considered to perform their function when they are properly supported and managed by the government administration. Since communication activity itself is composed of various factors, elements such as communication stations and officials must be cultivated and managed by specialists or experts with continuous and extensive study and practice in the field. With this in mind, this study reviewed our public service officials law with a view to improving it, offering some suggestions so that communication experts and researchers can find suitable positions in the framework of government administration. In this study, I would like to suggest an 'Occupational Group of Communication' consisting of a series of communication management positions and research positions, in parallel with the existing series of communication technical positions. The communication specialist or expert is required to have the necessary scientific knowledge and techniques of communication, as well as the prerequisites of a government service official. Communication experts must first obtain the relevant government licence in accordance with government laws and regulations and international custom before they can be appointed to official positions. This system of licence prior to appointment applies principally to communication management positions.
Communication research positions are for those who engage in study and research of both a managerial and a technical nature. It is hoped that efficient and extensive management of communication activities, as well as scientific and continuous study of the communication enterprise, will be upgraded at the national level.

Customer Behavior Prediction of Binary Classification Model Using Unstructured Information and Convolution Neural Network: The Case of Online Storefront (비정형 정보와 CNN 기법을 활용한 이진 분류 모델의 고객 행태 예측: 전자상거래 사례를 중심으로)

  • Kim, Seungsoo;Kim, Jongwoo
    • Journal of Intelligence and Information Systems / v.24 no.2 / pp.221-241 / 2018
  • Deep learning has recently been attracting attention. The deep learning technique applied in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) and in AlphaGo is the Convolution Neural Network (CNN). CNN is characterized by dividing the input image into small sections, recognizing their partial features, and combining them to recognize the whole. Deep learning technologies are expected to bring many changes to our lives, but until now their applications have been limited to image recognition and natural language processing. The use of deep learning techniques for business problems is still at an early research stage. If their performance is proved, they can be applied to traditional business problems such as future marketing response prediction, fraud transaction detection, bankruptcy prediction, and so on. It is therefore a very meaningful experiment to assess the possibility of solving business problems using deep learning technologies, based on the case of online shopping companies, which have big data, whose customer behavior is relatively easy to identify, and whose data have high utilization value. In online shopping companies in particular, the competitive environment is changing rapidly and becoming more intense. Therefore, the analysis of customer behavior for maximizing profit is becoming more and more important for online shopping companies. In this study, we propose a 'CNN model of Heterogeneous Information Integration' as a way to improve the power to predict customer behavior in online shopping enterprises.
The proposed model learns with a convolution neural network combined with a multi-layer perceptron structure, using both structured and unstructured information. To optimize its performance, we design three architectural components, 'heterogeneous information integration', 'unstructured information vector conversion', and 'multi-layer perceptron design', evaluate the performance of each, and finalize the proposed model based on the results. The target variables for predicting customer behavior are defined as six binary classification problems: re-purchaser, churn, frequent shopper, frequent refund shopper, high amount shopper, and high discount shopper. To verify the usefulness of the proposed model, we conducted experiments using the actual transaction, customer, and VOC data of a specific online shopping company in Korea. The data cover 47,947 customers who registered at least one VOC during the single month of January 2011. The profiles of these customers, a total of 19 months of trading data from September 2010 to March 2012, and the VOCs posted during that month are used. The experiment of this study is divided into two stages. In the first stage, we evaluate the three architectural components that affect the performance of the proposed model and select optimal parameters. In the second stage, we evaluate the performance of the proposed model. Experimental results show that the proposed model, which combines both structured and unstructured information, is superior to NBC (Naïve Bayes classification), SVM (Support vector machine), and ANN (Artificial neural network). It is therefore significant that the use of unstructured information contributes to predicting customer behavior, and that CNN can be applied to solve business problems as well as image recognition and natural language processing problems.
The experiments also confirm that CNN is more effective at understanding and interpreting contextual meaning in text VOC data. It is also significant that this empirical study, based on actual data from an e-commerce company, shows that very meaningful information for predicting customer behavior can be extracted from VOC data written in free text directly by customers. Finally, the various experiments provide useful information for future research on parameter selection and model performance.
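As a rough illustration of what 'heterogeneous information integration' and 'unstructured information vector conversion' could look like at the input stage, the minimal sketch below turns a VOC text into a fixed-length vector and concatenates it with structured customer features. It is not the authors' architecture: the hashed bag-of-words encoding, the function names `text_to_vector` and `integrate`, the dimension of 8, and the sample values are all assumptions made for illustration, and the CNN itself is omitted.

```python
import zlib

def text_to_vector(text, dim=8):
    """Hashed bag-of-words (an assumed encoding, not the paper's):
    count each token into one of `dim` buckets."""
    vec = [0.0] * dim
    for token in text.lower().split():
        # crc32 gives a hash that is stable across runs, unlike built-in hash()
        bucket = zlib.crc32(token.encode("utf-8")) % dim
        vec[bucket] += 1.0
    return vec

def integrate(structured, voc_text, dim=8):
    """Concatenate structured features with the text vector into one input."""
    return list(structured) + text_to_vector(voc_text, dim)

# Hypothetical customer: 3 structured features plus one VOC message.
structured = [3.0, 120.5, 1.0]
features = integrate(structured, "delivery was late and refund was slow")
print(len(features))
```

The combined vector would then be fed to whichever classifier is trained for each of the six binary targets.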

Automatic Quality Evaluation with Completeness and Succinctness for Text Summarization (완전성과 간결성을 고려한 텍스트 요약 품질의 자동 평가 기법)

  • Ko, Eunjung;Kim, Namgyu
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.2
    • /
    • pp.125-148
    • /
    • 2018
  • Recently, as the demand for big data analysis has increased, cases of analyzing unstructured data and using the results have also increased. Among the various types of unstructured data, text is used as a means of communicating information in almost all fields. In addition, many analysts are interested in text because the amount of data is very large and it is relatively easy to collect compared to other unstructured and structured data. Among the various text analysis applications, the following have been actively studied: document classification, which classifies documents into predetermined categories; topic modeling, which extracts major topics from a large number of documents; sentiment analysis or opinion mining, which identifies emotions or opinions contained in texts; and text summarization, which summarizes the main contents of one or several documents. Text summarization in particular is actively applied in business through news summary services, privacy policy summary services, etc. In academia, much research has followed either the extraction approach, which selectively provides the main elements of the document, or the abstraction approach, which extracts elements of the document and composes new sentences by combining them. However, techniques for evaluating the quality of automatically summarized documents have not progressed as much as techniques for automatic text summarization. Most existing studies on evaluating summarization quality manually summarized documents, used the results as reference documents, and measured the similarity between the automatic summary and the reference document. Specifically, an automatic summary is produced from the full text through various techniques, and it is compared with a reference document, which is regarded as an ideal summary, to measure the quality of the automatic summarization. 
Reference documents are provided in two major ways. The most common is manual summarization, in which a person creates an ideal summary by hand. Since this method requires human intervention in preparing the summary, writing it takes a lot of time and cost, and the evaluation result may differ depending on who writes the summary. To overcome these limitations, attempts have been made to measure the quality of summary documents without human intervention. As a representative attempt, a method has recently been devised that reduces the size of the full text and measures the similarity between the reduced full text and the automatic summary. In this method, the more the frequent terms of the full text appear in the summary, the better the quality of the summary is judged to be. However, since summarization essentially means condensing a large amount of content while minimizing omissions, it is unreasonable to say that a summary judged "good" on frequency alone is always a "good summary" in the essential sense. To overcome this limitation of previous work on summarization evaluation, this study proposes a method for automatic quality evaluation of text summarization based on the essential meaning of summarization. Specifically, succinctness is defined as an element indicating how little content is duplicated among the sentences of the summary, and completeness is defined as an element indicating how little of the content is left out of the summary. The proposed evaluation method is built on these two concepts. 
To evaluate the practical applicability of the proposed methodology, we extracted 29,671 sentences from TripAdvisor's hotel reviews, summarized the reviews for each hotel, and present the results of experiments evaluating the quality of the summaries according to the proposed methodology. We also provide a way to integrate completeness and succinctness, which are in a trade-off relationship, into an F-score, and propose a method for performing optimal summarization by varying the sentence-similarity threshold.
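The trade-off between completeness and succinctness, and their combination into an F-score, can be sketched as follows. The abstract does not give the exact formulas, so everything here is an assumption for illustration: sentence similarity is taken as Jaccard overlap of word sets, completeness as the share of source sentences matched by some summary sentence at or above the threshold, succinctness as the share of summary sentences that do not duplicate an earlier one, and the F-score as their harmonic mean. The function names and the sample review sentences are hypothetical.

```python
def jaccard(a, b):
    """Word-set overlap between two sentences (assumed similarity measure)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)

def completeness(source, summary, threshold):
    """Share of source sentences covered by at least one summary sentence."""
    if not source:
        return 0.0
    covered = sum(
        1 for s in source
        if any(jaccard(s, t) >= threshold for t in summary)
    )
    return covered / len(source)

def succinctness(summary, threshold):
    """Share of summary sentences that do not repeat an earlier sentence."""
    if not summary:
        return 0.0
    unique = sum(
        1 for i, t in enumerate(summary)
        if not any(jaccard(t, earlier) >= threshold for earlier in summary[:i])
    )
    return unique / len(summary)

def f_score(c, s):
    """Harmonic mean of completeness and succinctness."""
    return 0.0 if c + s == 0 else 2 * c * s / (c + s)

# Hypothetical hotel-review sentences and a candidate summary.
source = ["the room was clean", "the room was very clean",
          "breakfast was great", "staff were rude"]
summary = ["the room was clean", "the room was clean", "breakfast was great"]

c = completeness(source, summary, threshold=0.5)
s = succinctness(summary, threshold=0.5)
f = f_score(c, s)
print(c, s, f)
```

Under these assumptions, adding sentences to a summary tends to raise completeness but risks lowering succinctness, which is why a single combined score is useful when comparing candidate summaries produced at different similarity thresholds.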