• Title/Summary/Keyword: design verification (설계 검증)


A Study on the Prediction Model of Stock Price Index Trend based on GA-MSVM that Simultaneously Optimizes Feature and Instance Selection (입력변수 및 학습사례 선정을 동시에 최적화하는 GA-MSVM 기반 주가지수 추세 예측 모형에 관한 연구)

  • Lee, Jong-sik;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems / v.23 no.4 / pp.147-168 / 2017
  • Accurate stock market forecasting has long been studied in academia, and many forecasting models built on a variety of techniques now exist. Recently, many attempts have been made to predict the stock index using machine learning methods, including deep learning. Both fundamental analysis and technical analysis are used in traditional stock investment, but technical analysis is better suited to short-term trading prediction and to the application of statistical and mathematical techniques. Most studies based on technical indicators have modeled stock price prediction as a binary classification (rising or falling) of the market's future movement, usually for the next trading day. However, binary classification has many drawbacks for predicting trends, identifying trading signals, or signaling portfolio rebalancing. In this study, we extend the existing binary scheme to a multi-class scheme of stock index trends (upward trend, boxed, downward trend). Techniques such as Multinomial Logistic Regression Analysis (MLOGIT), Multiple Discriminant Analysis (MDA), or Artificial Neural Networks (ANN) can be applied to this multi-class problem. We instead build on Multi-classification Support Vector Machines (MSVM), which have proved superior in prediction performance, and propose an optimization model that uses a Genetic Algorithm as a wrapper to improve the performance of MSVM. In particular, the proposed model, named GA-MSVM, is designed to maximize model performance by simultaneously optimizing the kernel function parameters of MSVM, the selection of input variables (feature selection), and instance selection.
To verify the performance of the proposed model, we applied it to real data. The results show that the proposed method predicts better than the conventional MSVM, which has been known to show the best prediction performance so far, as well as existing artificial intelligence and data mining techniques such as MDA, MLOGIT, and CBR. In particular, instance selection was confirmed to play a very important role in predicting the stock index trend, contributing more to the improvement of the model than the other factors. To verify the usefulness of GA-MSVM, we applied it to trend forecasting of Korea's KOSPI200 stock index. Our research is primarily aimed at predicting trend segments in order to capture trading signals or short-term trend transition points. The experimental data set includes technical indicators such as the price and volatility index (2004 to 2017) and macroeconomic data (interest rate, exchange rate, S&P 500, etc.) for the KOSPI200 stock index in Korea. Using a variety of statistical methods, including one-way ANOVA and stepwise MDA, 15 indicators were selected as candidate independent variables. The dependent variable, trend classification, takes one of three states: 1 (upward trend), 0 (boxed), and -1 (downward trend). For each class, 70% of the data was used for training and the remaining 30% for validation. To verify the performance of the proposed model, comparative experiments were conducted with MDA, MLOGIT, CBR, ANN, and MSVM. MSVM adopted the One-Against-One (OAO) approach, which is known as the most accurate of the various MSVM approaches. Although there are some limitations, the final experimental results demonstrate that the proposed model, GA-MSVM, performs at a significantly higher level than all comparative models.
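The simultaneous optimization described in this abstract can be pictured as a single binary chromosome that the genetic algorithm evolves. The following is only a minimal sketch of how such a chromosome might be decoded. The layout (feature bits, then instance bits, then parameter bits), the parameter names `C` and `gamma`, their ranges, and all function names are assumptions for illustration; the abstract does not specify the encoding or the kernel used.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Candidate:
    feature_mask: List[bool]   # True = keep this input variable
    instance_mask: List[bool]  # True = keep this training case
    C: float                   # assumed SVM penalty parameter
    gamma: float               # assumed kernel width parameter


def bits_to_log_scale(bits: Sequence[int], low: float, high: float) -> float:
    """Map a bit string to a value in [low, high] on a logarithmic scale."""
    as_int = int("".join(str(b) for b in bits), 2)
    frac = as_int / (2 ** len(bits) - 1)
    return low * (high / low) ** frac


def decode(chrom: Sequence[int], n_features: int, n_instances: int,
           param_bits: int = 8) -> Candidate:
    """Split one chromosome into feature mask, instance mask, and parameters."""
    expected = n_features + n_instances + 2 * param_bits
    if len(chrom) != expected:
        raise ValueError(f"expected {expected} genes, got {len(chrom)}")
    f_end = n_features
    i_end = f_end + n_instances
    c_end = i_end + param_bits
    return Candidate(
        feature_mask=[bool(g) for g in chrom[:f_end]],
        instance_mask=[bool(g) for g in chrom[f_end:i_end]],
        C=bits_to_log_scale(chrom[i_end:c_end], 0.01, 100.0),    # assumed range
        gamma=bits_to_log_scale(chrom[c_end:], 0.001, 10.0),     # assumed range
    )


def apply_masks(X, y, cand: Candidate):
    """Keep only the selected training cases (rows) and input variables (columns)."""
    rows = [i for i, keep in enumerate(cand.instance_mask) if keep]
    cols = [j for j, keep in enumerate(cand.feature_mask) if keep]
    return [[X[i][j] for j in cols] for i in rows], [y[i] for i in rows]
```

In a wrapper approach, the fitness of each chromosome would typically be the hold-out accuracy of a classifier trained on the reduced data with the decoded parameters; that step is omitted here because it depends on the SVM implementation used.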

Application of Support Vector Regression for Improving the Performance of the Emotion Prediction Model (감정예측모형의 성과개선을 위한 Support Vector Regression 응용)

  • Kim, Seongjin;Ryoo, Eunchung;Jung, Min Kyu;Kim, Jae Kyeong;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems / v.18 no.3 / pp.185-202 / 2012
  • Since the value of information has been recognized in the information society, the collection and use of information have become important. Like an artistic painting, a facial expression carries a wealth of information that could take thousands of words to describe. Following this idea, there have recently been a number of attempts to provide customers and companies with intelligent services that perceive human emotions through facial expressions. For example, MIT Media Lab, the leading organization in this research area, has developed a human emotion prediction model and has applied its studies to commercial business. In academia, conventional methods such as Multiple Regression Analysis (MRA) or Artificial Neural Networks (ANN) have been applied to predict human emotion in prior studies. However, MRA is generally criticized for its low prediction accuracy. This is inevitable, since MRA can only explain linear relationships between the dependent variable and the independent variables. To mitigate the limitations of MRA, some studies such as Jung and Kim (2012) have used ANN as an alternative, and they reported that ANN generated more accurate predictions than statistical methods like MRA. However, ANN has also been criticized for overfitting and for the difficulty of network design (e.g., setting the number of layers and the number of nodes in the hidden layers). Against this background, we propose a novel model using Support Vector Regression (SVR) in order to increase the prediction accuracy. SVR is an extension of Support Vector Machine (SVM) designed to solve regression problems. The model produced by SVR depends only on a subset of the training data, because the cost function for building the model ignores any training data that is close (within a threshold ${\varepsilon}$) to the model prediction.
Using SVR, we tried to build a model that can measure the level of arousal and valence from facial features. To validate the usefulness of the proposed model, we collected data on facial reactions to appropriate visual stimuli and extracted features from the data. Next, preprocessing steps were taken to choose statistically significant variables. In total, 297 cases were used for the experiment. As comparative models, we also applied MRA and ANN to the same data set. For SVR, we adopted the '${\varepsilon}$-insensitive loss function' and the 'grid search' technique to find the optimal values of parameters such as C, d, ${\sigma}^2$, and ${\varepsilon}$. For ANN, we adopted a standard three-layer backpropagation network, which has a single hidden layer. The learning rate and momentum rate of ANN were set to 10%, and we used the sigmoid function as the transfer function of the hidden and output nodes. We performed the experiments repeatedly, varying the number of nodes in the hidden layer over n/2, n, 3n/2, and 2n, where n is the number of input variables. The stopping condition for ANN was set to 50,000 learning events. We used MAE (Mean Absolute Error) as the measure for performance comparison. In the experiment, SVR achieved the highest prediction accuracy on the hold-out data set compared to MRA and ANN, regardless of the target variable (the level of arousal, or the level of positive / negative valence). ANN also outperformed MRA, but it showed considerably lower prediction accuracy than SVR for both target variables. The findings of our research are expected to be useful to researchers and practitioners who want to build models for recognizing human emotions.
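The ${\varepsilon}$-insensitive loss and the MAE measure mentioned in this abstract can be written in a few lines. This is a minimal sketch with hypothetical function names, not the authors' code: errors smaller than ${\varepsilon}$ cost nothing, which is why only a subset of the training data shapes the SVR model.

```python
from typing import Sequence


def epsilon_insensitive_loss(y_true: float, y_pred: float, epsilon: float) -> float:
    """Loss is zero inside the epsilon tube and grows linearly outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between targets and predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
```

For example, with `epsilon = 0.1`, a prediction that misses the target by 0.05 contributes no loss, while one that misses by 0.5 contributes about 0.4.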

Analyzing the User Intention of Booth Recommender System in Smart Exhibition Environment (스마트 전시환경에서 부스 추천시스템의 사용자 의도에 관한 조사연구)

  • Choi, Jae Ho;Xiang, Jun-Yong;Moon, Hyun Sil;Choi, Il Young;Kim, Jae Kyeong
    • Journal of Intelligence and Information Systems / v.18 no.3 / pp.153-169 / 2012
  • Exhibitions play a key role as an effective marketing activity that directly introduces services and products to current and potential customers. By participating in exhibitions, exhibitors gain the opportunity for face-to-face contact, which helps them secure market share and improve their corporate image. Given this economic importance, show organizers try to adopt new IT technologies to improve their performance, and researchers have studied services that improve visitor satisfaction by analyzing visitors' visit patterns. In particular, as smart technologies allow visitors' activities to be monitored in real time, researchers have considered booth recommender systems, which infer visitors' preferences and recommend suitable services to them as in an online environment. However, while many studies aim to improve the performance of these systems through new technological development, they have not considered the factors that lead visitors to choose booth recommender systems. That is, studies of the factors that can influence the development direction and effective diffusion of these systems are insufficient. Most prior studies on the acceptance of new technologies and the intention of continued use have adopted the Technology Acceptance Model (TAM) and the Extended Technology Acceptance Model (ETAM). Although booth recommender systems may not be a new technology in themselves, since they resemble commercial recommender systems such as book recommender systems, they can be considered a new technology in the smart exhibition environment. However, to account for the smart exhibition environment beyond TAM, measurements of reuse intention should focus on how well booth recommender systems provide correct information to visitors.
In this study, through a literature review, we derive factors that can influence visitors' satisfaction with and intention to reuse booth recommender systems, and design a model to forecast visitors' adoption of booth recommendation in the exhibition environment. For these purposes, we surveyed visitors who attended DMC Culture Open in November 2011 and experienced booth recommender systems on their own smartphones, and tested the hypotheses by regression analysis. As a result, the factors that influence visitors' satisfaction with booth recommender systems include effectiveness, perceived ease of use, argument quality, and serendipity. Moreover, satisfaction with booth recommender systems has a positive relationship with the development of reuse intention. These results offer some insights for booth recommender systems in the smart exhibition environment. First, this study identifies the important factors that show organizers should consider when they establish strategies to induce visitors to use booth recommender systems consistently. Although show organizers have recently tried to improve their performance using new IT technologies, their visitors have not felt satisfied with these efforts. In this respect, this study can help organizers provide services that improve visitor satisfaction and maintain lasting relationships with visitors. Second, this study suggests that organizers should manage booth recommender systems differently according to the stage of use. For example, in the early stage of adoption, they should focus on argument quality, perceived ease of use, and serendipity in order to improve the acceptance of booth recommender systems. After this stage, they should bridge the gap between expectation and perception of booth recommender systems and encourage visitors' continued use. However, this study has some limitations. We use only four factors that can influence the satisfaction of visitors.
Therefore, we should develop our model to consider additional important factors. In addition, the exhibition in our experiment had a small number of booths, so visitors may not have needed a booth recommender system. In future work, we will conduct experiments in a larger-scale exhibition environment.

A Thermal Time-Driven Dormancy Index as a Complementary Criterion for Grape Vine Freeze Risk Evaluation (포도 동해위험 판정기준으로서 온도시간 기반의 휴면심도 이용)

  • Kwon, Eun-Young;Jung, Jea-Eun;Chung, U-Ran;Lee, Seung-Jong;Song, Gi-Cheol;Choi, Dong-Geun;Yun, Jin-I.
    • Korean Journal of Agricultural and Forest Meteorology / v.8 no.1 / pp.1-9 / 2006
  • Despite the recently observed warmer winters in Korea, more freeze injuries and associated economic losses are being reported in the fruit industry than ever before. Existing freeze-frost forecasting systems employ only daily minimum temperature for judging the potential damage to dormant flowering buds and cannot accommodate potential biological responses such as short-term acclimation of plants to severe weather episodes or annual variation in climate. We introduce 'dormancy depth', in addition to daily minimum temperature, as a complementary criterion for judging the potential damage of freezing temperatures to dormant flowering buds of grape vines. Dormancy depth can be estimated by a phenology model driven by daily maximum and minimum temperature and is expected to be a reasonable proxy for the physiological tolerance of buds to low temperature. Dormancy depth at a selected site was estimated for a climatological normal year by this model, and we found a close similarity in the time course between the estimated dormancy depth and the known cold tolerance of fruit trees. Inter-annual and spatial variation in dormancy depth were identified by this method, showing the feasibility of using dormancy depth as a proxy indicator for tolerance to low temperature during the winter season. The model was applied to 10 vineyards that were recently damaged by a cold spell, and a temperature-dormancy depth-freeze injury relationship was formulated into an exponential-saturation model that can be used for judging freeze risk under a given set of temperature and dormancy depth. Based on this model and the expected lowest temperature with a 10-year recurrence interval, a freeze risk probability map was produced for Hwaseong County, Korea. The results seemed to explain why the vineyards in the warmer part of Hwaseong County have been hit by more freeze damage than those in the cooler part of the county.
A dormancy depth-minimum temperature dual engine freeze warning system was designed for vineyards in major production counties in Korea by combining the site-specific dormancy depth and minimum temperature forecasts with the freeze risk model. In this system, daily accumulation of thermal time since last fall leads to the dormancy state (depth) for today. The regional minimum temperature forecast for tomorrow by the Korea Meteorological Administration is converted to the site specific forecast at a 30m resolution. These data are input to the freeze risk model and the percent damage probability is calculated for each grid cell and mapped for the entire county. Similar approaches may be used to develop freeze warning systems for other deciduous fruit trees.

Design and Implementation of MongoDB-based Unstructured Log Processing System over Cloud Computing Environment (클라우드 환경에서 MongoDB 기반의 비정형 로그 처리 시스템 설계 및 구현)

  • Kim, Myoungjin;Han, Seungho;Cui, Yun;Lee, Hanku
    • Journal of Internet Computing and Services / v.14 no.6 / pp.71-84 / 2013
  • Log data, which record the multitude of information created when operating computer systems, are utilized in many processes, from carrying out computer system inspection and process optimization to providing customized user optimization. In this paper, we propose a MongoDB-based unstructured log processing system in a cloud environment for processing the massive amount of log data of banks. Most of the log data generated during banking operations come from handling a client's business. Therefore, in order to gather, store, categorize, and analyze the log data generated while processing the client's business, a separate log data processing system needs to be established. However, the realization of flexible storage expansion functions for processing a massive amount of unstructured log data and executing a considerable number of functions to categorize and analyze the stored unstructured log data is difficult in existing computer environments. Thus, in this study, we use cloud computing technology to realize a cloud-based log data processing system for processing unstructured log data that are difficult to process using the existing computing infrastructure's analysis tools and management system. The proposed system uses the IaaS (Infrastructure as a Service) cloud environment to provide a flexible expansion of computing resources and includes the ability to flexibly expand resources such as storage space and memory under conditions such as extended storage or rapid increase in log data. Moreover, to overcome the processing limits of the existing analysis tool when a real-time analysis of the aggregated unstructured log data is required, the proposed system includes a Hadoop-based analysis module for quick and reliable parallel-distributed processing of the massive amount of log data. 
Furthermore, because the HDFS (Hadoop Distributed File System) stores data by generating copies of the block units of the aggregated log data, the proposed system offers automatic restore functions that allow the system to continue operating after it recovers from a malfunction. Finally, by establishing a distributed database using the NoSQL-based MongoDB, the proposed system provides methods of effectively processing unstructured log data. Relational databases such as MySQL have complex schemas that are inappropriate for processing unstructured log data. Further, databases with strict schemas, such as relational databases, cannot easily expand across nodes when rapidly growing data must be distributed to multiple nodes. NoSQL does not provide the complex computations that relational databases may provide, but it can easily expand the database through node dispersion when the amount of data increases rapidly; it is a non-relational database with an appropriate structure for processing unstructured data. NoSQL data models are usually classified as key-value, column-oriented, and document-oriented types. Of these, MongoDB, a representative document-oriented database with a free schema structure, is used in the proposed system. MongoDB is introduced to the proposed system because it makes it easy to process unstructured log data through a flexible schema structure, facilitates flexible node expansion when the amount of data is rapidly increasing, and provides an Auto-Sharding function that automatically expands storage. The proposed system is composed of a log collector module, a log graph generator module, a MongoDB module, a Hadoop-based analysis module, and a MySQL module.
When the log data generated over the entire client business process of each bank are sent to the cloud server, the log collector module collects and classifies the data according to the type of log data and distributes it to the MongoDB module and the MySQL module. The log graph generator module generates the results of the log analysis of the MongoDB module, the Hadoop-based analysis module, and the MySQL module per analysis time and type of the aggregated log data, and provides them to the user through a web interface. Log data that require real-time analysis are stored in the MySQL module and provided in real time by the log graph generator module. The log data aggregated per unit time are stored in the MongoDB module and plotted in a graph according to the user's various analysis conditions. The aggregated log data in the MongoDB module are processed in a parallel-distributed manner by the Hadoop-based analysis module. A comparative evaluation is carried out against a log data processing system that uses only MySQL for inserting log data and estimating query performance; this evaluation demonstrates the proposed system's superiority. Moreover, an optimal chunk size is confirmed through the log data insert performance evaluation of MongoDB for various chunk sizes.
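The routing role of the log collector module described in this abstract can be sketched as follows. This is a hypothetical illustration only: the record fields `type` and `realtime`, the function name `route_logs`, and the use of in-memory dictionaries in place of the MySQL and MongoDB modules are all assumptions, since the abstract does not give the record format or the storage interfaces.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def route_logs(records: Iterable[dict]) -> Dict[str, Dict[str, List[dict]]]:
    """Classify log records by type and split them between two stores.

    Records flagged for real-time analysis go to the (stand-in) MySQL store;
    all other records go to the (stand-in) MongoDB store.
    """
    stores = {"mysql": defaultdict(list), "mongodb": defaultdict(list)}
    for record in records:
        # Hypothetical flag: True means the record needs real-time analysis.
        target = "mysql" if record.get("realtime", False) else "mongodb"
        # Hypothetical field: group records by their log type.
        stores[target][record.get("type", "unknown")].append(record)
    return {name: dict(groups) for name, groups in stores.items()}
```

A real collector would write to the databases instead of returning dictionaries; keeping the routing decision in one small function makes it easy to change the rule that separates real-time logs from aggregated ones.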

A Study on the Effect of Booth Recommendation System on Exhibition Visitors Unplanned Visit Behavior (전시장 참관객의 계획되지 않은 방문행동에 있어서 부스추천시스템의 영향에 대한 연구)

  • Chung, Nam-Ho;Kim, Jae-Kyung
    • Journal of Intelligence and Information Systems / v.17 no.4 / pp.175-191 / 2011
  • With the MICE(Meeting, Incentive travel, Convention, Exhibition) industry coming into the spotlight, there has been a growing interest in the domestic exhibition industry. Accordingly, in Korea, various studies of the industry are being conducted to enhance exhibition performance as in the United States or Europe. Some studies are focusing particularly on analyzing visiting patterns of exhibition visitors using intelligent information technology in consideration of the variations in effects of watching exhibitions according to the exhibitory environment or technique, thereby understanding visitors and, furthermore, drawing the correlations between exhibiting businesses and improving exhibition performance. However, previous studies related to booth recommendation systems only discussed the accuracy of recommendation in the aspect of a system rather than determining changes in visitors' behavior or perception by recommendation. A booth recommendation system enables visitors to visit unplanned exhibition booths by recommending visitors suitable ones based on information about visitors' visits. Meanwhile, some visitors may be satisfied with their unplanned visits, while others may consider the recommending process to be cumbersome or obstructive to their free observation. In the latter case, the exhibition is likely to produce worse results compared to when visitors are allowed to freely observe the exhibition. Thus, in order to apply a booth recommendation system to exhibition halls, the factors affecting the performance of the system should be generally examined, and the effects of the system on visitors' unplanned visiting behavior should be carefully studied. As such, this study aims to determine the factors that affect the performance of a booth recommendation system by reviewing theories and literature and to examine the effects of visitors' perceived performance of the system on their satisfaction of unplanned behavior and intention to reuse the system. 
Toward this end, the unplanned behavior theory was adopted as the theoretical framework. Unplanned behavior can be defined as "behavior that is done by consumers without any prearranged plan". Thus far, consumers' unplanned behavior has been studied in various fields. The field of marketing, in particular, has focused on unplanned purchasing among various types of unplanned behavior, which has often been confused with impulsive purchasing. Nevertheless, the two are different from each other; while impulsive purchasing means strong, continuous urges to purchase things, unplanned purchasing is behavior with purchasing decisions that are made inside a store, not before going into one. In other words, all impulsive purchases are unplanned, but not all unplanned purchases are impulsive. Then why do consumers engage in unplanned behavior? Scholars have made various suggestions regarding this question, but there is a consensus that it is because consumers have enough flexibility to change their plans midway instead of developing plans thoroughly. In other words, if unplanned behavior costs much, it will be difficult for consumers to change their prearranged plans. In the case of the exhibition hall examined in this study, visitors learn the programs of the hall and plan which booths to visit in advance. This is because it is practically impossible for visitors to visit all of the various booths that an exhibition operates due to their limited time. Therefore, if the booth recommendation system proposed in this study recommends booths that visitors may like, they can change their plans and visit the recommended booths. Such visiting behavior can be regarded as similar to consumers' visits to a store or tourists' unplanned behavior at a tourist spot, and can be understood in the same context as the recent increase in tourism consumers' unplanned behavior influenced by information devices. Thus, the following research model was established.
This research model uses visitors' perceived performance of a booth recommendation system as the mediating variable, and the factors affecting the performance include trust in the system, exhibition visitors' knowledge levels, expected personalization of the system, and the system's threat to freedom. In addition, the causal relation between visitors' perceived performance of the system and their satisfaction with unplanned behavior and intention to reuse the system was examined. Trust in the booth recommendation system was modeled as a second-order factor consisting of competence, benevolence, and integrity, while the other factors were first-order factors. To verify this model, a booth recommendation system was developed and tested at the 2011 DMC Culture Open, and data from 101 visitors were empirically analyzed. The results are as follows. First, visitors' trust was the most important factor in the booth recommendation system, and the visitors who used the system perceived its performance as a success based on their trust. Second, visitors' knowledge levels also had significant effects on the performance of the system, which indicates that the performance of a recommendation system requires an advance understanding. In other words, visitors with higher levels of understanding of the exhibition hall better recognized the usefulness of the booth recommendation system. Third, expected personalization did not have significant effects, which differs from the results of previous studies. This is presumably because the booth recommendation system used in this study did not provide enough personalized services. Fourth, the recommendation information provided by the booth recommendation system was not considered to threaten or restrict one's freedom, which means it is valuable in terms of usefulness.
Lastly, high performance of the booth recommendation system led to visitors' high levels of satisfaction with unplanned behavior and intention to reuse the system. To sum up, in order to analyze the effects of a booth recommendation system on visitors' unplanned visits to a booth, empirical data were examined based on the unplanned behavior theory and, accordingly, useful suggestions for the establishment and design of future booth recommendation systems were made. In the future, further examination should be conducted with more elaborate survey questions and a broader range of survey subjects.