• Title/Summary/Keyword: grid search


Hybrid Machine Learning Model for Predicting the Direction of KOSPI Securities (코스피 방향 예측을 위한 하이브리드 머신러닝 모델)

  • Hwang, Heesoo
    • Journal of the Korea Convergence Society / v.12 no.6 / pp.9-16 / 2021
  • Many studies have used machine learning with stock price data and financial big data to predict the stock market. With the introduction of stock index ETFs that can be traded through HTS and MTS, research on predicting stock indices has recently attracted attention. In this paper, machine learning models for predicting upward and downward movements of the KOSPI are implemented separately. These models are optimized through a grid search of their control parameters. In addition, a hybrid machine learning model that combines the individual models is proposed to improve precision and increase the ETF trading return. The performance of the prediction models is evaluated by accuracy and by precision, which determines the ETF trading return. The accuracy and precision of the hybrid up prediction model are 72.1% and 63.8%, and those of the down prediction model are 79.8% and 64.3%. The precision of the hybrid down prediction model is improved by at least 14.3% and at most 20.5%. The hybrid up and down prediction models show ETF trading returns of 10.49% and 25.91%, respectively. Trading inverse ×2 and leveraged ETFs can increase the return by 1.5 to 2 times. Further research on a down prediction machine learning model is expected to increase the rate of return.
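
The abstract evaluates the models by accuracy and by precision. The following is a minimal sketch of how the two metrics differ for a binary up/down signal. The function name and the toy labels are invented for illustration and are not taken from the paper.

```python
def accuracy_and_precision(y_true, y_pred):
    """Return (accuracy, precision) for binary labels, where 1 is the predicted class."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    false_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    accuracy = correct / len(y_true)
    # Precision only looks at the days on which the model issued a signal.
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    return accuracy, precision


# Toy data (hypothetical): 1 = the index moved in the predicted direction.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
acc, prec = accuracy_and_precision(y_true, y_pred)
print(f"accuracy={acc:.2f}, precision={prec:.2f}")
```

Precision counts only the days on which a trade signal was issued. That is one plausible reading of why the abstract ties precision, rather than accuracy, to the ETF trading return.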

Development of an Input File Preparation Tool for Offline Coupling of DNDC and DSSAT Models (DNDC 지역별 구동을 위한 입력자료 생성 도구 개발)

  • Hyun, Shinwoo;Hwang, Woosung;You, Heejin;Kim, Kwang Soo
    • Korean Journal of Agricultural and Forest Meteorology / v.23 no.1 / pp.68-81 / 2021
  • The agricultural ecosystem is one of the major sources of greenhouse gas (GHG) emissions. To search for climate change adaptation options that mitigate GHG emissions while maintaining crop yield, it is advantageous to integrate multiple models at a high spatial resolution. The objective of this study was to develop a tool to support integrated assessment of climate change impact by coupling the DSSAT model and the DNDC model. The DNDC Regional Input File Tool (DRIFT) was developed to prepare input data for the regional mode of the DNDC model using input and output data of the DSSAT model. In a case study, GHG emissions under climate change conditions were simulated using the input data prepared by DRIFT. The time needed to prepare the input data increased with the number of grid points. Most steps of the process took a relatively short time, whereas converting the daily flood depth data of the DSSAT model to the flood period of the DNDC model took most of the time. Still, processing a large amount of data would require a long time, which could be reduced by parallelizing some calculation processes. Expanding DRIFT to other models would help reduce the time required to prepare input data for those models.
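
The conversion from daily flood depth to flood periods that the abstract singles out can be sketched as a run-length pass over the daily series. This is an illustration only: the function name, the zero-depth threshold, and the 1-indexed day numbering are assumptions, not the DRIFT implementation or the actual DSSAT/DNDC file formats.

```python
def flood_periods(daily_depth, threshold=0.0):
    """Collapse a daily depth series into (start_day, end_day) pairs.

    Days are 1-indexed and both ends are inclusive. A day is treated as
    flooded when its depth exceeds `threshold` (an assumed rule).
    """
    periods = []
    start = None
    for day, depth in enumerate(daily_depth, start=1):
        flooded = depth > threshold
        if flooded and start is None:
            start = day                      # a new flood period begins
        elif not flooded and start is not None:
            periods.append((start, day - 1))  # the period ended yesterday
            start = None
    if start is not None:                    # series ends while still flooded
        periods.append((start, len(daily_depth)))
    return periods


# Hypothetical depths for ten days.
depths = [0, 0, 5, 7, 6, 0, 0, 3, 4, 0]
print(flood_periods(depths))
```

Because each grid point's series can be processed independently, a pass like this is also a natural candidate for the parallelization the authors mention.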

The Fault Diagnosis Model of Ship Fuel System Equipment Reflecting Time Dependency in Conv1D Algorithm Based on the Convolution Network (합성곱 네트워크 기반의 Conv1D 알고리즘에서 시간 종속성을 반영한 선박 연료계통 장비의 고장 진단 모델)

  • Kim, Hyung-Jin;Kim, Kwang-Sik;Hwang, Se-Yun;Lee, Jang Hyun
    • Journal of Navigation and Port Research / v.46 no.4 / pp.367-374 / 2022
  • The purpose of this study was to propose a deep learning algorithm for fault diagnosis of the fuel pumps and purifiers of autonomous ships. A deep learning algorithm reflecting the time dependence of the measured signal was configured, and the failure pattern was trained using vibration signals measured in the equipment's regular operation and failure states. Considering the sequential time dependence of deterioration implied in the vibration signal, this study adopts Conv1D with sliding window computation for fault detection. The time dependence was also reflected by transforming the measured signal from two-dimensional to three-dimensional. Additionally, the optimal values of the hyperparameters of the Conv1D model were determined using the grid search technique. Finally, the results show that the proposed data preprocessing method and the Conv1D model can reflect the sequential dependency between the fault and its effect on the measured signal, and can appropriately perform anomaly detection as well as failure detection for the equipment chosen for application.
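
The step of transforming the signal from two-dimensional to three-dimensional can be illustrated with a plain sliding window. This sketch assumes the 2-D input is shaped (time steps, channels) and the 3-D output is shaped (windows, window length, channels), the layout a 1-D convolution layer typically consumes. The window length and step are hypothetical and not taken from the paper.

```python
def sliding_windows(signal, window, step=1):
    """Cut a 2-D signal (time steps x channels) into overlapping windows.

    Returns a 3-D nested list shaped (n_windows, window, n_channels).
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [
        [list(row) for row in signal[start:start + window]]
        for start in range(0, len(signal) - window + 1, step)
    ]


# Toy vibration record: 6 time steps, 2 channels (values are invented).
signal = [[t, 10 * t] for t in range(6)]
windows = sliding_windows(signal, window=4, step=1)
print(len(windows), len(windows[0]), len(windows[0][0]))
```

Each window keeps consecutive samples together, so a model that sees one window at a time can pick up on the order in which the readings occurred.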

Stakeholder Awareness of Rural Spatial Planning Data Utilization Based on Survey (농촌공간계획 데이터 수급에 대한 이해당사자 인식조사)

  • Zaewoong Rhee;Sang-Hyun Lee;Sungyun Lee;Jinsung Kim;Rui Qu;Seung-Jong Bae;Soo-Jin Kim;Sangbum Kim
    • Journal of Korean Society of Rural Planning / v.29 no.3 / pp.25-37 / 2023
  • According to the 「Rural Spatial Reconstruction and Regeneration Support Act」, enacted on March 29, 2024, all local governments are required to establish a 'Rural Spatial Reconstruction and Regeneration Plan' (hereinafter referred to as the 'Rural Spatial Plan'). In order for the 'Rural Spatial Plan' to be appropriately established, this study analyzed the supply and demand of spatial data from the perspective of user stakeholders and derived implications for improving rural spatial planning data utilization. In conclusion, three key recommendations come from this result. Firstly, it is necessary to establish an integrated DB for rural spatial planning data. This can solve the problem of low awareness of scattered data-providing websites, reduce the processing time of non-GIS data, and reduce the time required to acquire data by securing the availability of data search and download. In particular, research should be conducted on the establishment of a spatial analysis simulation system to support stakeholders' decision-making, considering that many stakeholders have difficulty in spatial analysis because spatial analysis techniques were not actively used in rural projects before the implementation of the rural agreement system in 2020. Secondly, research on how to improve data acquisition should be conducted in each data sector. The data sector group with the lowest ease of receiving are 'Local Community Domain', 'Changes in Domestic and International Conditions', and 'Provision and Utilization of Daily Life Services'. Lastly, in-depth research is needed on how to raise each rural spatial planning data supply stakeholder to the position of player. Stakeholders of 'University Institutions' and 'Public Enterprises and Research Institutes' should give those who participate in the formulation of rural spatial plans access to the raw data collected for public work. 
Stakeholders in the 'Private Company' group need to come up with realistic measures to build a data pool centered on consultative bodies among existing private companies, and then prepare a step-by-step strategy to open it fully with the participation of various stakeholders. To encourage 'Village Residents and Associations' stakeholders to play a leading role as owners and producers of data, personnel should be trained to collect and record data related to the village. In addition, support measures should be prepared so that these activities can continue.

A Study on the Prediction of Uniaxial Compressive Strength Classification Using Slurry TBM Data and Random Forest (이수식 TBM 데이터와 랜덤포레스트를 이용한 일축압축강도 분류 예측에 관한 연구)

  • Tae-Ho Kang;Soon-Wook Choi;Chulho Lee;Soo-Ho Chang
    • Tunnel and Underground Space / v.33 no.6 / pp.547-560 / 2023
  • Recently, research on predicting ground classification using machine learning techniques, TBM excavation data, and ground data has been increasing. In this study, multi-class prediction of uniaxial compressive strength (UCS) was conducted by applying a random forest model, a decision-tree-based machine learning technique widely used in various fields, to machine data and ground data acquired at three slurry shield TBM sites. For the classification prediction, the data were split into training and test sets at a ratio of 7:3, and a grid search with 5-fold cross-validation was used to select the optimal parameters. As a result of classification learning for UCS using the random forest, the accuracy of the multi-class prediction model was high, at 0.983 on the training set and 0.982 on the test set. However, due to the imbalance in data distribution between classes, recall was low for class 4. Additional research is needed to increase the amount of measured UCS data acquired at various sites.
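
A grid search with 5-fold cross-validation can be sketched in plain Python as follows. To keep the example self-contained, a small k-nearest-neighbour classifier on invented 1-D data stands in for the random forest, and the 7:3 hold-out split is omitted. The parameter grid, the data, and all function names are assumptions for illustration.

```python
from itertools import product


def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def knn_predict(train_x, train_y, x, k):
    """Majority vote among the k nearest 1-D training points (stand-in model)."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    votes = [train_y[i] for i in nearest]
    return max(set(votes), key=votes.count)


def cv_accuracy(xs, ys, params, n_folds=5):
    """Mean held-out accuracy over n_folds folds for one parameter setting."""
    scores = []
    for held_out in k_fold_indices(len(xs), n_folds):
        held = set(held_out)
        train_x = [x for i, x in enumerate(xs) if i not in held]
        train_y = [y for i, y in enumerate(ys) if i not in held]
        hits = sum(
            knn_predict(train_x, train_y, xs[i], params["k"]) == ys[i]
            for i in held_out
        )
        scores.append(hits / len(held_out))
    return sum(scores) / len(scores)


def grid_search(xs, ys, param_grid, n_folds=5):
    """Try every combination in param_grid and keep the best CV score."""
    names = sorted(param_grid)
    best_params, best_score = None, -1.0
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = cv_accuracy(xs, ys, params, n_folds)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Invented data: class 0 lies near 0.0-0.4, class 1 near 0.6-1.0, one noisy label.
xs, ys = [], []
for i in range(10):
    xs += [0.04 * i, 0.6 + 0.04 * i]
    ys += [0, 1]
ys[8] = 1  # deliberately mislabelled point

best_params, best_score = grid_search(xs, ys, {"k": [1, 3, 5]})
print(best_params, round(best_score, 3))
```

In practice a library routine would normally replace the hand-written loops. The structure stays the same: enumerate the grid, score each setting on held-out folds, and keep the best.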

Location Service Modeling of Distributed GIS for Replication Geospatial Information Object Management (중복 지리정보 객체 관리를 위한 분산 지리정보 시스템의 위치 서비스 모델링)

  • Jeong, Chang-Won;Lee, Won-Jung;Lee, Jae-Wan;Joo, Su-Chong
    • The KIPS Transactions:PartD / v.13D no.7 s.110 / pp.985-996 / 2006
  • As internet technologies develop, the geographic information system (GIS) environment is changing to web-based services. Because the geospatial information of existing Web-GIS services was developed independently, there is no interoperability to support diverse map formats. Even the same geospatial information object is duplicated separately across GIS because it can be used for various purposes. Intelligent strategies are therefore needed for optimal replica selection, that is, for identifying replicated geospatial information objects. For the management of replicated objects, OMG, GLOBE, and GRID computing have suggested related frameworks, but this research is not thorough enough in the case of geospatial information objects. This paper presents a location service model that supports optimal selection among replicas and the management of replicated objects. It consists of three main services. The first is the binding service, which saves the names and properties of objects defined by users according to service offers and enables clients to search them among the service offers. The second is the location service, which manages location information with contact records and obtains performance information independently through the Load Sharing Facility on the system with the contact address. The third is the intelligent selection service, which obtains basic and performance information from the binding service and the location service and provides both faster access and better performance characteristics through rules from an intelligent model based on rough sets. To validate the location service model, this research presents the execution processes of the location service with a graphical user interface.

Prediction of a hit drama with a pattern analysis on early viewing ratings (초기 시청시간 패턴 분석을 통한 대흥행 드라마 예측)

  • Nam, Kihwan;Seong, Nohyoon
    • Journal of Intelligence and Information Systems / v.24 no.4 / pp.33-49 / 2018
  • The success of a TV drama has a very large impact on TV ratings and on the effectiveness of channel promotion. Its cultural and business impact has also been demonstrated through the Korean Wave. Therefore, early prediction of a drama's blockbuster success is very important from the strategic perspective of the media industry. Previous studies have tried to predict audience ratings and drama success with various methods. However, most of them made simple predictions using intuitive factors such as the main actor and time slot, which limits their predictive power. In this study, we propose a model for predicting the popularity of a drama by analyzing viewers' viewing patterns based on various theories. This is not only a theoretical contribution but also a practical one, since the model can be used by actual broadcasting companies. We collected data on 280 TV mini-series dramas broadcast over the terrestrial channels during the 10 years from 2003 to 2012. From the data, we selected the most highly ranked and the least highly ranked 45 TV dramas and analyzed their viewing patterns in 11 steps. The assumptions and conditions for modeling are based on existing studies, on the opinions of actual broadcasters, and on data mining techniques. We then developed a prediction model by measuring the viewing-time distance (difference) using the Euclidean and correlation methods; the sum of distances is termed similarity in our study. Through the similarity measure, we predicted the success of dramas from viewers' initial viewing-time pattern distribution over episodes 1 to 5. To check whether the model is sensitive to the measurement method, various distance measures were applied and the model was checked for robustness. Once the model was established, we improved its predictive performance using a grid search. 
Furthermore, when a new drama is broadcast, we classified viewers who had watched more than 70% of its total airtime as "passionate viewers". We then compared the percentage of passionate viewers between the most highly ranked and the least highly ranked dramas, so that we could determine the possibility of a blockbuster TV mini-series. We find that the initial viewing-time pattern is the key factor for predicting blockbuster dramas. With the initial viewing-time pattern analysis, our model correctly classified blockbuster dramas with 75.47% accuracy. This paper shows a high prediction rate while suggesting an audience rating method different from existing ones. Currently, broadcasters rely heavily on a few famous actors, the so-called star system, and they face more severe competition than ever due to rising program production costs, a long-term recession, and aggressive investment in comprehensive programming channels and large corporations. Everyone is in a financially difficult situation. The basic revenue model of these broadcasters is advertising, and advertising is executed with audience rating as the basic index. In the drama market, demand is difficult to forecast because of the nature of the commodity, while dramas contribute heavily to the financial success of a broadcasting company's various contents. It is therefore important to minimize the risk of failure. Analyzing the distribution of initial viewing time can thus be of practical help in establishing a response strategy (organization, marketing, story changes, etc.) for the related company. We also found that the behavior of the audience is crucial to the success of the program. In this paper, we define TV viewing as a measure of how enthusiastically TV is watched. 
We can successfully predict the success of a program by calculating the loyalty of its passionate viewers. This way of calculating loyalty can also be applied to various platforms. It can also be used for marketing programs such as highlights, script previews, making-of films, characters, games, and other marketing projects.
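
The distance-based comparison described in this abstract can be sketched as below. The abstract names Euclidean and correlation distances and a summed distance. The rule of assigning a new drama to the group with the smaller summed distance, the three-bin patterns, and all names are assumptions made for illustration and may differ from the authors' procedure.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length patterns."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def correlation_distance(a, b):
    """1 minus the Pearson correlation; 0 means the patterns share one shape."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return 1.0 - cov / math.sqrt(var_a * var_b)


def summed_distance(pattern, group, distance):
    """Sum of distances from one pattern to every member of a reference group."""
    return sum(distance(pattern, member) for member in group)


def predict_group(pattern, hits, flops, distance):
    """Assign the pattern to whichever group it is closer to in total (assumed rule)."""
    d_hit = summed_distance(pattern, hits, distance)
    d_flop = summed_distance(pattern, flops, distance)
    return "hit" if d_hit < d_flop else "flop"


# Invented viewing-time distributions: share of viewers in each viewing-time bin.
hits = [[0.10, 0.20, 0.70], [0.15, 0.25, 0.60]]
flops = [[0.60, 0.30, 0.10], [0.70, 0.20, 0.10]]
new_drama = [0.20, 0.20, 0.60]

label = predict_group(new_drama, hits, flops, euclidean_distance)
print(label)
```

Swapping `euclidean_distance` for `correlation_distance` in the call is one simple way to check whether a prediction depends on the choice of distance measure.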

Steel Plate Faults Diagnosis with S-MTS (S-MTS를 이용한 강판의 표면 결함 진단)

  • Kim, Joon-Young;Cha, Jae-Min;Shin, Junguk;Yeom, Choongsub
    • Journal of Intelligence and Information Systems / v.23 no.1 / pp.47-67 / 2017
  • Steel plate faults are one of the important factors affecting the quality and price of steel plates. So far, many steelmakers have generally used visual inspection, which relies on an inspector's intuition or experience. Specifically, the inspector checks for faults by looking at the surface of the steel plates. However, the accuracy of this method is so low that it can cause judgment errors above 30%. Therefore, an accurate steel plate fault diagnosis system has been continuously required in the industry. To meet this need, this study proposes a new steel plate fault diagnosis system using Simultaneous MTS (S-MTS), an advanced Mahalanobis Taguchi System (MTS) algorithm, to classify various surface defects of steel plates. MTS has generally been used to solve binary classification problems in various fields, but it has not been used for multi-class classification because of its low accuracy. The reason is that only one Mahalanobis space is established in MTS. In contrast, S-MTS is suitable for multi-class classification because it establishes an individual Mahalanobis space for each class. 'Simultaneous' implies comparing Mahalanobis distances at the same time. The proposed system was developed in four main stages. In the first stage, after the reference groups and related variables are defined, steel plate fault data are collected and used to establish an individual Mahalanobis space for each reference group and to construct the full measurement scale. In the second stage, the Mahalanobis distances of the test groups are calculated based on the established Mahalanobis spaces of the reference groups. The appropriateness of the spaces is then verified by examining the separability of the Mahalanobis distances. In the third stage, orthogonal arrays and the dynamic-type Signal-to-Noise (SN) ratio are applied for variable optimization. 
Also, the overall SN ratio gain is derived from the SN ratio and the SN ratio gain. If the derived overall SN ratio gain of a variable is negative, the variable should be removed, whereas a variable with a positive gain may be considered worth keeping. Finally, in the fourth stage, the measurement scale is reconstructed from the selected useful variables. Next, an experimental test is carried out to verify the multi-class classification ability and obtain the classification accuracy. If the accuracy is acceptable, the diagnosis system can be used for future applications. This study also compared the accuracy of the proposed system with that of other popular classification algorithms, including Decision Tree, Multi-Layer Perceptron Neural Network (MLPNN), Logistic Regression (LR), Support Vector Machine (SVM), Tree Bagger Random Forest, Grid Search (GS), Genetic Algorithm (GA), and Particle Swarm Optimization (PSO). The steel plate faults dataset used in the study is taken from the University of California at Irvine (UCI) machine learning repository. As a result, the proposed system based on S-MTS shows a classification accuracy of 90.79%, which is 6-27% higher than that of MLPNN, LR, GS, GA, and PSO. Given that the accuracy of commercial systems is only about 75-80%, the proposed system has sufficient classification performance to be applied in the industry. In addition, because of the variable optimization process, the proposed system can reduce the number of measurement sensors installed in the field. These results show that the proposed system not only performs well in steel plate fault diagnosis but also reduces operation and maintenance costs. 
In future work, we plan to apply the proposed system in the field to validate its actual effectiveness and to improve its accuracy based on the results.
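
The core idea of one Mahalanobis space per class, with distances compared at the same time, can be sketched for two variables as follows. This is a simplified illustration, not the authors' system: it uses a raw 2x2 covariance matrix, omits the standardization and scaling that MTS formulations typically apply, and skips the orthogonal-array and SN-ratio variable selection. The class names and sample values are invented.

```python
def fit_space(samples):
    """Mean vector and inverse 2x2 covariance matrix for one reference group."""
    n = len(samples)
    mean_x = sum(s[0] for s in samples) / n
    mean_y = sum(s[1] for s in samples) / n
    sxx = sum((s[0] - mean_x) ** 2 for s in samples) / (n - 1)
    syy = sum((s[1] - mean_y) ** 2 for s in samples) / (n - 1)
    sxy = sum((s[0] - mean_x) * (s[1] - mean_y) for s in samples) / (n - 1)
    det = sxx * syy - sxy * sxy
    inverse = ((syy / det, -sxy / det), (-sxy / det, sxx / det))
    return (mean_x, mean_y), inverse


def mahalanobis_sq(point, space):
    """Squared Mahalanobis distance from a point to one class's space."""
    (mean_x, mean_y), inv = space
    dx, dy = point[0] - mean_x, point[1] - mean_y
    return (dx * (inv[0][0] * dx + inv[0][1] * dy)
            + dy * (inv[1][0] * dx + inv[1][1] * dy))


def classify(point, spaces):
    """Compare distances to every class space at once and pick the nearest class."""
    return min(spaces, key=lambda label: mahalanobis_sq(point, spaces[label]))


# Invented two-feature measurements for two hypothetical fault classes.
training = {
    "scratch": [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.2), (0.9, 0.8)],
    "stain": [(4.0, 3.0), (4.2, 3.3), (3.8, 2.9), (4.1, 2.7), (3.9, 3.1)],
}
spaces = {label: fit_space(samples) for label, samples in training.items()}

print(classify((1.1, 1.0), spaces), classify((3.9, 3.0), spaces))
```

With a single shared space, as in plain MTS, the distance only says how far a sample is from one reference group. Keeping one space per class is what makes the direct multi-class comparison possible.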

Interpreting Bounded Rationality in Business and Industrial Marketing Contexts: Executive Training Case Studies (집행관배훈안례연구(阐述工商业背景下的有限合理性):집행관배훈안례연구(执行官培训案例研究))

  • Woodside, Arch G.;Lai, Wen-Hsiang;Kim, Kyung-Hoon;Jung, Deuk-Keyo
    • Journal of Global Scholars of Marketing Science / v.19 no.3 / pp.49-61 / 2009
  • This article provides training exercises for executives in interpreting subroutine maps of executives' thinking in processing business and industrial marketing problems and opportunities. This study builds on premises that Schank proposes about learning and teaching, including that (1) learning occurs by experiencing, and the best instruction offers learners opportunities to distill their knowledge and skills from interactive stories in the form of goal-based scenarios, team projects, and understanding stories from experts; and (2) telling does not lead to learning because learning requires action; training environments should emphasize active engagement with stories, cases, and projects. Each training case study includes executive exposure to decision system analysis (DSA). The training case requires the executive to write a "Briefing Report" of a DSA map. Instructions to the executive trainee in writing the briefing report include coverage of (1) details of the essence of the DSA map and (2) a statement of warnings and opportunities that the executive map reader interprets within the DSA map. The maximum length for a briefing report is 500 words, an arbitrary rule that works well in executive training programs. Following this introduction, section two of the article briefly summarizes relevant literature on how humans think within contexts in response to problems and opportunities. Section three illustrates the creation and interpreting of DSA maps using a training exercise in pricing a chemical product to different OEM (original equipment manufacturer) customers. Section four presents a training exercise in pricing decisions by a petroleum manufacturing firm. Section five presents a training exercise in marketing strategies by an office furniture distributor along with buying strategies by business customers. 
Each of the three training exercises is based on research into information processing and decision making of executives operating in marketing contexts. Section six concludes the article with suggestions for use of this training case and for developing additional training cases for honing executives' decision-making skills. Todd and Gigerenzer propose that humans use simple heuristics because they enable adaptive behavior by exploiting the structure of information in natural decision environments. "Simplicity is a virtue, rather than a curse". Bounded rationality theorists emphasize the centrality of Simon's proposition, "Human rational behavior is shaped by a scissors whose blades are the structure of the task environments and the computational capabilities of the actor". Gigerenzer's view is relevant to Simon's environmental blade and to the environmental structures in the three cases in this article, "The term environment, here, does not refer to a description of the total physical and biological environment, but only to that part important to an organism, given its needs and goals." The present article directs attention to research that combines reports on the structure of task environments with the use of adaptive toolbox heuristics of actors. The DSA mapping approach here concerns the match between strategy and an environment-the development and understanding of ecological rationality theory. Aspiration adaptation theory is central to this approach. Aspiration adaptation theory models decision making as a multi-goal problem without aggregation of the goals into a complete preference order over all decision alternatives. The three case studies in this article permit the learner to apply propositions in aspiration level rules in reaching a decision. Aspiration adaptation takes the form of a sequence of adjustment steps. An adjustment step shifts the current aspiration level to a neighboring point on an aspiration grid by a change in only one goal variable. 
An upward adjustment step is an increase and a downward adjustment step is a decrease of a goal variable. Creating and using aspiration adaptation levels is integral to bounded rationality theory. The present article increases understanding and expertise of both aspiration adaptation and bounded rationality theories by providing learner experiences and practice in using propositions in both theories. Practice in ranking CTSs and writing TOP gists from DSA maps serves to clarify and deepen Selten's view, "Clearly, aspiration adaptation must enter the picture as an integrated part of the search for a solution." The body of "direct research" by Mintzberg, Gladwin's ethnographic decision tree modeling, and Huff's work on mapping strategic thought are suggestions on where to look for research that considers both the structure of the environment and the computational capabilities of the actors making decisions in these environments. Such research on bounded rationality permits both further development of theory in how and why decisions are made in real life and the development of learning exercises in the use of heuristics occurring in natural environments. The exercises in the present article encourage learning skills and principles of using fast and frugal heuristics in contexts of their intended use. The exercises respond to Schank's wisdom, "In a deep sense, education isn't about knowledge or getting students to know what has happened. It is about getting them to feel what has happened. This is not easy to do. Education, as it is in schools today, is emotionless. This is a huge problem." The three cases and accompanying set of exercise questions adhere to Schank's view, "Processes are best taught by actually engaging in them, which can often mean, for mental processing, active discussion."

Application of Support Vector Regression for Improving the Performance of the Emotion Prediction Model (감정예측모형의 성과개선을 위한 Support Vector Regression 응용)

  • Kim, Seongjin;Ryoo, Eunchung;Jung, Min Kyu;Kim, Jae Kyeong;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems / v.18 no.3 / pp.185-202 / 2012
  • Since the value of information has been realized in the information society, the usage and collection of information have become important. A facial expression, like an artistic painting, carries a great deal of information and can be described in thousands of words. Following this idea, there have recently been a number of attempts to provide customers and companies with an intelligent service that perceives human emotions through facial expressions. For example, MIT Media Lab, the leading organization in this research area, has developed a human emotion prediction model and has applied its studies to commercial business. In the academic area, a number of conventional methods such as Multiple Regression Analysis (MRA) or Artificial Neural Networks (ANN) have been applied to predict human emotion in prior studies. However, MRA is generally criticized because of its low prediction accuracy. This is inevitable since MRA can only explain the linear relationship between the dependent variable and the independent variables. To mitigate the limitations of MRA, some studies like Jung and Kim (2012) have used ANN as the alternative, and they reported that ANN generated more accurate predictions than statistical methods like MRA. However, ANN has also been criticized due to overfitting and the difficulty of network design (e.g. setting the number of layers and the number of nodes in the hidden layers). Against this background, we propose a novel model using Support Vector Regression (SVR) in order to increase the prediction accuracy. SVR is an extension of the Support Vector Machine (SVM) designed to solve regression problems. The model produced by SVR depends only on a subset of the training data, because the cost function for building the model ignores any training data that is close (within a threshold ${\varepsilon}$) to the model prediction. 
Using SVR, we tried to build a model that can measure the level of arousal and valence from the facial features. To validate the usefulness of the proposed model, we collected the data of facial reactions when providing appropriate visual stimulating contents, and extracted the features from the data. Next, the steps of the preprocessing were taken to choose statistically significant variables. In total, 297 cases were used for the experiment. As the comparative models, we also applied MRA and ANN to the same data set. For SVR, we adopted '${\varepsilon}$-insensitive loss function', and 'grid search' technique to find the optimal values of the parameters like C, d, ${\sigma}^2$, and ${\varepsilon}$. In the case of ANN, we adopted a standard three-layer backpropagation network, which has a single hidden layer. The learning rate and momentum rate of ANN were set to 10%, and we used sigmoid function as the transfer function of hidden and output nodes. We performed the experiments repeatedly by varying the number of nodes in the hidden layer to n/2, n, 3n/2, and 2n, where n is the number of the input variables. The stopping condition for ANN was set to 50,000 learning events. And, we used MAE (Mean Absolute Error) as the measure for performance comparison. From the experiment, we found that SVR achieved the highest prediction accuracy for the hold-out data set compared to MRA and ANN. Regardless of the target variables (the level of arousal, or the level of positive / negative valence), SVR showed the best performance for the hold-out data set. ANN also outperformed MRA, however, it showed the considerably lower prediction accuracy than SVR for both target variables. The findings of our research are expected to be useful to the researchers or practitioners who are willing to build the models for recognizing human emotions.
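
The ε-insensitive loss and the MAE metric named in this abstract can be written out directly, together with the way a grid of candidate parameter settings is enumerated. This is a sketch only: the candidate values below are hypothetical, the kernel parameter d is left out for brevity, and no SVR model is actually trained here.

```python
from itertools import product


def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Per-sample loss that is zero inside the epsilon tube around the target."""
    return [max(0.0, abs(t - p) - epsilon) for t, p in zip(y_true, y_pred)]


def mean_absolute_error(y_true, y_pred):
    """MAE, the comparison metric named in the abstract."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Invented targets and predictions (e.g. an arousal score).
y_true = [3.0, 5.0, 2.0]
y_pred = [3.2, 4.0, 2.0]
losses = epsilon_insensitive_loss(y_true, y_pred, epsilon=0.5)
mae = mean_absolute_error(y_true, y_pred)

# Enumerating a parameter grid: every combination is one candidate model.
grid = {"C": [1, 10, 100], "sigma2": [0.5, 1.0], "epsilon": [0.05, 0.1]}
candidates = [dict(zip(grid, values)) for values in product(*grid.values())]
print(losses, round(mae, 3), len(candidates))
```

The first prediction misses its target by 0.2 and the third is exact, so both fall inside the tube and contribute no loss. This is why the fitted SVR model depends only on the training points outside the tube.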