• Title/Summary/Keyword: Genetic Based Machine Learning

The Effect of Data Size on the k-NN Predictability: Application to Samsung Electronics Stock Market Prediction (데이터 크기에 따른 k-NN의 예측력 연구: 삼성전자주가를 사례로)

  • Chun, Se-Hak
    • Journal of Intelligence and Information Systems / v.25 no.3 / pp.239-251 / 2019
  • Statistical methods such as moving averages, Kalman filtering, exponential smoothing, regression analysis, and ARIMA (autoregressive integrated moving average) have been used for stock market prediction. However, these statistical methods have not produced superior performance. In recent years, machine learning techniques, including artificial neural networks, SVMs, and genetic algorithms, have been widely used in stock market prediction. In particular, k-nearest neighbor (k-NN), a case-based reasoning method, is widely used for stock price prediction. Case-based reasoning retrieves several similar cases from previous cases when a new problem occurs, and combines the class labels of the similar cases to create a classification for the new problem. However, case-based reasoning has some problems. First, it searches for a fixed number of neighbors in the observation space, so it always selects the same number of neighbors rather than the most similar neighbors for the target case. As a result, it may have to take more cases into account even when fewer cases are applicable to the target. Second, it may select neighbors that are far away from the target case. Thus, case-based reasoning does not guarantee an optimal pseudo-neighborhood for every target case, and predictability can be degraded by deviation from the desired similar neighbors. This paper examines how the size of the learning data affects stock price predictability through k-nearest neighbor, and compares the predictability of k-nearest neighbor with that of the random walk model according to the size of the learning data and the number of neighbors. In this study, Samsung Electronics stock prices were predicted using two learning datasets of different sizes. To predict the next day's closing price, we used four variables: opening value, daily high, daily low, and daily close.
In the first experiment, data from January 1, 2000 to December 31, 2017 were used for the learning process. In the second experiment, data from January 1, 2015 to December 31, 2017 were used for the learning process. The test data cover January 1, 2018 to August 31, 2018 for both experiments. We compared the performance of k-NN with the random walk model using the two learning datasets. In the first experiment, when the learning data was small, the mean absolute percentage error (MAPE) was 1.3497 for the random walk model and 1.3570 for k-NN. In the second experiment, when the learning data was large, the MAPE was 1.3497 for the random walk model and 1.2928 for k-NN. These results show that predictive power is higher when more learning data are used than when less learning data are used. This paper also shows that k-NN generally produces better predictive power than the random walk model for larger learning datasets, but not when the learning dataset is relatively small. Future studies need to consider macroeconomic variables related to stock price forecasting in addition to the opening price, low price, high price, and closing price. Also, to produce better results, it is recommended that k-nearest neighbor find nearest neighbors using a second-step filtering method that considers fundamental economic variables, as well as a sufficient amount of learning data.
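The comparison described in this abstract can be illustrated with a minimal sketch. It assumes plain Euclidean distance, an unweighted average of the neighbors' next-day closes, and a random walk forecast equal to the current close; the price rows are invented toy values, not Samsung Electronics data, and the function names are hypothetical.

```python
import math

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def knn_predict(train_x, train_y, query, k):
    """Average the targets of the k training rows closest to `query` (Euclidean distance)."""
    nearest = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))[:k]
    return sum(train_y[i] for i in nearest) / k

def random_walk_predict(query):
    """Random walk forecast: tomorrow's close equals today's close (the last feature)."""
    return query[-1]

# Hypothetical toy rows: (open, high, low, close). The target is the next day's close.
days = [
    (100, 103, 99, 102), (102, 104, 101, 103), (103, 105, 100, 101),
    (101, 102, 98, 99), (99, 101, 97, 100), (100, 104, 100, 103),
    (103, 106, 102, 105), (105, 107, 103, 104),
]
features = days[:-1]
targets = [d[3] for d in days[1:]]

split = 5  # the first 5 rows form the learning data, the rest the test data
train_x, train_y = features[:split], targets[:split]
test_x, test_y = features[split:], targets[split:]

knn_forecasts = [knn_predict(train_x, train_y, q, k=2) for q in test_x]
rw_forecasts = [random_walk_predict(q) for q in test_x]

print("k-NN MAPE:", round(mape(test_y, knn_forecasts), 4))
print("random walk MAPE:", round(mape(test_y, rw_forecasts), 4))
```

Varying `split` and `k` in this sketch mirrors the two factors the study varies: the size of the learning data and the number of neighbors.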

User-Participated Design Method for Perforated Metal Facades using Virtual Reality (가상현실 기반 사용자 참여형 타공패널 파사드 설계 방법론)

  • Jang, Do-Jin;Kim, Seongjun;Kim, Sung-Ah
    • Journal of the Architectural Institute of Korea Planning & Design / v.36 no.4 / pp.103-111 / 2020
  • Perforated metal sheets are used as facade panels to control environmental factors while preserving users' visibility. Despite this functional potential, previous studies considered only a specific facade direction or building orientation. This study proposes a design methodology for perforated panel facades that reflects the location on the facade and the users' requirements. Quantitative and qualitative performance is optimized through communication between designers and users in a VR system. To optimize quantitative performance, designers use machine learning techniques such as clustering and genetic algorithms to allocate optimal panels on the facades. To optimize qualitative performance, users take part through the VR system in evaluating the performance criteria that depend on their preferences. An experiment using an office project showed that designers were able to make decisions based on clustering with a Gaussian mixture model (GMM) to optimize multiple quantitative performances. The gap between the target and final performance could be narrowed by limiting the types of perforated panels in consideration of mass customization. In assessing visibility as a qualitative performance, users were able to participate in the design process using the VR system.

Implementation on the evolutionary machine learning approaches for streamflow forecasting: case study in the Seybous River, Algeria (유출예측을 위한 진화적 기계학습 접근법의 구현: 알제리 세이보스 하천의 사례연구)

  • Zakhrouf, Mousaab;Bouchelkia, Hamid;Stamboul, Madani;Kim, Sungwon;Singh, Vijay P.
    • Journal of Korea Water Resources Association / v.53 no.6 / pp.395-408 / 2020
  • This paper develops and applies three machine learning approaches, namely artificial neural networks (ANN), adaptive neuro-fuzzy inference systems (ANFIS), and wavelet-based neural networks (WNN), combined with an evolutionary optimization algorithm and k-fold cross validation, for multi-step (days) streamflow forecasting at a catchment in Algeria, North Africa. The ANN and ANFIS models yielded similar performance in the training and testing phases, based on four statistical indices: root mean squared error (RMSE), Nash-Sutcliffe efficiency (NSE), correlation coefficient (R), and peak flow criteria (PFC). The RMSE and PFC values of the WNN model (e.g., RMSE = 8.590 ㎥/sec, PFC = 0.252 for (t+1) day, testing phase) were lower than those of the ANN (e.g., RMSE = 19.120 ㎥/sec, PFC = 0.446 for (t+1) day, testing phase) and ANFIS (e.g., RMSE = 18.520 ㎥/sec, PFC = 0.444 for (t+1) day, testing phase) models, while the NSE and R values of the WNN model were higher than those of the ANN and ANFIS models. Therefore, the new approach can be a robust tool for multi-step (days) streamflow forecasting in the Seybous River, Algeria.
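Three of the indices named in this abstract have standard definitions, sketched below with hypothetical function names and invented flow values. PFC is left out because the abstract does not give its formula.

```python
import math

def rmse(observed, simulated):
    """Root mean squared error; lower is better, 0 is a perfect fit."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(observed, simulated)) / len(observed))

def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches predicting the observed mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / spread

def correlation(observed, simulated):
    """Pearson correlation coefficient R between observed and simulated flows."""
    n = len(observed)
    mo, ms = sum(observed) / n, sum(simulated) / n
    cov = sum((o - mo) * (s - ms) for o, s in zip(observed, simulated))
    so = math.sqrt(sum((o - mo) ** 2 for o in observed))
    ss = math.sqrt(sum((s - ms) ** 2 for s in simulated))
    return cov / (so * ss)

# Invented daily flows (m^3/s), for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
simulated = [12.0, 18.0, 33.0, 37.0]
print(rmse(observed, simulated), nse(observed, simulated), correlation(observed, simulated))
```

Lower RMSE together with higher NSE and R is the pattern the abstract reports for the WNN model relative to ANN and ANFIS.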

A Feature Set Selection Approach Based on Pearson Correlation Coefficient for Real Time Attack Detection (실시간 공격 탐지를 위한 Pearson 상관계수 기반 특징 집합 선택 방법)

  • Kang, Seung-Ho;Jeong, In-Seon;Lim, Hyeong-Seok
    • Convergence Security Journal / v.18 no.5_1 / pp.59-66 / 2018
  • The performance of a network intrusion detection system based on machine learning depends heavily on the composition and size of its feature set. Detection accuracy, measured by the detection rate or the false positive rate, depends on the composition of the feature set, while the time needed for training and detection depends on its size. Therefore, for the system to detect intrusions in real time, the feature set should be small as well as appropriately composed. In this paper, we show that the size of the feature set can be reduced further, without decreasing the detection rate, by using the Pearson correlation coefficient between features together with the multi-objective genetic algorithm that was used to reduce the feature set in previous work. To evaluate the proposed method, experiments classifying 10 kinds of attacks and benign traffic are performed on the NSL-KDD data set.
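One common way to use the Pearson correlation between features is a greedy redundancy filter, sketched below. This is a generic illustration, not the authors' procedure: the 0.9 threshold, the function names, and the toy column values are all assumptions, and the genetic algorithm step is omitted.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant column carries no linear relationship
    return cov / (sx * sy)

def drop_correlated(columns, threshold=0.9):
    """Keep a feature only if |r| with every already-kept feature is below `threshold`."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Toy columns with invented values, for illustration only.
columns = {
    "duration": [1, 2, 3, 4, 5],
    "src_bytes": [2, 4, 6, 8, 10],      # a multiple of duration, so |r| is 1
    "failed_logins": [0, 1, 0, 1, 0],   # uncorrelated with duration
}
print(drop_correlated(columns))  # ['duration', 'failed_logins']
```

A filter of this kind shrinks the candidate set before or alongside a search procedure, since a feature that is nearly a linear function of another adds training and detection time without adding much information.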

The Design of Feature Selection Classifier based on Physiological Signal for Emotion Detection (감성판별을 위한 생체신호기반 특징선택 분류기 설계)

  • Lee, JeeEun;Yoo, Sun K.
    • Journal of the Institute of Electronics and Information Engineers / v.50 no.11 / pp.206-216 / 2013
  • Emotion plays a critical role in daily human life, including learning, action, decision-making, and communication. In this paper, an emotion discrimination classifier is designed that reduces system complexity by selecting a reduced set of dominant features from biosignals. Photoplethysmography (PPG), skin temperature, skin conductance, and frontal and parietal electroencephalography (EEG) signals were measured while subjects watched four types of movies chosen to induce neutral, sad, fear, and joy emotions. A genetic algorithm with a support vector machine (SVM)-based fitness function was designed to determine the dominant features among 24 parameters extracted from the measured biosignals. It achieves a maximum classification accuracy of 96.4%, which is 17% higher than that of SVM alone. The features selected for minimum error are the mean and NN50 of heart rate variability from the PPG signal, the mean of the PPG-induced pulse transit time, the mean of skin resistance, and the δ and β frequency band powers of the parietal EEG. The combination of parietal EEG, PPG, and skin resistance is recommended for high-accuracy instrumentation, while the combined use of PPG and skin conductance (79% accuracy) is affordable for simplified instrumentation.
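The idea of a genetic algorithm whose fitness is a classifier's accuracy can be sketched as follows. To stay within the standard library, this sketch swaps the SVM for a 1-nearest-neighbor classifier scored by leave-one-out accuracy. The per-feature penalty, the GA settings, and the synthetic data are all assumptions, not values from the paper.

```python
import math
import random

def loo_accuracy(rows, labels, mask):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier on the masked features."""
    cols = [i for i, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0
    project = lambda row: [row[c] for c in cols]
    hits = 0
    for i, row in enumerate(rows):
        others = [j for j in range(len(rows)) if j != i]
        nearest = min(others, key=lambda j: math.dist(project(row), project(rows[j])))
        hits += labels[nearest] == labels[i]
    return hits / len(rows)

def fitness(rows, labels, mask, penalty=0.01):
    """Accuracy minus a small cost per selected feature, so smaller subsets win ties."""
    return loo_accuracy(rows, labels, mask) - penalty * sum(mask)

def select_features(rows, labels, pop_size=20, generations=30, mutation_rate=0.1, seed=0):
    """Evolve a 0/1 mask over the feature columns and return the fittest mask found."""
    rng = random.Random(seed)
    n = len(rows[0])
    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]

    def score(mask):
        return fitness(rows, labels, mask)

    for _ in range(generations):
        population.sort(key=score, reverse=True)
        parents = population[: pop_size // 2]        # truncation selection, elites survive
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)                # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ (rng.random() < mutation_rate) for bit in child]  # bit-flip mutation
            children.append(child)
        population = parents + children
    return max(population, key=score)

# Synthetic data: feature 0 separates the two classes, features 1-3 are noise.
data_rng = random.Random(1)
rows, labels = [], []
for i in range(20):
    label = i % 2
    rows.append([label * 5 + data_rng.random()] + [data_rng.random() for _ in range(3)])
    labels.append(label)

best = select_features(rows, labels)
print(best, fitness(rows, labels, best))
```

Wrapping a classifier inside the fitness function in this way lets the search trade accuracy against the number of features, which is how a reduced feature set lowers system complexity.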

A Novel Network Anomaly Detection Method based on Data Balancing and Recursive Feature Addition

  • Liu, Xinqian;Ren, Jiadong;He, Haitao;Wang, Qian;Sun, Shengting
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.7 / pp.3093-3115 / 2020
  • Network anomaly detection systems play an essential role in detecting network anomalies and ensuring network security. Anomaly detection based on machine learning has become an increasingly popular solution. However, because network traffic is imbalanced and high-dimensional, existing methods are unable to achieve both high accuracy and a low false alarm rate. To address this problem, a new network anomaly detection method based on data balancing and recursive feature addition is proposed. First, a data balancing algorithm based on improved KNN outlier detection is designed to select a representative part of the data in each category; the parameters of the improved KNN outlier detection are optimized in combination by a genetic algorithm. Next, a recursive feature addition algorithm based on correlation analysis is proposed to select effective features, in which a cross contingency test is used to analyze correlation and obtain a strongly correlated feature subset. Then, a random forest model is used as the classification model to detect anomalies. Finally, the proposed algorithm is evaluated on the benchmark datasets KDD Cup 1999 and UNSW-NB15. The results show that the proposed strategies improve accuracy and recall and decrease the false alarm rate. Compared with other algorithms, the proposed algorithm also achieves significant improvements, especially in recall for the small categories.

Discovery of User Preference in Recommendation System through Combining Collaborative Filtering and Content based Filtering (협력적 여과와 내용 기반 여과의 병합을 통한 추천 시스템에서의 사용자 선호도 발견)

  • Ko, Su-Jeong;Kim, Jin-Su;Kim, Tae-Yong;Choi, Jun-Hyeog;Lee, Jung-Hyun
    • Journal of KIISE:Computing Practices and Letters / v.7 no.6 / pp.684-695 / 2001
  • Recent recommender systems combine collaborative filtering and content-based filtering in order to solve the sparsity and first-rater problems of collaborative filtering. Collaborative filtering systems use a database of user preferences to predict additional topics. Content-based filtering systems provide recommendations by matching user interests with topic attributes. In this paper, we describe a method for discovering user preferences that combines the two recommendation techniques and allows machine learning algorithms to be applied. The proposed collaborative filtering method clusters users with a genetic algorithm based on items categorized by a Naive Bayes classifier, and the content-based filtering method builds a user profile by extracting user interests through relevance feedback. We evaluate our method on a large database of user ratings for web documents, and it significantly outperforms previously proposed methods.

Learning Rules for AMR of Collision Avoidance using Fuzzy Classifier System (퍼지 분류자 시스템을 이용한 자율이동로봇의 충돌 회피학습)

  • 반창봉;심귀보
    • Journal of the Korean Institute of Intelligent Systems / v.10 no.5 / pp.506-512 / 2000
  • In this paper, we propose a Fuzzy Classifier System (FCS) that enables a classifier system to map continuous inputs to outputs. The FCS is based on a fuzzy controller system combined with machine learning. Therefore, the antecedent and consequent of a classifier in the FCS are the same as those of a fuzzy rule. The FCS converts each input message into a fuzzified message and stores it in the message list. The FCS constructs a rule base by matching the messages in the message list against the classifiers in the fuzzy classifier list. The FCS verifies the effectiveness of classifiers using the Bucket Brigade algorithm. The FCS also employs genetic algorithms to generate new rules and modify existing rules when the performance of the system needs to be improved. The FCS then finds the set of effective rules. We verify the effectiveness of the proposed FCS by applying it to an autonomous mobile robot that avoids obstacles and reaches a goal.

Effective Intrusion Detection using Evolutionary Neural Networks (진화신경망을 이용한 효과적 인 침입탐지)

  • Han Sang-Jun;Cho Sung-Bae
    • Journal of KIISE:Information Networking / v.32 no.3 / pp.301-309 / 2005
  • Learning program behavior from system call audit data using machine learning techniques is an effective intrusion detection method. Rule learning, neural networks, statistical techniques, and hidden Markov models are representative methods for intrusion detection. Among them, neural networks are known for their good performance in learning system call sequences. To apply them successfully to real-world problems, it is important to determine their structure. However, finding an appropriate structure takes a very long time because there is no formal method for determining the structure of a network. In this paper, a novel intrusion detection technique using evolutionary neural networks is proposed. Evolutionary neural networks have the advantage that superior neural networks can be obtained in a shorter time than with conventional neural networks, because they learn the structure and weights of the neural network simultaneously. Experimental results on the 1999 DARPA IDEVAL data confirm that evolutionary neural networks are effective for intrusion detection.

Learning of Rules for Edge Detection of Image using Fuzzy Classifier System (퍼지 분류가 시스템을 이용한 영상의 에지 검출 규칙 학습)

  • 정치선;반창봉;심귀보
    • Journal of the Korean Institute of Intelligent Systems / v.10 no.3 / pp.252-259 / 2000
  • In this paper, we propose a Fuzzy Classifier System (FCS) to find a set of fuzzy rules that can carry out edge detection on an image. The FCS is based on a fuzzy logic system combined with machine learning. Therefore, the antecedent and consequent of a classifier in the FCS are the same as those of a fuzzy rule. There are two approaches to acquiring appropriate fuzzy rules by evolutionary computation: the Michigan approach and the Pittsburgh approach. In this paper, we use the Michigan style, in which a single fuzzy if-then rule is coded as an individual. The FCS also employs genetic algorithms to generate new rules and modify existing rules when the performance of the system needs to be improved. The proposed method is evaluated by applying it to edge detection in a gray-level image, a pre-processing step in computer vision. The differences between the average gray levels of the vertical and horizontal arrays of neighborhood pixels are represented as fuzzy sets, and fuzzy if-then rules then decide whether the center pixel is an edge pixel. We compare the resulting image with a conventional edge image obtained by another edge detection method such as Sobel edge detection.
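The kind of rule such a system might evaluate can be illustrated with one hand-written fuzzy rule on a 3x3 window. Everything here is an assumption for illustration: the ramp membership function, the `full_scale` value, the single rule, and the function names. The paper learns its rules with a genetic algorithm rather than fixing them by hand.

```python
def large(diff, full_scale=64.0):
    """Membership of |diff| in the fuzzy set LARGE: a ramp from 0 at 0 to 1 at full_scale."""
    return min(abs(diff) / full_scale, 1.0)

def edge_degree(window):
    """Degree (0..1) to which the centre of a 3x3 gray-level window is an edge pixel."""
    def mean(values):
        return sum(values) / len(values)

    left = mean([row[0] for row in window])
    right = mean([row[2] for row in window])
    top = mean(window[0])
    bottom = mean(window[2])
    # Rule: IF the horizontal difference is LARGE OR the vertical difference is LARGE
    # THEN the centre pixel is an edge. Fuzzy OR is taken as the maximum.
    return max(large(right - left), large(bottom - top))

flat = [[50, 50, 50], [50, 50, 50], [50, 50, 50]]
step = [[10, 10, 200], [10, 10, 200], [10, 10, 200]]
print(edge_degree(flat), edge_degree(step))  # 0.0 1.0
```

In a Michigan-style system each such rule would be one individual in the population, and the genetic algorithm would adjust which fuzzy sets appear in the antecedent.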
