• Title/Summary/Keyword: genetic algorithm

Efficient Feature Selection Based Near Real-Time Hybrid Intrusion Detection System (근 실시간 조건을 달성하기 위한 효과적 속성 선택 기법 기반의 고성능 하이브리드 침입 탐지 시스템)

  • Lee, Woosol;Oh, Sangyoon
    • KIPS Transactions on Computer and Communication Systems / v.5 no.12 / pp.471-480 / 2016
  • Recently, damage from cyber attacks on infrastructure, national defence, and security systems has been steadily increasing. In this situation, the military recognizes the importance of cyber warfare and is establishing cyber systems in preparation, regardless of whether a threat currently exists. Research on Intrusion Detection Systems (IDS), which play an important role in network defence, is therefore required. IDS approaches are divided into misuse detection and anomaly detection. Recent studies attempt to combine the two methods to maximize the advantages and minimize the disadvantages of both; the combination is called a Hybrid IDS. Previous approaches are unsuitable for near real-time network environments because of their computational complexity. This creates a need for an IDS structure that has a high detection rate and a low computational cost. In this paper, we propose a Hybrid IDS that hierarchically combines a C4.5 decision tree (misuse detection) and a Weighted K-means algorithm (anomaly detection). It detects malicious network packets effectively and with low complexity by applying an efficient feature selection technique based on mutual information and a genetic algorithm. We also improve the hierarchical structure of the IDS by reusing the feature weights in the anomaly detection stage. The experiments validate that the proposed Hybrid IDS achieves high detection accuracy (98.68%) and performance.
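The abstract above does not say how mutual information is computed or how it feeds the genetic algorithm. As a rough illustration only, the sketch below ranks discrete features by their mutual information with the class label, which is one common filter step before a wrapper search. The function names and the toy data are hypothetical, and the genetic search itself is not shown.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Mutual information I(X;Y) in bits between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi

def rank_features(rows, labels):
    """Return (feature indices sorted by decreasing MI with the label, MI scores)."""
    n_features = len(rows[0])
    scores = [mutual_information([r[j] for r in rows], labels)
              for j in range(n_features)]
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order, scores

# Hypothetical toy data: feature 0 copies the label, feature 1 is independent of it.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 0, 1, 1]
order, scores = rank_features(rows, labels)
```

On this toy data the informative feature scores 1 bit and the independent one scores 0, so a filter would keep feature 0 first.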

Design Optimization of Multi-element Airfoil Shapes to Minimize Ice Accretion (결빙 증식 최소화를 위한 다중 익형 형상 최적설계)

  • Kang, Min-Je;Lee, Hyeokjin;Jo, Hyeonseung;Myong, Rho-Shin;Lee, Hakjin
    • Journal of the Korean Society for Aeronautical & Space Sciences / v.50 no.7 / pp.445-454 / 2022
  • Ice accretion on aircraft components such as wings, fuselage, and empennage can occur when the aircraft encounters a cloud zone with high humidity and low temperature. Preventing ice accretion is important because it degrades aerodynamic performance and flight stability, which can lead to fatal safety problems. In this study, a shape design optimization of a multi-element airfoil is performed to minimize the amount of ice accretion on the high-lift device, which consists of a leading-edge slat, main element, and trailing-edge flap. The design optimization framework proposed in this paper consists of four major parts: air flow, droplet impingement, and ice accretion simulations, and a gradient-free optimization algorithm. A Reynolds-averaged Navier-Stokes (RANS) simulation is used to predict the aerodynamic performance and flow field around the multi-element airfoil at an angle of attack of 8°. Droplet impingement and ice accretion simulations are conducted using a multi-physics computational analysis tool. The objective is to minimize the total mass of ice accretion, and the design variables are the deflection angle, gap, and overhang of the flap and slat. A Kriging surrogate model is used to construct the response surface, providing rapid approximations of the time-consuming function evaluations, and a genetic algorithm is employed to find the optimal solution. As a result of the optimization, the total mass of ice accretion on the optimized multi-element airfoil is reduced by about 8% compared to the baseline configuration.

Optimal Designs of Urban Watershed Boundary and Sewer Networks to Reduce Peak Outflows (첨두유출량 저감을 위한 도시유역 경계 및 우수관망 최적 설계)

  • Lee, Jung-Ho;Jun, Hwan-Don;Kim, Joong-Hoon
    • Journal of the Korean Society of Hazard Mitigation / v.11 no.2 / pp.157-161 / 2011
  • Although much research has been carried out on watershed division in natural areas, the division of urban watersheds has not been studied. If the boundary between two urban areas is indistinct because there is no natural or administrative division between them, the boundary between urban areas that have different outlets (a multi-outlet urban watershed) is determined solely by the sewer system designer. The suggested urban watershed division model (UWDM) determines the watershed boundary so as to simultaneously reduce the peak outflows at the outlets of each watershed. The UWDM then determines the sewer network that reduces the peak outflow at each outlet by choosing the pipe connecting directions between manholes that have multiple possible connecting directions. Because modifying the sewer network changes the superposition of the runoff hydrographs in the sewer pipes, the optimal sewer layout reduces the peak outflow at the outlet to the extent that the superposition of the hydrographs is reduced. The UWDM can therefore optimize the watershed division in a multi-outlet urban watershed by determining the connecting directions of the boundary manholes using a genetic algorithm. The suggested model was applied to a multi-outlet urban watershed of 50.3 ha in Seoul, Korea; with the watershed division produced by the model, the peak outflows at the two outlets decreased by approximately 15% for the design rainfall.

An Integrated Model based on Genetic Algorithms for Implementing Cost-Effective Intelligent Intrusion Detection Systems (비용효율적 지능형 침입탐지시스템 구현을 위한 유전자 알고리즘 기반 통합 모형)

  • Lee, Hyeon-Uk;Kim, Ji-Hun;Ahn, Hyun-Chul
    • Journal of Intelligence and Information Systems / v.18 no.1 / pp.125-141 / 2012
  • These days, malicious attacks and hacks on networked systems are increasing dramatically, and their patterns are changing rapidly. Consequently, handling these attacks appropriately has become more important, and there is considerable interest in and demand for effective network security systems such as intrusion detection systems. Intrusion detection systems are network security systems that detect, identify, and respond appropriately to unauthorized or abnormal activities. Conventional intrusion detection systems have generally been designed using experts' implicit knowledge of network intrusions or hackers' abnormal behaviors. However, although they perform very well under normal conditions, they cannot handle new or unknown attack patterns. As a result, recent studies on intrusion detection systems use artificial intelligence techniques, which can respond proactively to unknown threats. Researchers have long adopted and tested various artificial intelligence techniques, such as artificial neural networks, decision trees, and support vector machines, to detect network intrusions. However, most studies have applied these techniques individually, even though combining them may lead to better detection. For this reason, we propose a new integrated model for intrusion detection. Our model combines the prediction results of four different binary classification models, which may be complementary to each other: logistic regression (LOGIT), decision trees (DT), artificial neural networks (ANN), and support vector machines (SVM). Genetic algorithms (GA) are used to find the optimal combining weights. The proposed model is built in two steps. In the first step, the integration model with the lowest prediction error (i.e., misclassification rate) is generated. 
In the second step, the model searches for the classification threshold for determining intrusions that minimizes the total misclassification cost. To calculate the total misclassification cost of an intrusion detection system, we need to understand its asymmetric error cost scheme. Generally, there are two common types of error in intrusion detection. The first is the False-Positive Error (FPE), in which normal activity is wrongly judged to be an intrusion, possibly resulting in unnecessary countermeasures. The second is the False-Negative Error (FNE), in which malicious activity is misjudged as normal. Compared to an FPE, an FNE is more fatal, so the total misclassification cost is affected more by FNE than by FPE. To validate the practical applicability of our model, we applied it to a real-world dataset for network intrusion detection. The experimental dataset was collected from the IDS sensor of an official institution in Korea from January to June 2010. We collected 15,000 log records in total and selected 10,000 samples from them by random sampling. We also compared the results of our model with those of the single techniques to confirm the superiority of the proposed model. LOGIT and DT were run using PASW Statistics v18.0, and ANN using Neuroshell R4.0. For SVM, LIBSVM v2.90, a free tool for training SVM classifiers, was used. Empirical results showed that the proposed GA-based model outperformed all the comparative models in detecting network intrusions in terms of accuracy. It also outperformed all the comparative models in terms of total misclassification cost. Consequently, we expect that our study may contribute to building cost-effective intelligent intrusion detection systems.
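The two-step procedure described in this abstract can be sketched roughly as follows. This is not the authors' implementation: the GA operators (truncation selection, averaging crossover, Gaussian mutation), the cost values `cost_fp=1` and `cost_fn=5`, and the toy scores are all assumptions chosen for illustration. Each row of `X` is taken to hold the intrusion scores from the four base models for one sample.

```python
import random

def combine(weights, preds):
    """Weighted sum of the base models' scores for one sample."""
    return sum(w * p for w, p in zip(weights, preds))

def error_rate(weights, X, y, threshold=0.5):
    """Step-1 fitness: fraction of samples misclassified at a fixed threshold."""
    wrong = sum((combine(weights, p) >= threshold) != bool(t) for p, t in zip(X, y))
    return wrong / len(y)

def normalize(ws):
    s = sum(ws)
    return [w / s for w in ws] if s > 0 else [1.0 / len(ws)] * len(ws)

def ga_weights(X, y, pop_size=20, generations=30, seed=0):
    """Step 1: search for combining weights with a small elitist GA."""
    rng = random.Random(seed)
    k = len(X[0])
    # Seed the population with equal weights so the result is never worse than that.
    pop = [[1.0 / k] * k]
    pop += [normalize([rng.random() for _ in range(k)]) for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=lambda w: error_rate(w, X, y))
        elite = pop[: pop_size // 2]              # keep the better half unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            child = [(u + v) / 2 + rng.gauss(0, 0.05) for u, v in zip(a, b)]
            children.append(normalize([max(c, 0.0) for c in child]))
        pop = elite + children
    return min(pop, key=lambda w: error_rate(w, X, y))

def total_cost(weights, X, y, t, cost_fp=1.0, cost_fn=5.0):
    """Asymmetric cost: a missed intrusion (FN) is assumed costlier than a false alarm (FP)."""
    cost = 0.0
    for p, label in zip(X, y):
        flagged = combine(weights, p) >= t
        if flagged and not label:
            cost += cost_fp
        elif not flagged and label:
            cost += cost_fn
    return cost

def best_threshold(weights, X, y):
    """Step 2: grid-search the threshold that minimizes total misclassification cost."""
    t = min((i / 100 for i in range(101)), key=lambda t: total_cost(weights, X, y, t))
    return t, total_cost(weights, X, y, t)

# Hypothetical scores from four models (LOGIT, DT, ANN, SVM) for six samples.
X = [(0.9, 0.8, 0.7, 0.6), (0.2, 0.1, 0.4, 0.3), (0.6, 0.4, 0.8, 0.7),
     (0.3, 0.6, 0.2, 0.1), (0.7, 0.9, 0.6, 0.8), (0.1, 0.2, 0.3, 0.4)]
y = [1, 0, 1, 0, 1, 0]
weights = ga_weights(X, y)
threshold, cost = best_threshold(weights, X, y)
```

Separating the two steps mirrors the abstract: the weights are tuned for plain accuracy first, and only the decision threshold is then shifted to reflect the asymmetric error costs.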

Simultaneous Optimization of KNN Ensemble Model for Bankruptcy Prediction (부도예측을 위한 KNN 앙상블 모형의 동시 최적화)

  • Min, Sung-Hwan
    • Journal of Intelligence and Information Systems / v.22 no.1 / pp.139-157 / 2016
  • Bankruptcy involves considerable costs, so it can have significant effects on a country's economy. Thus, bankruptcy prediction is an important issue. Over the past several decades, many researchers have addressed topics associated with bankruptcy prediction. Early research on bankruptcy prediction employed conventional statistical methods such as univariate analysis, discriminant analysis, multiple regression, and logistic regression. Later on, many studies began utilizing artificial intelligence techniques such as inductive learning, neural networks, and case-based reasoning. Currently, ensemble models are being utilized to enhance the accuracy of bankruptcy prediction. Ensemble classification involves combining multiple classifiers to obtain more accurate predictions than those obtained using individual models. Ensemble learning techniques are known to be very useful for improving the generalization ability of the classifier. Base classifiers in the ensemble must be as accurate and diverse as possible in order to enhance the generalization ability of an ensemble model. Commonly used methods for constructing ensemble classifiers include bagging, boosting, and random subspace. The random subspace method selects a random feature subset for each classifier from the original feature space to diversify the base classifiers of an ensemble. Each ensemble member is trained by a randomly chosen feature subspace from the original feature set, and predictions from each ensemble member are combined by an aggregation method. The k-nearest neighbors (KNN) classifier is robust with respect to variations in the dataset but is very sensitive to changes in the feature space. For this reason, KNN is a good classifier for the random subspace method. The KNN random subspace ensemble model has been shown to be very effective for improving an individual KNN model. 
The k parameter of the KNN base classifiers and the feature subsets selected for them play an important role in determining the performance of the KNN ensemble model. However, few studies have focused on optimizing the k parameter and feature subsets of base classifiers in the ensemble. This study proposes a new ensemble method that improves the performance of the KNN ensemble model by optimizing both the k parameters and the feature subsets of the base classifiers. A genetic algorithm was used to optimize the KNN ensemble model and improve its prediction accuracy. The proposed model was applied to a bankruptcy prediction problem using a real dataset from Korean companies. The research data comprised 1,800 externally non-audited firms, of which 900 went bankrupt and 900 did not. Initially, the dataset consisted of 134 financial ratios. Prior to the experiments, 75 financial ratios were selected based on an independent-sample t-test, with each financial ratio as an input variable and bankruptcy or non-bankruptcy as the output variable. Of these, 24 financial ratios were selected using logistic regression with backward feature selection. The complete dataset was separated into two parts: training and validation. The training dataset was further divided into two portions: one for training the model and the other for guarding against overfitting. The prediction accuracy on the latter portion was used as the fitness value. The validation dataset was used to evaluate the effectiveness of the final model. A 10-fold cross-validation was implemented to compare the performance of the proposed model with that of other models. The Q-statistic values and average classification accuracies of the base classifiers were also investigated. 
The experimental results showed that the proposed model outperformed other models, such as the single model and random subspace ensemble model.
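The abstract does not spell out the chromosome layout, so the following is only one plausible encoding: each gene pairs a k value with a binary feature mask for one base classifier, and fitness is the ensemble's accuracy on a held-out portion of the training data. The function names and toy data are hypothetical, and the GA operators that would evolve the chromosome are omitted.

```python
from collections import Counter

def knn_predict(train_X, train_y, x, k, mask):
    """k-nearest-neighbour vote using only the features where mask is 1."""
    idx = [j for j, m in enumerate(mask) if m]
    dists = sorted(
        (sum((row[j] - x[j]) ** 2 for j in idx), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

def ensemble_predict(train_X, train_y, x, chromosome):
    """chromosome: list of (k, mask) genes, one per base classifier; majority vote."""
    votes = Counter(knn_predict(train_X, train_y, x, k, mask)
                    for k, mask in chromosome)
    return votes.most_common(1)[0][0]

def fitness(chromosome, train_X, train_y, val_X, val_y):
    """Accuracy of the decoded ensemble on the held-out portion."""
    hits = sum(ensemble_predict(train_X, train_y, x, chromosome) == t
               for x, t in zip(val_X, val_y))
    return hits / len(val_y)

# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
train_X = [(0.0, 5.0), (0.1, -3.0), (0.2, 4.0), (1.0, -4.0), (0.9, 5.0), (1.1, -5.0)]
train_y = [0, 0, 0, 1, 1, 1]
val_X = [(0.05, -4.5), (0.95, 4.5)]
val_y = [0, 1]

good = [(1, (1, 0)), (3, (1, 0)), (1, (1, 0))]   # all members use the informative feature
bad = [(1, (0, 1))]                              # single member using only the noisy feature
good_fit = fitness(good, train_X, train_y, val_X, val_y)
bad_fit = fitness(bad, train_X, train_y, val_X, val_y)
```

A genetic algorithm would then mutate the k values and flip mask bits, keeping chromosomes with higher fitness.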

Comparison between Uncertainties of Cultivar Parameter Estimates Obtained Using Error Calculation Methods for Forage Rice Cultivars (오차 계산 방식에 따른 사료용 벼 품종의 품종모수 추정치 불확도 비교)

  • Young Sang Joh;Shinwoo Hyun;Kwang Soo Kim
    • Korean Journal of Agricultural and Forest Meteorology / v.25 no.3 / pp.129-141 / 2023
  • Crop models have been used to predict yield under diverse environmental and cultivation conditions, and these predictions can support decisions on the management of forage crops. Cultivar parameters are one of the required inputs to crop models and represent the genetic properties of a given forage cultivar. The objective of this study was to compare calibration and ensemble approaches in order to minimize the uncertainty of crop yield estimates obtained with the SIMPLE crop model. Cultivar parameters were calibrated using Log-likelihood (LL) and the Generic Composite Similarity Measure (GCSM) as objective functions for the Metropolis-Hastings (MH) algorithm. In total, 20 sets of cultivar parameters were generated for each method. Two types of ensemble approach were examined. The first was the average of the model outputs obtained with the individual parameter sets (Eem). The second was the model output obtained with the cultivar parameters averaged over the 20 sets (Epm). The comparison was made for each cultivar and each error calculation method. 'Jowoo' and 'Yeongwoo', which are forage rice cultivars used in Korea, were subject to the parameter calibration. Yield data were obtained from experimental fields at Suwon, Jeonju, Naju, and Iksan. Data for 2013, 2014, and 2016 were used for parameter calibration. For validation, yield data reported from 2016 to 2018 at Suwon were used. Initial calibration indicated that the genetic coefficients obtained by LL were distributed in a narrower range than those obtained by GCSM. A two-sample t-test was performed to compare the ensemble approaches, and no significant difference was found between them. The uncertainty of GCSM can be neutralized by adjusting the acceptance probability. The results for the second ensemble approach (Epm) indicate that the uncertainty can be reduced with less computation by using an ensemble approach.

Rehabilitation Priority Decision Model for Sewer Systems (하수관거시스템 개량 우선순위 결정 모형)

  • Lee, Jung-Ho;Park, Moo-Jong;Kim, Joong-Hoon
    • Journal of the Korean Society of Hazard Mitigation / v.8 no.6 / pp.7-14 / 2008
  • The main objective of sewer rehabilitation is to improve the function of the sewer while eliminating inflow/infiltration (I/I). If the amount of I/I for each individual pipe can be identified, the I/I of each sub-area can be estimated clearly. In practice, however, the amount of I/I for an individual pipe is almost impossible to obtain because of cost and time limitations. In this study, the I/I occurring in each sewer pipe is estimated using the AHP (Analytic Hierarchy Process), and the RPDM (Rehabilitation Priority Decision Model for sewer systems) was developed using the estimated I/I of each pipe to support efficient sewer rehabilitation. Based on the estimated I/I for each pipe, the RPDM uses a genetic algorithm to determine the optimal rehabilitation priority (ORP) of the sub-areas, minimizing the amount of I/I that occurs while the rehabilitation is being carried out. The benefit obtained by implementing the ORP is estimated solely from the waste water treatment cost (WWTC) of the I/I that occurs during the sewer rehabilitation period. The results of the ORP were compared with two other ways of ordering the rehabilitation of sub-areas: a numerical weighting method (NWM), which is the decision method used in general sewer rehabilitation practice, and the worst order. The ORP reduced the WWTC by 22% compared to the NWM and by 40% compared to the worst order.
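To make the idea of a GA-optimized rehabilitation order concrete, here is a minimal sketch under a simplified cost model that is an assumption, not the RPDM itself: sub-areas are rehabilitated one after another, and sub-area `i` keeps producing I/I at `rate[i]` until its own rehabilitation, lasting `duration[i]`, is finished. The order crossover, swap mutation, and all numbers are illustrative choices.

```python
import itertools
import random

def total_ii(order, rate, duration):
    """Total I/I volume accumulated while sub-areas are rehabilitated in `order`."""
    t, total = 0.0, 0.0
    for i in order:
        t += duration[i]          # time at which sub-area i is finished
        total += rate[i] * t      # it leaked at rate[i] until then
    return total

def order_crossover(a, b, rng):
    """Keep a random slice of parent a in place; fill the rest in parent b's order."""
    i, j = sorted(rng.sample(range(len(a) + 1), 2))
    middle = a[i:j]
    rest = [g for g in b if g not in middle]
    return rest[:i] + middle + rest[i:]

def ga_priority(rate, duration, pop_size=30, generations=50, seed=0):
    rng = random.Random(seed)
    n = len(rate)
    # Seed with the identity order so the result is never worse than it.
    pop = [list(range(n))] + [rng.sample(range(n), n) for _ in range(pop_size - 1)]
    cost = lambda o: total_ii(o, rate, duration)
    for _ in range(generations):
        pop.sort(key=cost)
        elite = pop[: pop_size // 2]              # elitism: better half survives
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            child = order_crossover(a, b, rng)
            if rng.random() < 0.2:                # swap mutation
                p, q = rng.sample(range(n), 2)
                child[p], child[q] = child[q], child[p]
            children.append(child)
        pop = elite + children
    return min(pop, key=cost)

# Hypothetical I/I rates and rehabilitation durations for five sub-areas.
rate = [4.0, 1.0, 3.0, 2.0, 5.0]
duration = [2.0, 1.0, 3.0, 1.0, 2.0]
best = ga_priority(rate, duration)
best_cost = total_ii(best, rate, duration)
# Brute force is feasible for five sub-areas and gives a reference optimum.
optimum = min(total_ii(p, rate, duration) for p in itertools.permutations(range(5)))
```

For a handful of sub-areas brute force is enough; a permutation-encoded GA becomes useful when the number of sub-areas makes enumeration impractical or the cost model is more involved than this one.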

Estimation of Variance Component and Environment Effects on Somatic Cell Scores by Parity in Dairy Cattle (젖소집단의 산차에 따른 체세포점수의 환경효과 및 분산성분 추정)

  • 조광현;나승환;서강석;김시동;박병호;이영창;박종대;손삼규;최재관
    • Journal of Animal Science and Technology / v.48 no.1 / pp.39-48 / 2006
  • This study used test-day somatic cell score data of dairy cattle from 2000 to 2004. The numbers of records used were 124,635 for the first parity, 134,308 for the second, 77,862 for the third, 41,787 for the fourth, and 37,412 for the fifth. The data were analyzed by the least squares mean method using GLM to estimate the effects of calving year, age, lactation stage, parity, and season on the somatic cell score. Variance components were estimated with a test-day model using the expectation maximization algorithm-restricted maximum likelihood (EM-REML) method. In each parity, the somatic cell score was low in the younger group and relatively high in the older groups. Likewise, for lactation stage, the score was low in early lactation and high in late lactation in the first and second parities. For the third, fourth, and fifth parities, however, a high somatic cell score was observed in mid-lactation. Generally, the score was high at the peak, although in the fourth and fifth parities the score was low in late lactation. Regarding the environmental effect of season, the somatic cell score was generally low from September to November for all parities. The score was high between June and August, when milk production is usually low. The heritability estimates were 0.05, 0.09, 0.10, 0.05, and 0.05 for parities 1, 2, 3, 4, and 5, respectively. The genetic variance was estimated to be high in early lactation in the second, third, and fifth parities and low in the first and fourth parities.

Development of the Dynamic Model for the Metabolic Network of Clostridium acetobutylicum (Clostridium acetobutylicum의 대사망의 동적모델 개발)

  • Kim, Woohyun;Eom, Moon-Ho;Lee, Sang-Hyun;Choi, Jin-Dal-Rae;Park, Sunwon
    • Korean Chemical Engineering Research / v.51 no.2 / pp.226-232 / 2013
  • Biobutanol is produced by fermentation processes using clostridia, which mainly produce acetone, butanol, and ethanol. In this work, a dynamic model describing the metabolic reactions in an acetone-butanol-ethanol (ABE)-producing clostridium, Clostridium acetobutylicum ATCC824, is proposed. Because of the complexity of the clostridial metabolism, the 58 kinetic parameters of the metabolic network model were estimated from experimental data obtained in a batch fermentor using an efficient optimization method that combines a genetic algorithm with the Levenberg-Marquardt method. To verify the estimated parameters, the developed metabolic model was evaluated against experiments in which a genetically modified clostridium was used and the initial glucose concentration was changed. We found that the developed kinetic model describes the dynamic metabolic state of the clostridium sufficiently well. This dynamic model of the metabolic reactions will therefore contribute to designing both the clostridium and the fermentor for higher productivity.

Large-scale Virtual Power Plant Management Method Considering Variable and Sensitive Loads (가변 및 민감성 부하를 고려한 대단위 가상 발전소 운영 방법)

  • Park, Yong Kuk;Lee, Min Goo;Jung, Kyung Kwon;Lee, Yong-Gu
    • Journal of the Institute of Electronics and Information Engineers / v.52 no.5 / pp.225-234 / 2015
  • A Virtual Power Plant (VPP) is an aggregation of distributed energy resources, such as Distributed Generation (DG), Combined Heat and Power generation (CHP), Energy Storage Systems (ESS), and loads, that operates as a single power plant by using Information and Communication Technologies (ICT). VPPs have so far been developed and verified on a single virtual plant platform connected to a number of different distributed energy resources. As the number of distributed energy resources in a VPP increases, so does the amount of data they generate. Moreover, operating a virtual plant platform in a centralized manner over a widespread region is inefficient in terms of both technique and cost. This paper proposes the concept of a large-scale VPP that can reduce the error probability of the system's load and increase the robustness of data exchange among distributed energy resources. In addition, it can directly control and supervise energy resources by forming small virtual platforms that perform optimal resource scheduling with variable and sensitive loads taken into account within the large-scale VPP. The results are verified by simulation.