• Title/Summary/Keyword: Wrapper

Search Result 187, Processing Time 0.027 seconds

Classification Performance Improvement of UNSW-NB15 Dataset Based on Feature Selection (특징선택 기법에 기반한 UNSW-NB15 데이터셋의 분류 성능 개선)

  • Lee, Dae-Bum;Seo, Jae-Hyun
    • Journal of the Korea Convergence Society / v.10 no.5 / pp.35-42 / 2019
  • With the spread of the Internet and wearable devices, Internet technology has made it more convenient to obtain information and do business. However, as the Internet is used in more areas, the attack surface exposed to attackers grows, and attempts to intrude into networks for unfair advantage, such as cyber terrorism, are also increasing. In this paper, we propose a feature selection method that improves classification performance when classifying abnormal behavior in network traffic. The UNSW-NB15 dataset suffers from class imbalance: rare classes have relatively few instances compared to other classes, and we use undersampling to mitigate this. We use the SVM, k-NN, and decision tree algorithms and, through training and verification, extract the feature subsets with superior detection accuracy and RMSE. In the wrapper-based experiments the subsets achieved recall values above 98%, and DT_PSO showed the best performance.
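
As a rough illustration of what "wrapper-based" means in the abstract above, the sketch below scores candidate feature subsets by the accuracy of a classifier trained on them. It is not the paper's method: it assumes a greedy forward search instead of PSO, a 1-nearest-neighbour classifier instead of the SVM/k-NN/decision tree set, and a made-up toy dataset. The names `loo_accuracy` and `forward_select` are hypothetical.

```python
def loo_accuracy(X, y, features):
    """Leave-one-out accuracy of a 1-NN classifier restricted to `features`."""
    correct = 0
    for i, (xi, yi) in enumerate(zip(X, y)):
        best_dist, best_label = None, None
        for j, (xj, yj) in enumerate(zip(X, y)):
            if i == j:
                continue  # never let a sample vote for itself
            dist = sum((xi[f] - xj[f]) ** 2 for f in features)
            if best_dist is None or dist < best_dist:
                best_dist, best_label = dist, yj
        correct += best_label == yi
    return correct / len(X)


def forward_select(X, y):
    """Greedy wrapper: add the feature that most improves accuracy, stop when none does."""
    selected, best_score = [], 0.0
    remaining = set(range(len(X[0])))
    while remaining:
        # The classifier itself is the evaluation function (the "wrapper" idea).
        score, _, feature = max(
            (loo_accuracy(X, y, selected + [f]), -f, f) for f in remaining
        )
        if score <= best_score:
            break
        selected.append(feature)
        remaining.remove(feature)
        best_score = score
    return selected, best_score


# Toy data (assumed): feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 5.0], [0.1, 1.0], [0.2, 9.0], [1.0, 5.1], [1.1, 0.9], [1.2, 9.2]]
y = [0, 0, 0, 1, 1, 1]
selected, score = forward_select(X, y)
print(selected, score)
```

Because every candidate subset requires retraining and re-evaluating the classifier, wrapper searches are usually far more expensive than filter methods, which is why the search strategy matters.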

The Credit Information Feature Selection Method in Default Rate Prediction Model for Individual Businesses (개인사업자 부도율 예측 모델에서 신용정보 특성 선택 방법)

  • Hong, Dongsuk;Baek, Hanjong;Shin, Hyunjoon
    • Journal of the Korea Society for Simulation / v.30 no.1 / pp.75-85 / 2021
  • In this paper, we present a deep neural network-based prediction model that processes and analyzes the corporate and personal credit information of individual business owners, as a new method for predicting the default rate of individual businesses more accurately. Feature selection techniques have been actively studied in modeling research across many fields as a way to improve performance, especially in predictive models with many features. After statistically verifying the macroeconomic indicators (macro variables) and credit information (micro variables) used as input variables in the default rate prediction model, we applied the credit information feature selection method to identify the final feature set that improves prediction performance. The proposed method is an iterative, hybrid method that combines filter-based and wrapper-based selection: it builds submodels, constructs subsets by extracting the important variables of the best-performing submodels, and determines the final feature set by analyzing the prediction performance of the subsets and of their combined set.

New Automatic Taxonomy Generation Algorithm for the Audio Genre Classification (음악 장르 분류를 위한 새로운 자동 Taxonomy 구축 알고리즘)

  • Choi, Tack-Sung;Moon, Sun-Kook;Park, Young-Cheol;Youn, Dae-Hee;Lee, Seok-Pil
    • The Journal of the Acoustical Society of Korea / v.27 no.3 / pp.111-118 / 2008
  • In this paper, we propose a new automatic taxonomy generation algorithm for audio genre classification. The proposed algorithm automatically generates a hierarchical taxonomy based on the estimated classification accuracy at all possible nodes. Classification accuracy is estimated by applying the training data to the classifier using k-fold cross validation. It is then tested at every node consisting of two clusters by applying a one-versus-one support vector machine. To assess the performance of the proposed algorithm, we extracted various features representing characteristics such as timbre, rhythm, and pitch. We then compared classification performance using the proposed algorithm and previous flat classifiers. Classification accuracy reaches 89 percent with the proposed scheme, which is 5 to 25 percent higher than previous flat classification methods. With low-dimensional feature vectors in particular, it is 10 to 25 percent higher than previous algorithms in the classification experiments.

Systems for Pill Recognition and Medication Management using Deep Learning (딥러닝을 활용한 알약인식 및 복용관리 시스템)

  • Kang-Hee Kim;So-Hyeon Kim;Da-Ham Jung;Bo-Kyung Lee
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.24 no.1 / pp.9-16 / 2024
  • It is difficult to know the efficacy of a pill if its bag or wrapper is lost after purchase. Many people do not sort commercial pills by use when storing them after purchasing and taking them, so the inaccessibility of information on side effects leads to misuse. Even with existing applications that search for and provide information about pills, users have to enter the details of the pills themselves. To address this, we develop a pill recognition application by building a model that learns the formulation and colour of pills from 22,000 photos provided by a Pharmaceutical Information Institution. We also develop a pill medication management function.

Ensemble Based Optimal Feature Selection Algorithm for Efficient Intrusion Detection in Wireless Sensor Network

  • Shyam Sundar S;R.S. Bhuvaneswaran;SaiRamesh L
    • KSII Transactions on Internet and Information Systems (TIIS) / v.18 no.8 / pp.2214-2229 / 2024
  • A wireless sensor network (WSN) consists of a large number of sensor nodes deployed across geographical locations to collect sensed information, process data, and communicate it to a control station for further processing. Because sensors are deployed in hostile environments, malicious nodes may be present and perform malicious activities in the network. These security threats affect the performance and lifetime of sensor networks. Several mechanisms address security issues in WSNs, namely cryptography, trust management, intrusion detection systems (IDS), and intrusion prevention systems (IPS). An IDS detects malicious activities and raises an alarm. These malicious activities exploit vulnerabilities in the network layer and affect all layers in the network. Existing filter-based feature selection methods do not consider redundancy among the selected features, and wrapper methods carry a high risk of overfitting the intrusion classifier. Due to overfitting, the classification algorithm fails to detect intrusions reliably. The main objective of this paper is to provide an efficient feature selection algorithm that is suitable for any type of classification algorithm and detects intrusions effectively. In this paper, network security is addressed by proposing a Feature Selection Algorithm using Chi Squared with Ensemble Method (FSChE). The proposed scheme combines a decision tree with the random forest classification algorithm to form an ensemble classifier. Experimental results on the NSL KDD cup data set justify the feasibility of the proposed scheme in terms of attack detection, packet delivery ratio, and time analysis. The results show that the proposed ensemble method increases overall performance by 10% to 25% with respect to these parameters.
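
The chi-squared part of the scheme above can be illustrated with a small filter-style ranking. This is only a sketch of the general chi-squared statistic between a categorical feature and the class label; it makes no claim about how FSChE itself bins features, sets thresholds, or couples the ranking to the decision tree and random forest ensemble. The helper names and the toy records are assumptions.

```python
from collections import Counter


def chi2_score(feature, labels):
    """Chi-squared statistic between one categorical feature and the class labels."""
    n = len(labels)
    feature_counts = Counter(feature)
    label_counts = Counter(labels)
    observed = Counter(zip(feature, labels))
    score = 0.0
    for f_value, f_count in feature_counts.items():
        for l_value, l_count in label_counts.items():
            # Expected cell count if feature and label were independent.
            expected = f_count * l_count / n
            score += (observed[(f_value, l_value)] - expected) ** 2 / expected
    return score


def rank_features(X, y):
    """Return feature indices ordered from most to least class-dependent."""
    columns = list(zip(*X))
    scores = [chi2_score(col, y) for col in columns]
    order = sorted(range(len(columns)), key=lambda i: -scores[i])
    return order, scores


# Toy records (assumed): feature 0 tracks the label, feature 1 is independent of it.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = ["normal", "normal", "attack", "attack"]
order, scores = rank_features(X, y)
print(order, scores)
```

A ranking like this scores each feature on its own, so it cannot detect redundancy between two highly ranked features; that limitation of filter methods is the one the abstract points out.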

A Study on the Prediction Model of Stock Price Index Trend based on GA-MSVM that Simultaneously Optimizes Feature and Instance Selection (입력변수 및 학습사례 선정을 동시에 최적화하는 GA-MSVM 기반 주가지수 추세 예측 모형에 관한 연구)

  • Lee, Jong-sik;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems / v.23 no.4 / pp.147-168 / 2017
  • Accurate stock market forecasting has long been studied in academia, and many forecasting models using a variety of techniques now exist. Recently, many attempts have been made to predict the stock index using machine learning methods, including deep learning. Both fundamental analysis and technical analysis are used in traditional stock investment, but technical analysis is more useful for short-term transaction prediction and for applying statistical and mathematical techniques. Most studies using technical indicators have modeled stock price prediction as a binary classification, rising or falling, of market movement in the future (usually the next trading day). However, binary classification has many drawbacks for predicting trends, identifying trading signals, or signaling portfolio rebalancing. In this study, we predict the stock index by extending the existing binary scheme to a multi-class scheme of stock index trends (upward trend, boxed, downward trend). Techniques such as Multinomial Logistic Regression Analysis (MLOGIT), Multiple Discriminant Analysis (MDA), and Artificial Neural Networks (ANN) can be applied to this multi-class problem; we instead build on Multi-classification Support Vector Machines (MSVM), which have proved superior in prediction performance, and propose an optimization model that uses a Genetic Algorithm as a wrapper to improve the MSVM model. In particular, the proposed model, named GA-MSVM, is designed to maximize model performance by optimizing not only the kernel function parameters of MSVM but also the selection of input variables (feature selection) and instance selection.
To verify the performance of the proposed model, we applied it to real data. The results show that the proposed method is more effective than the conventional MSVM, which has so far been known to show the best prediction performance, as well as existing artificial intelligence and data mining techniques such as MDA, MLOGIT, and CBR. In particular, instance selection was confirmed to play a very important role in predicting the stock index trend, contributing more to model improvement than the other factors. To verify the usefulness of GA-MSVM, we applied it to forecasting the trend of Korea's KOSPI200 stock index. Our research is primarily aimed at predicting trend segments in order to capture trading signals or short-term trend transition points. The experimental data set includes technical indicators such as the price and volatility index of the KOSPI200 stock index in Korea (2004 to 2017) and macroeconomic data (interest rate, exchange rate, S&P 500, etc.). Using a variety of statistical methods, including one-way ANOVA and stepwise MDA, 15 indicators were selected as candidate independent variables. The dependent variable, trend classification, took one of three states: 1 (upward trend), 0 (boxed), and -1 (downward trend). For each class, 70% of the data was used for training and the remaining 30% for verification. To verify the performance of the proposed model, comparative experiments were conducted with MDA, MLOGIT, CBR, ANN, and MSVM. MSVM adopted the One-Against-One (OAO) approach, which is known as the most accurate of the various MSVM approaches. Although there are some limitations, the final experimental results demonstrate that the proposed model, GA-MSVM, performs at a significantly higher level than all comparative models.
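
To make the idea of optimizing features, instances, and kernel parameters "simultaneously" more concrete, the sketch below shows one possible way a single GA chromosome could encode all three. The abstract does not describe the actual encoding used by GA-MSVM, so the bit layout, the 4-bit kernel parameter, its value range, and the function name `decode_chromosome` are all assumptions made for illustration.

```python
def decode_chromosome(bits, n_features, n_instances, gamma_bits=4):
    """Split one bit string into a feature mask, an instance mask, and a kernel parameter.

    Assumed layout: [feature bits | instance bits | gamma bits].
    """
    expected_len = n_features + n_instances + gamma_bits
    if len(bits) != expected_len:
        raise ValueError(f"expected {expected_len} bits, got {len(bits)}")

    feature_mask = bits[:n_features]
    instance_mask = bits[n_features:n_features + n_instances]
    gamma_code = bits[n_features + n_instances:]

    features = [i for i, b in enumerate(feature_mask) if b]    # selected input variables
    instances = [i for i, b in enumerate(instance_mask) if b]  # selected training cases
    exponent = int("".join(str(b) for b in gamma_code), 2)     # 0 .. 2**gamma_bits - 1
    gamma = 2.0 ** (exponent - 8)                              # hypothetical search range
    return features, instances, gamma


# 3 candidate features, 4 training instances, 4 bits for the kernel parameter.
chromosome = [1, 0, 1,  1, 1, 0, 1,  1, 0, 0, 0]
print(decode_chromosome(chromosome, n_features=3, n_instances=4))
```

In a wrapper setup, the GA's fitness function would train the classifier on the decoded instances and features with the decoded parameter and return its validation accuracy, so selection, crossover, and mutation act on all three choices at once.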

Design and Implementation of DEVSim++ and DiskSim Interface for Interoperation of System-level Simulation and Disk I/O-level Simulation (시스템수준 시뮬레이션과 디스크 I/O수준 시뮬레이션 연동을 위한 DEVSim++과 DiskSim 사이의 인터페이스 설계 및 구현)

  • Song, Hae Sang;Lee, Sun Ju
    • Journal of the Korea Society of Computer and Information / v.18 no.4 / pp.131-140 / 2013
  • This paper deals with the design and implementation of an interface for interoperation between DiskSim, a well-known disk simulator, and a system-level simulator based on DEVSim++. Such interoperating simulation aims to evaluate the overall performance of storage systems that consist of multiple computer nodes with a variety of I/O-level specifications. DEVSim++, a well-known system-level simulation framework, is based on the DEVS formalism, which provides sound semantics for modular and hierarchical modeling at the discrete event systems level, such as multi-node computer systems. For maintainability, we assume that the source code of the two heterogeneous simulation engines is left unchanged. We therefore adopt a notion of simulator interoperation that requires a means to synchronize simulation times and to exchange messages between the simulators. DiskSimManager is designed and implemented as the interface for this interoperation. Various experiments comparing the results of standalone DiskSim simulation with those of the interoperating simulation using DiskSimManager showed that DiskSimManager works correctly as an interface between DEVSim++ and DiskSim.
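
The time-synchronization requirement mentioned above can be sketched with two toy event-driven simulators that a coordinator advances in lockstep to the earliest pending event. This does not use or imitate the DEVSim++, DiskSim, or DiskSimManager interfaces; `ToySimulator` and `run_lockstep` are hypothetical names, and message exchange between the simulators is left out.

```python
import heapq


class ToySimulator:
    """A stand-in simulator that only knows the times of its own pending events."""

    def __init__(self, name, event_times):
        self.name = name
        self.now = 0.0
        self._queue = list(event_times)
        heapq.heapify(self._queue)

    def next_event_time(self):
        return self._queue[0] if self._queue else float("inf")

    def advance_to(self, t):
        """Fire every event scheduled at or before t, then set the local clock to t."""
        fired = []
        while self._queue and self._queue[0] <= t:
            fired.append((heapq.heappop(self._queue), self.name))
        self.now = t
        return fired


def run_lockstep(simulators):
    """Coordinator: repeatedly advance all simulators to the globally earliest event."""
    trace = []
    while True:
        t = min(sim.next_event_time() for sim in simulators)
        if t == float("inf"):
            return trace  # no simulator has work left
        for sim in simulators:
            trace.extend(sim.advance_to(t))


system = ToySimulator("system", [1.0, 4.0])
disk = ToySimulator("disk", [2.5, 4.0])
print(run_lockstep([system, disk]))
```

Keeping the coordinator outside both engines mirrors the constraint stated in the abstract that neither simulator's source code is modified; only the thin adapter around each engine would need to know how to query and advance it.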

An Analysis of Suicidal Accidents on Psychiatric In-patients (입원중 정신병 환자의 자살사고 요인 분석)

  • 이평숙
    • Journal of Korean Academy of Nursing / v.5 no.2 / pp.11-22 / 1975
  • Suicide is considered one of the grave problems of modern societies. According to recent police statistics of the Republic of Korea, 28.6 suicides per 100,000 were reported. Psychiatric patients are believed to be predisposed to suicidal tendencies. This study was performed to investigate the characteristics of suicide attempts and to analyse the environmental factors involved in the suicidal accidents of patients admitted to psychiatric hospitals. Records of 66 suicidal accidents from three psychiatric hospitals during the period January 1971 through June 1971 were sampled. Data were analysed by percentile score. Results are as follows: 1. The age group of 21~30 yrs. showed the highest frequency of suicide attempts (50.0%). Among the successful suicides, the age group of 31~40 yrs. in men and the age group of 21~30 yrs. in women showed the highest frequency. Among the unsuccessful suicides, the age group of 21~30 yrs. in both sexes showed the highest frequency. 2. Suicidal accidents occurred more frequently among the unmarried (63.6%). Among the successful suicides, a higher frequency was shown among the unmarried in men, and the reverse in women. Among the unsuccessful, the unmarried in both sexes showed the highest frequency. 3. Schizophrenia showed the highest frequency of suicide attempts (81.8%). 4. Suicides were most frequently attempted in the spring (46.9%). Among the successful suicides, the highest frequencies were shown in men in the winter and in women in the summer. Among the unsuccessful suicides, the highest frequencies were shown in men in the winter and in women in the spring. 5. Suicide attempts occurred most frequently in hospital wards (40.9%). In women, unsuccessful attempts were most frequent while on authorized leave at their homes. 6. Hanging was the most frequently adopted method for suicide attempts (31.8%).
Among the successful suicides, hanging was the most frequent method in men, while in women it was drug overdose. Among the unsuccessful suicides, stabbing with sharp instruments was the most frequent, while in women drug overdose was adopted as well. 7. The most frequently adopted instruments for the different suicide attempts were household wrappers (26.3%) in cases of hanging, knives (31.8%) in cases of stabbing, and drugs. 8. Suicide attempts occurred most frequently from dawn through early morning (2-6 A.M.) (34.8%). Among the successful suicides, the most frequent time of occurrence on weekdays was dawn, while on holidays it was the evening as well. Among the unsuccessful, the most frequent time of occurrence was the daytime hours, while on holidays it was dawn. 9. Suicide attempts within the hospital ward were most frequently first noticed by nurses (42.2%). 10. Manifestations such as restlessness, depression, and self-depreciation were the most frequent behavioral characteristics preceding a suicide attempt. 11. Among the successful suicides, manifestations of physical damage were found on the neck, while among unsuccessful attempts the damage was found on the extremities.

The Application of HACCP System to Soybean Curd and Its Effectiveness (두부류에 대한 HACCP 적용 및 성과)

  • Park, Wan-Hee;Lee, Sung-Hak
    • Journal of Food Hygiene and Safety / v.18 no.4 / pp.202-210 / 2003
  • This study aims to develop a HACCP (Hazard Analysis Critical Control Point) plan for soybean curd and to verify its effectiveness. First, we developed a general HACCP model according to the guidelines of Codex (FAO/WHO), and then applied the model to 4 soybean curd workshops for 3 months. The HACCP model is composed of the following procedures: HACCP team organization, production description, work flow chart, hazard analysis, CCP (critical control point) decision, CL (critical limit) establishment, monitoring method decision, correction, verification, and documentation. For non-wrapped soybean curd, the CCPs were the selection and refrigeration procedures. For wrapped soybean curd, the CCPs were selection, heat sterilization, and refrigeration. In bacterial tests after applying the model for 3 months, the bacterial counts of the soybean curd box, the wrapper, and the soybean curd product were lower than before applying the model. We could verify that applying the HACCP model was effective in the soybean curd workshops.

Studies on the Forage Production and Utilization on Paddy Field in Korea (한국에 있어서 답리작을 이용한 양질 조사료 생산기술)

  • Seo, Sung;Yook, W.B.
    • Proceedings of the Korean Society of Grassland Science Conference / 2002.09b / pp.5-56 / 2002
  • This paper evaluates the problems in current domestic forage production and suggests prospective improvements. Grassland development in forest, production of high quality forages in upland and paddy land, efficient utilization of rice straw, development of new forage varieties suitable for our environmental conditions, and imported forages are described. Among these, the preferential production and utilization of forages in paddy fields after rice harvest should be expanded for the domestic supply of forages in Korea. From 1999 to 2001, several studies were carried out to select the promising forage crops and barley cultivars for whole crop silage production, to determine the productivity, nutritive value, and production cost of forages produced in paddy fields, and to measure the feeding effect on Hanwoo and milking cows of whole crop silage made from forages produced in paddy fields. Restraining factors and activation plans for expanding forage production in paddy land are also discussed. The promising forage crops in paddy fields were rye and barley for the Middle region, and rye, barley, early maturing Italian ryegrass, and wheat for the Southern region. The promising barley cultivars for whole crop silage in paddy fields were Albori in Suwon; Keunalbori, Milyang 92, Saessalbori, and Naehanssalbori in Iksan; and Keunalbori, Albori, Naehanssalbori, and Saegangbori in Milyang. Silage production, quality, and animal palatability of trench silage and round bale silage were also compared. The production yield of whole crop barley silage (WBS) was 17,135 kg as fresh matter and 6,011 kg as dry matter per ha, and the quality of WBS was grade 2∼3, while that of rice straw silage was grade 4 on a farm basis. The production cost of WBS per kg was 83 won as fresh matter and 238 won as dry matter. Feeding WBS as forage to Hanwoo was very desirable for improving live-weight gain, beef quality, and farm income, particularly in the growing stage of Hanwoo.
Milk production and income also increased, and feed cost decreased, with WBS feeding. The daily voluntary intake of WBS in milking cows was 26.3 kg as fresh matter (DM 7.7 kg) per head. Milk production when WBS was fed was very similar to that with imported hay such as Kentucky bluegrass or with domestic corn silage. The issues to be solved in the near future for stable forage production and supply in paddy land are: a sustainable livestock-forage policy; development and seed production of new varieties of barley, rye, Italian ryegrass, and other promising forages; an efficient demand and supply system for forages; consolidation for mass production and utilization of forages; efficient application management of animal slurry on paddy fields in view of environmentally friendly agriculture and livestock industry; and breakthrough development of bottleneck techniques in the production field. Domestic production and supply of high-cost agricultural machines (round baler, wrapper, handler, and so on), plastic wrapping film, and silage additives are also important.
