• Title/Abstract/Keywords: machine learning classification

Search results: 1,458 items (processing time: 0.028 seconds)

Genetic Algorithm Application to Machine Learning

  • Han, Myung-mook;Lee, Yill-byung
    • 한국지능시스템학회논문지
    • /
    • Vol. 11, No. 7
    • /
    • pp.633-640
    • /
    • 2001
  • In this paper we examine the machine learning issues raised by the domain of Intrusion Detection Systems (IDS), which have difficulty successfully classifying intruders. These systems also require a significant amount of computational overhead, making it difficult to create robust real-time IDS. Machine learning techniques can reduce the human effort required to build these systems and can improve their performance. Genetic algorithms are used to improve the performance of search problems, while data mining has been used for data analysis. Data mining is the exploration and analysis of large quantities of data to discover meaningful patterns and rules. Among the data mining tasks, we concentrate on classification. Since classification is a basic element of the human way of thinking, it is a well-studied problem in a wide variety of applications. We propose a classifier system based on a genetic algorithm and evaluate it by applying it to the IDS problem as a data mining classification task. We report our experiments in using these methods on KDD audit data.
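To make the idea of a genetic-algorithm-based classifier concrete, here is a minimal sketch, not the classifier system proposed in the paper. It assumes a single numeric feature and evolves one decision threshold for the rule `value > threshold`. The toy samples, the function names `fitness` and `evolve`, and all parameter values are hypothetical.

```python
import random

# Hypothetical toy data: (feature value, label), 1 = intrusion, 0 = normal.
SAMPLES = [(0.1, 0), (0.2, 0), (0.35, 0), (0.4, 0),
           (0.6, 1), (0.7, 1), (0.8, 1), (0.95, 1)]

def fitness(threshold, samples=SAMPLES):
    """Fraction of samples classified correctly by the rule value > threshold."""
    hits = sum(1 for value, label in samples if int(value > threshold) == label)
    return hits / len(samples)

def evolve(generations=30, population_size=10, mutation_scale=0.1, seed=0):
    """Evolve a threshold with truncation selection, averaging crossover,
    and Gaussian mutation. Returns (best threshold, its fitness, history)."""
    rng = random.Random(seed)
    population = [rng.random() for _ in range(population_size)]
    history = []  # best fitness at the start of each generation
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        # Keep the better half unchanged, so the best individual is never lost.
        parents = population[: population_size // 2]
        children = []
        while len(parents) + len(children) < population_size:
            a, b = rng.sample(parents, 2)
            child = (a + b) / 2 + rng.gauss(0, mutation_scale)
            children.append(min(1.0, max(0.0, child)))  # clamp to [0, 1]
        population = parents + children
    best = max(population, key=fitness)
    return best, fitness(best), history
```

Because the better half of each generation survives unchanged, the best fitness can never decrease from one generation to the next. A realistic rule-based classifier would encode far richer rules than a single threshold.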


The Investigation of Employing Supervised Machine Learning Models to Predict Type 2 Diabetes Among Adults

  • Alhmiedat, Tareq;Alotaibi, Mohammed
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 16, No. 9
    • /
    • pp.2904-2926
    • /
    • 2022
  • Currently, diabetes is the most common chronic disease in the world, affecting 23.7% of the population in the Kingdom of Saudi Arabia. Diabetes may cause lower-limb amputations, kidney failure, and blindness among adults. Therefore, diagnosing the disease in its early stages is essential in order to save human lives. With the revolution in technology, Artificial Intelligence (AI) could play a central role in the early prediction of diabetes by employing Machine Learning (ML) technology. In this paper, we developed a diagnosis system using machine learning models for the detection of type 2 diabetes among adults. It adopts two different diabetes datasets, one for training and the other for testing, to analyze and enhance the prediction accuracy. This work offers enhanced classification accuracy as a result of employing several pre-processing methods before applying the ML models. According to the obtained results, the implemented Random Forest (RF) classifier offers the best classification accuracy, with a classification score of 98.95%.

Application of data mining and statistical measurement of agricultural high-quality development

  • Yan Zhou
    • Advances in nano research
    • /
    • Vol. 14, No. 3
    • /
    • pp.225-234
    • /
    • 2023
  • In this study, we aim to use big data resources and statistical analysis to obtain reliable guidance for achieving high-quality, high-yield agricultural production. To this end, soil type data, rainfall and temperature data, and annual wheat production are collected for a specific region. Using statistical methodology, the acquired data were cleaned to remove incomplete and defective records. Afterwards, using several machine learning classification methods, we tried to distinguish between different factors and their influence on the final crop yields. The efficacy of the machine learning methods is discussed by comparing the predicted and actual crop yields using two statistical quantities, the correlation factor and the mean squared error. The results of the analysis show high accuracy of machine learning methods in the prediction of crop yields. Moreover, the random forest (RF) classification approach provides the best results among the classification methods used in this study.

Classification of Network Traffic using Machine Learning for Software Defined Networks

  • Muhammad Shahzad Haroon;Husnain Mansoor
    • International Journal of Computer Science & Network Security
    • /
    • Vol. 23, No. 12
    • /
    • pp.91-100
    • /
    • 2023
  • As SDN devices and systems hit the market, security in SDN must be raised on the agenda. SDN has become an area of interest in both academia and industry. It promises many benefits, which attract IT managers and leading IT companies and motivate them to switch to SDN. Over the last three decades, network attacks have become more sophisticated and harder to detect. The goal is to study how traffic information can be extracted from an SDN controller and open virtual switches (OVS) using SDN mechanisms. The testbed environment is created using the RYU controller and Mininet. The extracted information is then used to detect these attacks efficiently using a machine learning approach. Such an approach requires a dataset, and no public SDN-based dataset is currently available. In this paper, an SDN-based dataset is created that includes legitimate and non-legitimate traffic. Classification is divided into two categories: binary and multiclass. Traffic is classified both with and without dimension reduction techniques such as PCA and LDA. Our approach achieves 98.58% accuracy using a random forest algorithm.

일반엑스선검사 교육용 시뮬레이터 개발을 위한 기계학습 분류모델 비교 (Comparison of Machine Learning Classification Models for the Development of Simulators for General X-ray Examination Education)

  • 이인자;박채연;이준호
    • 대한방사선기술학회지:방사선기술과학
    • /
    • Vol. 45, No. 2
    • /
    • pp.111-116
    • /
    • 2022
  • In this study, the applicability of machine learning to the development of a simulator for general X-ray examination education is evaluated. To this end, k-nearest neighbor (kNN), support vector machine (SVM), and neural network (NN) classification models are analyzed to identify the most suitable model. Image data were obtained by taking 100 photos for each of four positions: posterior anterior (PA), posterior anterior oblique (Obl), lateral (Lat), and fan lateral (Fan lat). Of the 400 acquired images, 70% were used as the training set for the machine learning models and 30% as the test set for evaluation, and a prediction model was constructed for classifying right-hand PA, Obl, Lat, and Fan lat images. After the classification models were constructed from this data set using kNN, SVM, and NN, the models were compared through an error (confusion) matrix. In the evaluation, kNN achieved an accuracy of 0.967 and an area under the curve (AUC) of 0.993, SVM an accuracy of 0.992 and an AUC of 1.000, and NN an accuracy of 0.992 and an AUC of 0.999. kNN was slightly lower, but all three models recorded high accuracy and AUC. SVM and NN had the same accuracy of 0.992 and similar AUC values of 1.000 and 0.999, indicating that both models have high predictive power and are applicable to educational simulators.

Shield TBM disc cutter replacement and wear rate prediction using machine learning techniques

  • Kim, Yunhee;Hong, Jiyeon;Shin, Jaewoo;Kim, Bumjoo
    • Geomechanics and Engineering
    • /
    • Vol. 29, No. 3
    • /
    • pp.249-258
    • /
    • 2022
  • A disc cutter is an excavation tool on a tunnel boring machine (TBM) cutterhead; it crushes and cuts rock mass while the machine excavates using the cutterhead's rotational movement. Disc cutter wear occurs naturally. Thus, along with the management of downtime and excavation efficiency, worn disc cutters need to be replaced at the proper time; otherwise, the construction period could be delayed and the cost could increase. The most common prediction models for TBM performance and for disc cutter lifetime have been proposed by the Colorado School of Mines and the Norwegian University of Science and Technology. However, the design parameters of existing models do not correspond well to field values when a TBM encounters complex and difficult ground conditions. Thus, this study proposes a series of machine learning models to predict the disc cutter lifetime of a shield TBM using the excavation (machine) data recorded during operation, which reflect the machine's response to the rock mass. This study utilizes five different machine learning techniques: four types of classification models (i.e., K-Nearest Neighbors (KNN), Support Vector Machine, Decision Tree, and Stacking Ensemble Model) and one artificial neural network (ANN) model. The KNN model was found to be the best among the four classification models, affording the highest recall of 81%. The ANN model also predicted the wear rate of disc cutters reasonably well.
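For reference, recall is the share of actual positives that a classifier finds, TP / (TP + FN). The short sketch below shows the computation with hypothetical counts that are not taken from the paper. Treating "needs replacement" as the positive class is an assumption made only for this illustration.

```python
def recall(true_positives, false_negatives):
    """Recall = TP / (TP + FN): the share of actual positives that were found."""
    actual_positives = true_positives + false_negatives
    if actual_positives == 0:
        raise ValueError("recall is undefined when there are no actual positives")
    return true_positives / actual_positives

# Hypothetical counts: 20 cutters truly needed replacement, 17 were flagged.
example = recall(true_positives=17, false_negatives=3)
```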

IMU 원신호 기반의 기계학습을 통한 충격전 낙상방향 분류 (Classification of Fall Direction Before Impact Using Machine Learning Based on IMU Raw Signals)

  • 이현빈;이창준;이정근
    • 센서학회지
    • /
    • Vol. 31, No. 2
    • /
    • pp.96-101
    • /
    • 2022
  • As the elderly population gradually increases, the risk of fatal fall accidents among the elderly is increasing. One way to cope with a fall accident is to determine the fall direction before impact using a wearable inertial measurement unit (IMU). In this context, a previous study proposed a method of classifying fall directions using a support vector machine with sensor velocity, acceleration, and tilt angle as input parameters. However, in this method, the IMU signals pass through several processing steps, including a Kalman filter and the integration of acceleration, which involve a large amount of computation and introduce error factors. Therefore, this paper proposes a machine learning-based method that classifies the fall direction before impact using IMU raw signals rather than processed data. In this study, we investigated the effects of the following two factors on the classification performance: (1) the use of processed or raw signals and (2) the selection of machine learning techniques. First, when the processed and raw signals were compared, the difference in sensitivity between the two was within 5%, indicating an equivalent level of classification performance. Second, when six machine learning techniques were compared, K-nearest neighbor and naive Bayes exhibited excellent performance, with sensitivities of 86.0% and 84.1%, respectively.

기계학습에 기초한 국내 학술지 논문의 자동분류에 관한 연구 (An Analytical Study on Automatic Classification of Domestic Journal articles Based on Machine Learning)

  • 김판준
    • 정보관리학회지
    • /
    • Vol. 35, No. 2
    • /
    • pp.37-62
    • /
    • 2018
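  • We examined the factors that affect the performance of machine learning-based automatic classification, using a document collection of domestic (Korean) journal articles in library and information science. In particular, through a range of experiments, we investigated the characteristics of the major factors, such as the term weighting scheme, training set size, classification algorithm, and category assignment method, in terms of their effect on the performance of automatically assigning subject categories to articles published in "정보관리학회지". The results show that it is effective to apply each factor appropriately according to the classification environment and the characteristics of the document collection, and that a fairly good level of performance can be achieved with simpler models. In addition, for domestic journal articles, multi-label classification, which assigns one or more categories to a given article, matches the real-world setting. Taking this setting into account, we therefore proposed an optimal classification model that uses a simple, fast classification algorithm and a small training set.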
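The multi-label setting, in which an article may receive one or more categories, can be illustrated with a simple threshold-based category assignment rule. This is a minimal sketch and not the model proposed in the paper. The category names, scores, threshold value, and the function name `assign_categories` are all hypothetical.

```python
def assign_categories(scores, threshold=0.5):
    """Return every category whose score meets the threshold.

    Falls back to the single best-scoring category so that each article
    always receives at least one label.
    """
    if not scores:
        raise ValueError("scores must contain at least one category")
    chosen = [name for name, score in scores.items() if score >= threshold]
    if not chosen:
        chosen = [max(scores, key=scores.get)]
    return sorted(chosen)

# Hypothetical per-category classifier scores for one article.
article_scores = {
    "information retrieval": 0.82,
    "bibliometrics": 0.61,
    "archives": 0.07,
}
labels = assign_categories(article_scores)
```

Here the article receives two labels because two scores clear the threshold. A single-label scheme would keep only the top-scoring category.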

EXTRACTING INSIGHTS OF CLASSIFICATION FOR TURING PATTERN WITH FEATURE ENGINEERING

  • OH, SEOYOUNG;LEE, SEUNGGYU
    • Journal of the Korean Society for Industrial and Applied Mathematics
    • /
    • Vol. 24, No. 3
    • /
    • pp.321-330
    • /
    • 2020
  • Data classification and clustering are among the most common applications of machine learning. In this paper, we aim to provide insight into the classification of Turing pattern images, which are highly nonlinear, through feature engineering with machine learning and without a multi-layered algorithm. For given image data X whose pixel values are defined in [-1, 1], X - X^3 and ∇X would be more meaningful features than X for representing the interface and bulk regions of complex pattern image data. Therefore, we use X - X^3 and ∇X in the neural network and clustering algorithm for classification. The results validate the feasibility of the proposed approach.
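The two engineered features can be sketched on a one-dimensional slice. Two assumptions are made here: the abstract's cubic term is read as X - X^3, and ∇X is approximated by finite differences. The paper works on images, so this 1D version, the sample profile, and the function names are simplifications for illustration only.

```python
def cubic_feature(values):
    """Element-wise X - X^3; zero where X is -1, 0, or 1."""
    return [x - x ** 3 for x in values]

def gradient_1d(values, spacing=1.0):
    """Finite-difference gradient: central inside, one-sided at the ends."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples")
    grad = []
    for i in range(n):
        if i == 0:
            grad.append((values[1] - values[0]) / spacing)
        elif i == n - 1:
            grad.append((values[-1] - values[-2]) / spacing)
        else:
            grad.append((values[i + 1] - values[i - 1]) / (2 * spacing))
    return grad

# Hypothetical 1D slice across one interface between two bulk regions.
profile = [-1.0, -1.0, -0.5, 0.0, 0.5, 1.0, 1.0]
cubic = cubic_feature(profile)
slope = gradient_1d(profile)
```

On this profile the cubic feature vanishes at the bulk values ±1 and the gradient is nonzero only around the transition. This is one way to see why such features might separate interface from bulk more clearly than the raw values.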

Scaling Up Face Masks Classification Using a Deep Neural Network and Classical Method Inspired Hybrid Technique

  • Kumar, Akhil;Kalia, Arvind;Verma, Kinshuk;Sharma, Akashdeep;Kaushal, Manisha;Kalia, Aayushi
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 16, No. 11
    • /
    • pp.3658-3679
    • /
    • 2022
  • Classification of persons wearing and not wearing face masks in images has emerged as a new computer vision problem during the COVID-19 pandemic. To address this problem and scale up research in this domain, this paper proposes a hybrid technique that employs ResNet-101 and a multi-layer perceptron (MLP) classifier. The proposed technique is tested and validated on a self-created face mask classification dataset and a standard dataset. On the self-created dataset, the proposed technique achieved a classification accuracy of 97.3%. To validate the proposed technique, six other state-of-the-art CNN feature extractors and six other classical machine learning classifiers have been tested and compared with it. The proposed technique achieved better classification accuracy and 1-6% higher precision, recall, and F1 score than the other tested deep feature extractors and machine learning classifiers.