• Title/Summary/Keyword: Random Forest Algorithm

Search Result 222, Processing Time 0.033 seconds

Machine Learning Model for Predicting the Residual Useful Lifetime of the CNC Milling Insert (공작기계의 절삭용 인서트의 잔여 유효 수명 예측 모형)

  • Won-Gun Choi;Heungseob Kim;Bong Jin Ko
    • Journal of Advanced Navigation Technology
    • /
    • v.27 no.1
    • /
    • pp.111-118
    • /
    • 2023
  • For the implementation of a smart factory, it is necessary to collect data by connecting various sensors and devices in the manufacturing environment and to diagnose or predict failures in production facilities through data analysis. In this paper, to predict the residual useful lifetime of milling insert used for machining products in CNC machine, weight k-NN algorithm, Decision Tree, SVR, XGBoost, Random forest, 1D-CNN, and frequency spectrum based on vibration signal are investigated. As the results of the paper, the frequency spectrum does not provide a reliable criterion for an accurate prediction of the residual useful lifetime of an insert. And the weighted k-nearest neighbor algorithm performed best with an MAE of 0.0013, MSE of 0.004, and RMSE of 0.0192. This is an error of 0.001 seconds of the remaining useful lifetime of the insert predicted by the weighted-nearest neighbor algorithm, and it is considered to be a level that can be applied to actual industrial sites.

Performance Comparison of Machine Learning Algorithms for Network Traffic Security in Medical Equipment (의료기기 네트워크 트래픽 보안 관련 머신러닝 알고리즘 성능 비교)

  • Seung Hyoung Ko;Joon Ho Park;Da Woon Wang;Eun Seok Kang;Hyun Wook Han
    • Journal of Information Technology Services
    • /
    • v.22 no.5
    • /
    • pp.99-108
    • /
    • 2023
  • As the computerization of hospitals becomes more advanced, security issues regarding data generated from various medical devices within hospitals are gradually increasing. For example, because hospital data contains a variety of personal information, attempts to attack it have been continuously made. In order to safely protect data from external attacks, each hospital has formed an internal team to continuously monitor whether the computer network is safely protected. However, there are limits to how humans can monitor attacks that occur on networks within hospitals in real time. Recently, artificial intelligence models have shown excellent performance in detecting outliers. In this paper, an experiment was conducted to verify how well an artificial intelligence model classifies normal and abnormal data in network traffic data generated from medical devices. There are several models used for outlier detection, but among them, Random Forest and Tabnet were used. Tabnet is a deep learning algorithm related to receive and classify structured data. Two algorithms were trained using open traffic network data, and the classification accuracy of the model was measured using test data. As a result, the random forest algorithm showed a classification accuracy of 93%, and Tapnet showed a classification accuracy of 99%. Therefore, it is expected that most outliers that may occur in a hospital network can be detected using an excellent algorithm such as Tabnet.

Accuracy Evaluation of Supervised Classification by Using Morphological Attribute Profiles and Additional Band of Hyperspectral Imagery (초분광 영상의 Morphological Attribute Profiles와 추가 밴드를 이용한 감독분류의 정확도 평가)

  • Park, Hong Lyun;Choi, Jae Wan
    • Journal of Korean Society for Geospatial Information Science
    • /
    • v.25 no.1
    • /
    • pp.9-17
    • /
    • 2017
  • Hyperspectral imagery is used in the land cover classification with the principle component analysis and minimum noise fraction to reduce the data dimensionality and noise. Recently, studies on the supervised classification using various features having spectral information and spatial characteristic have been carried out. In this study, principle component bands and normalized difference vegetation index(NDVI) was utilized in the supervised classification for the land cover classification. To utilize additional information not included in the principle component bands by the hyperspectral imagery, we tried to increase the classification accuracy by using the NDVI. In addition, the extended attribute profiles(EAP) generated using the morphological filter was used as the input data. The random forest algorithm, which is one of the representative supervised classification, was used. The classification accuracy according to the application of various features based on EAP was compared. Two areas was selected in the experiments, and the quantitative evaluation was performed by using reference data. The classification accuracy of the proposed algorithm showed the highest classification accuracy of 85.72% and 91.14% compared with existing algorithms. Further research will need to develop a supervised classification algorithm and additional input datasets to improve the accuracy of land cover classification using hyperspectral imagery.

Bike Insurance Fraud Detection Model Using Balanced Randomforest Algorithm (균형 랜덤 포레스트를 이용한 이륜차 보험사기 적발 모형 개발)

  • Kim, Seunghoon;Lee, Soo Il;Kim, Tae ho
    • Journal of Digital Convergence
    • /
    • v.20 no.2
    • /
    • pp.241-250
    • /
    • 2022
  • Due to the COVID-19 pandemic, with increased 'untact' services and with unstable household economy, the bike insurance fraud is expected to surge. Moreover, the fraud methodology gets complicated. However, the fraud detection model for bike insurance is absent. we deal with the issue of skewed class distribution and reflect the criterion of fraud detection expert. We utilize a balanced random-forest algorithm to develop an efficient bike insurance fraud detection model. As a result, while the predictive performance of balanced random-forest model is superior than it of non-balanced model. There is no significant difference between the variables used by the experts and the confirmatory models. The important variables to detect frauds are turned out to be age and gender of driver, correspondence between insured and driver, the amount of self-repairing claim, and the amount of bodily injury liability.

Development of Machine Learning Based Precipitation Imputation Method (머신러닝 기반의 강우추정 방법 개발)

  • Heechan Han;Changju Kim;Donghyun Kim
    • Journal of Wetlands Research
    • /
    • v.25 no.3
    • /
    • pp.167-175
    • /
    • 2023
  • Precipitation data is one of the essential input datasets used in various fields such as wetland management, hydrological simulation, and water resource management. In order to efficiently manage water resources using precipitation data, it is essential to secure as much data as possible by minimizing the missing rate of data. In addition, more efficient hydrological simulation is possible if precipitation data for ungauged areas are secured. However, missing precipitation data have been estimated mainly by statistical equations. The purpose of this study is to propose a new method to restore missing precipitation data using machine learning algorithms that can predict new data based on correlations between data. Moreover, compared to existing statistical methods, the applicability of machine learning techniques for restoring missing precipitation data is evaluated. Representative machine learning algorithms, Artificial Neural Network (ANN) and Random Forest (RF), were applied. For the performance of classifying the occurrence of precipitation, the RF algorithm has higher accuracy in classifying the occurrence of precipitation than the ANN algorithm. The F1-score and Accuracy values, which are evaluation indicators of the classification model, were calculated as 0.80 and 0.77, while the ANN was calculated as 0.76 and 0.71. In addition, the performance of estimating precipitation also showed higher accuracy in RF than in ANN algorithm. The RMSE of the RF and ANN algorithms was 2.8 mm/day and 2.9 mm/day, and the values were calculated as 0.68 and 0.73.

The Development of Major Tree Species Classification Model using Different Satellite Images and Machine Learning in Gwangneung Area (이종센서 위성영상과 머신 러닝을 활용한 광릉지역 주요 수종 분류 모델 개발)

  • Lim, Joongbin;Kim, Kyoung-Min;Kim, Myung-Kil
    • Korean Journal of Remote Sensing
    • /
    • v.35 no.6_2
    • /
    • pp.1037-1052
    • /
    • 2019
  • We had developed in preceding study a classification model for the Korean pine and Larch with an accuracy of 98 percent using Hyperion and Sentinel-2 satellite images, texture information, and geometric information as the first step for tree species mapping in the inaccessible North Korea. Considering a share of major tree species in North Korea, the classification model needs to be expanded as it has a large share of Oak(29.5%), Pine (12.7%), Fir (8.2%), and as well as Larch (17.5%) and Korean pine (5.8%). In order to classify 5 major tree species, national forest type map of South Korea was used to build 11,039 training and 2,330 validation data. Sentinel-2 data was used to derive spectral information, and PlanetScope data was used to generate texture information. Geometric information was built from SRTM DEM data. As a machine learning algorithm, Random forest was used. As a result, the overall accuracy of classification was 80% with 0.80 kappa statistics. Based on the training data and the classification model constructed through this study, we will extend the application to Mt. Baekdu and North and South Goseong areas to confirm the applicability of tree species classification on the Korean Peninsula.

POSE-VIWEPOINT ADAPTIVE OBJECT TRACKING VIA ONLINE LEARNING APPROACH

  • Mariappan, Vinayagam;Kim, Hyung-O;Lee, Minwoo;Cho, Juphil;Cha, Jaesang
    • International journal of advanced smart convergence
    • /
    • v.4 no.2
    • /
    • pp.20-28
    • /
    • 2015
  • In this paper, we propose an effective tracking algorithm with an appearance model based on features extracted from a video frame with posture variation and camera view point adaptation by employing the non-adaptive random projections that preserve the structure of the image feature space of objects. The existing online tracking algorithms update models with features from recent video frames and the numerous issues remain to be addressed despite on the improvement in tracking. The data-dependent adaptive appearance models often encounter the drift problems because the online algorithms does not get the required amount of data for online learning. So, we propose an effective tracking algorithm with an appearance model based on features extracted from a video frame.

Use of a Machine Learning Algorithm to Predict Individuals with Suicide Ideation in the General Population

  • Ryu, Seunghyong;Lee, Hyeongrae;Lee, Dong-Kyun;Park, Kyeongwoo
    • Psychiatry investigation
    • /
    • v.15 no.11
    • /
    • pp.1030-1036
    • /
    • 2018
  • Objective In this study, we aimed to develop a model predicting individuals with suicide ideation within a general population using a machine learning algorithm. Methods Among 35,116 individuals aged over 19 years from the Korea National Health & Nutrition Examination Survey, we selected 11,628 individuals via random down-sampling. This included 5,814 suicide ideators and the same number of non-suicide ideators. We randomly assigned the subjects to a training set (n=10,466) and a test set (n=1,162). In the training set, a random forest model was trained with 15 features selected with recursive feature elimination via 10-fold cross validation. Subsequently, the fitted model was used to predict suicide ideators in the test set and among the total of 35,116 subjects. All analyses were conducted in R. Results The prediction model achieved a good performance [area under receiver operating characteristic curve (AUC)=0.85] in the test set and predicted suicide ideators among the total samples with an accuracy of 0.821, sensitivity of 0.836, and specificity of 0.807. Conclusion This study shows the possibility that a machine learning approach can enable screening for suicide risk in the general population. Further work is warranted to increase the accuracy of prediction.

Double-Bagging Ensemble Using WAVE

  • Kim, Ahhyoun;Kim, Minji;Kim, Hyunjoong
    • Communications for Statistical Applications and Methods
    • /
    • v.21 no.5
    • /
    • pp.411-422
    • /
    • 2014
  • A classification ensemble method aggregates different classifiers obtained from training data to classify new data points. Voting algorithms are typical tools to summarize the outputs of each classifier in an ensemble. WAVE, proposed by Kim et al. (2011), is a new weight-adjusted voting algorithm for ensembles of classifiers with an optimal weight vector. In this study, when constructing an ensemble, we applied the WAVE algorithm on the double-bagging method (Hothorn and Lausen, 2003) to observe if any significant improvement can be achieved on performance. The results showed that double-bagging using WAVE algorithm performs better than other ensemble methods that employ plurality voting. In addition, double-bagging with WAVE algorithm is comparable with the random forest ensemble method when the ensemble size is large.

Facebook Spam Post Filtering based on Instagram-based Transfer Learning and Meta Information of Posts (인스타그램 기반의 전이학습과 게시글 메타 정보를 활용한 페이스북 스팸 게시글 판별)

  • Kim, Junhong;Seo, Deokseong;Kim, Haedong;Kang, Pilsung
    • Journal of Korean Institute of Industrial Engineers
    • /
    • v.43 no.3
    • /
    • pp.192-202
    • /
    • 2017
  • This study develops a text spam filtering system for Facebook based on two variable categories: keywords learned from Instagram and meta-information of Facebook posts. Since there is no explicit labels for spam/ham posts, we utilize hash tags in Instagram to train classification models. In addition, the filtering accuracy is enhanced by considering meta-information of Facebook posts. To verify the proposed filtering system, we conduct an empirical experiment based on a total of 1,795,067 and 761,861 Facebook and Instagram documents, respectively. Employing random forest as a base classification algorithm, experimental result shows that the proposed filtering system yield 99% and 98% in terms of filtering accuracy and F1-measure, respectively. We expect that the proposed filtering scheme can be applied other web services suffering from massive spam posts but no explicit spam labels are available.