• Title/Summary/Keyword: Gaussian Naive Bayes classifier


Android Malware Detection Using Permission-Based Machine Learning Approach (머신러닝을 이용한 권한 기반 안드로이드 악성코드 탐지)

  • Kang, Seongeun; Long, Nguyen Vu; Jung, Souhwan
    • Journal of the Korea Institute of Information Security & Cryptology / v.28 no.3 / pp.617-623 / 2018
  • This study focuses on detecting malicious code using permission features extracted from AndroidManifest through Android static analysis. Building features from the AndroidManifest permissions saves analysis time and resources. The malicious app detection model consisted of SVM (Support Vector Machine), NB (Naive Bayes), GBC (Gradient Boosting Classifier), and Logistic Regression models, which were trained on 1,500 normal apps and 500 malicious apps and achieved a 98% detection rate. In addition, malicious app family identification was implemented with a multi-classifier model using SVM, GPC (Gaussian Process Classifier), and GBC. The trained family identification model identified 92% of malicious app families.
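As a rough illustration of the permission-based features described in this abstract, the sketch below turns the `uses-permission` entries of a decoded (plain-text) AndroidManifest into a binary feature vector. It is not the authors' pipeline: the function names, the sample manifest, and the three-permission vocabulary are hypothetical, and it assumes the manifest has already been decoded from the binary form stored inside an APK.

```python
import xml.etree.ElementTree as ET

# Namespace that qualifies the android:name attribute in a manifest.
ANDROID_NS = "{http://schemas.android.com/apk/res/android}"

def extract_permissions(manifest_xml):
    """Return the set of permission names requested in a decoded manifest."""
    root = ET.fromstring(manifest_xml)
    names = (el.get(ANDROID_NS + "name") for el in root.iter("uses-permission"))
    return {name for name in names if name}

def to_feature_vector(permissions, vocabulary):
    """Encode permissions as 0/1 flags, one per vocabulary entry."""
    return [1 if perm in permissions else 0 for perm in vocabulary]

# Hypothetical sample manifest and vocabulary, for illustration only.
SAMPLE_MANIFEST = """
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
          package="com.example.app">
  <uses-permission android:name="android.permission.INTERNET"/>
  <uses-permission android:name="android.permission.SEND_SMS"/>
</manifest>
"""
VOCABULARY = [
    "android.permission.INTERNET",
    "android.permission.READ_CONTACTS",
    "android.permission.SEND_SMS",
]

features = to_feature_vector(extract_permissions(SAMPLE_MANIFEST), VOCABULARY)
print(features)
```

Vectors of this kind, one per app, could then be passed to any of the classifiers the abstract lists.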

Study on Anomaly Detection Method of Improper Foods using Import Food Big data (수입식품 빅데이터를 이용한 부적합식품 탐지 시스템에 관한 연구)

  • Cho, Sanggoo; Choi, Gyunghyun
    • The Journal of Bigdata / v.3 no.2 / pp.19-33 / 2018
  • Owing to the growth of free trade agreements (FTAs) and food trade, and to the diversifying preferences of consumers, food imports have increased at a tremendous rate every year. While inspections cover about 20% of total food imports, the budget and manpower available for the government's import inspection control are reaching their limits. Sudden accidents involving imported food can cause enormous social and economic losses. A predictive system that forecasts the compliance of imported food, so that preemptive measures can be taken, would therefore greatly improve the efficiency and effectiveness of import safety control management. A huge amount of data has already accumulated from past imports, and processed foods account for 75% of total food imports. Big data analysis and analytical techniques are used to extract meaningful information from such large amounts of data. Unfortunately, few studies have analyzed imported food or examined what the big data of food imports implies. In this context, this study applied a variety of machine learning classification algorithms and proposed a data preprocessing method that generates new derived variables to improve model accuracy. The study also compared the performance of the predictive classification algorithms with general base classifiers. Among the base classifiers, the Gaussian Naïve Bayes prediction model showed the best performance in detecting and predicting nonconforming imported food. In the future, an anomaly detection model based on Gaussian Naïve Bayes is expected to be applied. The predictive model will reduce the burden of inspecting imported food and increase the non-conformity detection rate, which will have a great effect on the efficiency of food import safety control and the speed of import customs clearance.
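For readers unfamiliar with the Gaussian Naïve Bayes model named in this abstract, the following is a minimal pure-Python sketch of the general technique: each class gets a prior plus a per-feature mean and variance, and a sample is assigned to the class with the highest log prior plus summed Gaussian log densities. The class name `TinyGaussianNB`, the variance-smoothing constant, and the two-feature toy data are assumptions made for illustration; they do not come from the study.

```python
import math
from collections import defaultdict

class TinyGaussianNB:
    """Illustrative Gaussian Naive Bayes; not the model from the study."""

    def __init__(self, var_smoothing=1e-9):
        # Small constant added to variances to avoid division by zero.
        self.var_smoothing = var_smoothing
        self.stats = {}

    def fit(self, X, y):
        rows_by_class = defaultdict(list)
        for row, label in zip(X, y):
            rows_by_class[label].append(row)
        for label, rows in rows_by_class.items():
            columns = list(zip(*rows))
            means = [sum(col) / len(col) for col in columns]
            variances = [
                sum((v - m) ** 2 for v in col) / len(col) + self.var_smoothing
                for col, m in zip(columns, means)
            ]
            log_prior = math.log(len(rows) / len(X))
            self.stats[label] = (log_prior, means, variances)
        return self

    def joint_log_likelihood(self, x):
        """Log prior plus sum of per-feature Gaussian log densities."""
        scores = {}
        for label, (log_prior, means, variances) in self.stats.items():
            log_density = sum(
                -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
                for v, m, var in zip(x, means, variances)
            )
            scores[label] = log_prior + log_density
        return scores

    def predict(self, x):
        scores = self.joint_log_likelihood(x)
        return max(scores, key=scores.get)

# Hypothetical toy data: two numeric features per import record.
X = [[1.0, 2.1], [1.2, 1.9], [0.9, 2.0], [3.0, 4.1], [3.2, 3.9], [2.9, 4.0]]
y = ["conforming"] * 3 + ["nonconforming"] * 3

model = TinyGaussianNB().fit(X, y)
print(model.predict([1.1, 2.0]), model.predict([3.1, 4.0]))
```

The "naive" part is the sum over features, which treats them as conditionally independent given the class.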

Identification of Pb-Zn ore under the condition of low count rate detection of slim hole based on PGNAA technology

  • Haolong Huang; Pingkun Cai; Wenbao Jia; Yan Zhang
    • Nuclear Engineering and Technology / v.55 no.5 / pp.1708-1717 / 2023
  • The grade analysis of lead-zinc ore is the basis for the optimal development and utilization of deposits. In this study, a method combining Prompt Gamma Neutron Activation Analysis (PGNAA) technology and machine learning is proposed for lead-zinc mine borehole logging, which can identify lead-zinc ores of different grades and gangue in the formation, providing real-time grade information qualitatively and semi-quantitatively. Firstly, Monte Carlo simulation is used to obtain a gamma-ray spectrum data set for training and testing machine learning classification algorithms. These spectra are broadened, normalized and separated into inelastic scattering and capture spectra, and then used to fit different classifier models. When the comprehensive grade boundary of high- and low-grade ores is set to 5%, the evaluation metrics calculated by the 5-fold cross-validation show that the SVM (Support Vector Machine), KNN (K-Nearest Neighbor), GNB (Gaussian Naive Bayes) and RF (Random Forest) models can effectively distinguish lead-zinc ore from gangue. At the same time, the GNB model has achieved the optimal accuracy of 91.45% when identifying high- and low-grade ores, and the F1 score for both types of ores is greater than 0.9.
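The evaluation protocol mentioned in this abstract (5-fold cross-validation reporting accuracy and a per-class F1 score) can be sketched in plain Python as below. This is only an illustration of the general procedure: the helper names, the single-feature toy data, and the nearest-class-mean stand-in classifier are all hypothetical and unrelated to the spectra or models used in the paper.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train, test) index lists for k contiguous folds.

    Assumes the samples were shuffled beforehand.
    """
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(start)) + list(range(start + size, n_samples))
        yield train, test
        start += size

def f1_score(y_true, y_pred, positive):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def cross_validate(X, y, fit_predict, k=5):
    """Pool held-out predictions over k folds; return accuracy and per-class F1."""
    y_true, y_pred = [], []
    for train, test in k_fold_indices(len(X), k):
        preds = fit_predict([X[i] for i in train], [y[i] for i in train],
                            [X[i] for i in test])
        y_true += [y[i] for i in test]
        y_pred += preds
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return accuracy, {c: f1_score(y_true, y_pred, c) for c in sorted(set(y))}

def nearest_mean_fit_predict(X_train, y_train, X_test):
    """Stand-in classifier (hypothetical): nearest class mean on one feature."""
    totals = {}
    for (value,), label in zip(X_train, y_train):
        total, count = totals.get(label, (0.0, 0))
        totals[label] = (total + value, count + 1)
    means = {label: total / count for label, (total, count) in totals.items()}
    return [min(means, key=lambda c: abs(value - means[c])) for (value,) in X_test]

# Hypothetical toy data: one feature, alternating grade labels.
X = [[0.1], [0.9], [0.2], [0.8], [0.15], [0.95], [0.05], [0.85], [0.25], [0.75]]
y = ["low", "high"] * 5

accuracy, f1_by_class = cross_validate(X, y, nearest_mean_fit_predict, k=5)
print(accuracy, f1_by_class)
```

Any classifier with the same `fit_predict` shape, including a Gaussian Naive Bayes model, could be passed in place of the stand-in.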