http://dx.doi.org/10.36498/kbigdt.2020.5.2.53

Improving Efficiency of Food Hygiene Surveillance System by Using Machine Learning-Based Approaches  

Cho, Sanggoo (National Food Safety Information Service)
Cho, Seung Yong (National Food Safety Information Service)
Publication Information
The Journal of Bigdata / v.5, no.2, 2020, pp. 53-67
Abstract
This study employs a supervised learning prediction model to detect nonconformity in advance among processed food manufacturing and processing businesses. The study followed the standard machine learning procedure: definition of the objective function, data preprocessing and feature engineering, and model selection and evaluation. The dependent variable was set as the number of detections in supervised inspections over the five years from 2014 to 2018, and the objective function was to maximize the probability of detecting nonconforming companies. The data were preprocessed to reflect not only basic attributes such as revenue, operating duration, and number of employees, but also inspection track records and external climate data. After applying a feature extraction method, company risk, item risk, environmental risk, and past violation history were derived as the feature variables that affect the determination of nonconformity, and the machine learning algorithms were then applied to the data. The F1-score of the decision tree, described as one of the ensemble models, was much higher than those of the other models. Based on these results, official food control for food safety management is expected to be strengthened and to shift toward data- and evidence-based management and a scientific administrative system.
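The abstract compares models by F1-score but does not say how it was computed. The sketch below is illustrative only and is not the authors' code: it computes the F1-score for the nonconforming class in plain Python and contrasts it with accuracy. The labels, the two prediction lists, and the function names are hypothetical, and the assumption that nonconforming businesses are a minority of inspected businesses is ours, not a figure from the study.

```python
def f1_score(y_true, y_pred, positive=1):
    """Return the F1-score for the `positive` class (0.0 when there are no true positives)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)  # share of flagged businesses that were truly nonconforming
    recall = tp / (tp + fn)     # share of nonconforming businesses that were flagged
    return 2 * precision * recall / (precision + recall)


def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy data: 1 = nonconforming, 0 = conforming (2 of 10 nonconforming).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

# Baseline that never flags any business.
never_flag = [0] * 10

# Hypothetical model that flags both nonconforming businesses plus one false alarm.
model_b = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]

for name, y_pred in [("never_flag", never_flag), ("model_b", model_b)]:
    print(name, "accuracy:", accuracy(y_true, y_pred), "f1:", f1_score(y_true, y_pred))
```

On this toy data the never-flag baseline still looks reasonable on accuracy while scoring zero on F1, which is one reason F1 is a natural choice when the objective is detecting nonconforming companies.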
Keywords
Preprocessed data; Feature engineering; Scientific and evidence-based approach; Supervised machine learning