An Efficient One Class Classifier Using Gaussian-based Hyper-Rectangle Generation

  • Kim, Do Gyun (Department of Industrial Engineering, Ajou University) ;
  • Choi, Jin Young (Department of Industrial Engineering, Ajou University) ;
  • Ko, Jeonghan (Department of Industrial Engineering, Ajou University)
  • Received : 2018.02.28
  • Accepted : 2018.06.07
  • Published : 2018.06.30

Abstract

In recent years, imbalanced data has become one of the most important and frequent issues in quality control in industry. For example, defect rates have been drastically reduced thanks to highly developed technology and quality management, so only a few defective samples can be obtained from a production process. Quality classification must therefore be performed under the condition that one class (the defective dataset) is much smaller than the other (the good dataset). Traditional multi-class classification methods are not well suited to such imbalanced datasets, since they classify data based on differences between one class and the others, and these differences can hardly be found when one class is scarce. One-class classification, which thoroughly learns the patterns of a target class, is more suitable for imbalanced datasets because it focuses only on data in the target class. So far, several one-class classification methods have been suggested, such as the one-class support vector machine, neural networks, and decision trees. One-class support vector machines and neural networks can achieve a good classification rate, and decision trees can provide a set of rules that can be clearly interpreted. However, the classifiers obtained from the former two methods consist of complex mathematical functions and cannot be easily understood by users, while for decision trees the criterion for rule generation is ambiguous. As an alternative, a one-class classifier using hyper-rectangles was proposed, which classifies more precisely than the other methods and also generates rules that users can clearly understand. In this paper, we suggest an approach that addresses the limitations of these previous one-class classification algorithms. Specifically, the suggested approach produces an improved one-class classifier using hyper-rectangles generated with a Gaussian function. The performance of the suggested algorithm is verified by a numerical experiment using several datasets from the UCI machine learning repository.
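
The abstract does not spell out how the hyper-rectangles are generated, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes each feature of the target class is roughly Gaussian and builds a single hyper-rectangle whose sides are the intervals mean ± k·(standard deviation). The function names, the width parameter `k`, and the toy data are all hypothetical.

```python
from statistics import mean, stdev


def fit_hyper_rectangle(samples, k=2.0):
    """Return per-feature (low, high) bounds from target-class samples.

    Assumption (not from the paper): each side of the rectangle is
    mean +/- k * sample standard deviation of that feature.
    """
    bounds = []
    for column in zip(*samples):  # iterate over features
        mu = mean(column)
        sigma = stdev(column)
        bounds.append((mu - k * sigma, mu + k * sigma))
    return bounds


def is_target(point, bounds):
    """Accept a point only if every feature lies inside its interval."""
    return all(low <= x <= high for x, (low, high) in zip(point, bounds))


# Toy "good product" measurements; the values are made up for illustration.
good = [(10.0, 5.0), (10.2, 5.1), (9.8, 4.9), (10.1, 5.0), (9.9, 5.0)]
bounds = fit_hyper_rectangle(good, k=2.0)

print(is_target((10.0, 5.0), bounds))  # a typical good sample
print(is_target((12.0, 5.0), bounds))  # first feature far outside its interval
```

Because the classifier is just a list of intervals, it can be read directly as a rule of the form "low_1 <= x_1 <= high_1 and low_2 <= x_2 <= high_2", which is the kind of interpretability the abstract attributes to hyper-rectangle classifiers. Only target-class data is needed for fitting, so the scarcity of defective samples does not affect training.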
