http://dx.doi.org/10.3745/KTSDE.2021.10.9.375

Deep Learning Model Validation Method Based on Image Data Feature Coverage  

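Lim, Chang-Nam (Department of AI Convergence Network, Ajou University)
Park, Ye-Seul (Department of AI Convergence Network, Ajou University)
Lee, Jung-Won (Department of Electronic Engineering / Department of AI Convergence Network, Ajou University)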
Publication Information
KIPS Transactions on Software and Data Engineering, v.10, no.9, 2021, pp. 375-384
Abstract
Deep learning techniques have demonstrated high performance in image processing and are applied in a wide range of fields. The most widely used methods for validating a deep learning model are holdout validation, k-fold cross-validation, and the bootstrap. When dividing a data set, these legacy methods balance the ratio between classes, but they do not consider the ratio of the various features that exist within the same class. If these features are not considered, validation results may be biased toward some features. We therefore propose a deep learning model validation method for image classification, based on data feature coverage, that improves on the legacy methods. The method defines data feature coverage, a numerical measure of how well the data set used to train and validate the model, and the evaluation data set, reflect the features of the entire data set. With this measure, the data set can be divided so that coverage of all features of the entire data set is ensured, and the model's evaluation results can be analyzed per feature cluster. Attaching feature cluster information to the evaluation results of a trained model in turn shows which data features affect the model.
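The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the coverage formula, so the sketch assumes that coverage is the fraction of feature clusters of the entire data set that appear in a subset, and that each image already has a feature cluster ID (for example, from clustering extracted image features). The function names `feature_coverage`, `coverage_split`, and `accuracy_by_cluster` are hypothetical.

```python
import random
from collections import defaultdict


def feature_coverage(subset_clusters, all_clusters):
    """Assumed definition: share of the full data set's feature
    clusters that occur at least once in the subset."""
    universe = set(all_clusters)
    return len(set(subset_clusters) & universe) / len(universe)


def coverage_split(labels, clusters, eval_fraction=0.2, seed=0):
    """Split sample indices per (class, feature cluster) group so that
    every group with at least two samples appears in both parts."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for idx, key in enumerate(zip(labels, clusters)):
        groups[key].append(idx)

    train_idx, eval_idx = [], []
    for key in sorted(groups):
        members = groups[key]
        rng.shuffle(members)
        if len(members) < 2:
            # A singleton group cannot be in both parts; keep it for training.
            train_idx.extend(members)
            continue
        # At least one evaluation sample, and at least one left for training.
        n_eval = min(len(members) - 1,
                     max(1, round(len(members) * eval_fraction)))
        eval_idx.extend(members[:n_eval])
        train_idx.extend(members[n_eval:])
    return sorted(train_idx), sorted(eval_idx)


def accuracy_by_cluster(y_true, y_pred, clusters):
    """Report accuracy separately for each feature cluster."""
    totals, hits = defaultdict(int), defaultdict(int)
    for truth, pred, cluster in zip(y_true, y_pred, clusters):
        totals[cluster] += 1
        hits[cluster] += int(truth == pred)
    return {c: hits[c] / totals[c] for c in totals}


if __name__ == "__main__":
    # Toy data: two classes, each containing two feature clusters.
    labels = [0] * 10 + [1] * 10
    clusters = [0] * 5 + [1] * 5 + [2] * 8 + [3] * 2
    train, ev = coverage_split(labels, clusters)
    print(feature_coverage([clusters[i] for i in train], clusters))
    print(feature_coverage([clusters[i] for i in ev], clusters))
```

A split that is stratified by class alone could leave the two-sample cluster out of the evaluation set entirely. Grouping by class and feature cluster keeps it represented, and the per-cluster accuracy then shows which feature clusters the trained model handles poorly.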
Keywords
Deep Learning; Coverage Testing; Image Feature Extraction; Validation Method; Dataset Splitting Method