http://dx.doi.org/10.3745/JIPS.02.0169

A Manually Captured and Modified Phone Screen Image Dataset for Widget Classification on CNNs  

Byun, SungChul (Dept. of Electrical Engineering, Korea University)
Han, Seong-Soo (Dept. of Division of Liberal Studies, Kangwon National University)
Jeong, Chang-Sung (Dept. of Electrical Engineering, Korea University)
Publication Information
Journal of Information Processing Systems / v.18, no.2, 2022, pp. 197-207
Abstract
The applications and user interfaces (UIs) of smart mobile devices are constantly diversifying. Deep learning can be an innovative solution for classifying the widgets in screen images to increase user convenience. To this end, the present research combines captured images with the ReDraw dataset to build deep learning datasets for image classification. First, validation experiments with ResNet50 and EfficientNet show that the dataset composed in this study is helpful for classifying widgets according to their functionality. Widget detection and classification are then implemented on RetinaNet and EfficientNet. Finally, the research proposes Widg-C and Widg-D, deep learning datasets for identifying the widgets of smart devices, and implements them for use with representative convolutional neural network (CNN) models.
Keywords
Captured Image; CNN; Deep Learning Dataset; Image Classification; Object Detection; Widget