Design of Image Generation System for DCGAN-Based Kids' Book Text

  • Cho, Jaehyeon (Division of Computer Engineering, Hoseo University)
  • Moon, Nammee (Division of Computer and Information Engineering, Hoseo University)
  • Submitted : 2019.02.12
  • Reviewed : 2020.02.24
  • Published : 2020.12.31

Abstract

In recent years, smart devices have come to occupy an essential place in children's lives, giving them access to a variety of language activities and books. Various studies are being conducted on using smart devices for education. Our study uses smart devices to extract images and text from kids' books and matches the extracted images and text to create new images that do not appear in those books. The proposed system will enable the use of smart devices as educational media for children. A deep convolutional generative adversarial network (DCGAN) is used to generate the new images. Training and using the DCGAN involves three steps. First, the DCGAN is trained on 1,164 ImageNet images grouped under 11 titles. Second, Tesseract, an optical character recognition engine, extracts images and text from kids' books, and a morpheme analyzer classifies the text by word class. Third, each classified word class is matched with the latent vector of an image. The trained DCGAN then generates an image associated with the text.
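The third step, matching classified words to image latent vectors, can be illustrated with a minimal Python sketch. The abstract does not say how the matching is implemented, so everything here is an assumption made for illustration: the function names `latent_for_label` and `match_words_to_latents`, the latent size of 100, the `"NOUN"` tag, the sample labels, and the use of a label-seeded random vector as a stand-in for a learned latent vector.

```python
import random

LATENT_DIM = 100  # assumed size; the abstract does not state the latent dimension


def latent_for_label(label, dim=LATENT_DIM):
    """Return a reproducible pseudo-random latent vector for a label.

    Stand-in for a latent vector associated with a trained image class.
    """
    rng = random.Random(label)  # seeding with the label keeps the vector stable
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def match_words_to_latents(tagged_words, known_labels):
    """Keep nouns that name a known label and pair each with its latent vector."""
    matches = {}
    for word, word_class in tagged_words:
        key = word.lower()
        if word_class == "NOUN" and key in known_labels:
            matches[key] = latent_for_label(key)
    return matches


# Hypothetical output of the OCR and morpheme-analysis step: (word, word class).
tagged = [("The", "DET"), ("Dog", "NOUN"), ("runs", "VERB"), ("home", "NOUN")]
labels = {"dog", "cat", "bird"}  # placeholders for the 11 training titles

matched = match_words_to_latents(tagged, labels)
print(sorted(matched))  # words that would be passed on to the generator
```

In a full pipeline, each matched vector would be fed to the trained DCGAN generator to produce an image; that part is omitted here because it depends on the trained model.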
