Object Edge-based Image Generation Technique for Constructing Large-scale Image Datasets

  • Ju-Hyeok Lee (School of Computer Engineering & Applied Mathematics, Computer System Institute, Hankyong National University) ;
  • Mi-Hui Kim (School of Computer Engineering & Applied Mathematics, Computer System Institute, Hankyong National University)
  • Received : 2023.08.05
  • Reviewed : 2023.09.25
  • Published : 2023.09.30

Abstract

Advances in deep learning can solve many computer vision problems, but achieving high accuracy requires large-scale datasets. In this paper, we propose an image generation technique that uses object bounding boxes and image edge components. Object bounding boxes are extracted from each image through object detection and, together with the image's edge components, are used as input to an image generation model to create new image data. In experiments, images generated by the proposed method showed quality similar to that of the source images in image quality assessment, and performed well when used for deep learning training.
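The input described in the abstract, a bounding box combined with edge components, might be assembled roughly as in the following sketch. It is an illustration under stated assumptions, not the authors' implementation. The abstract does not name the edge operator or the quality metric; a Sobel operator and PSNR are assumed here. The function names (`sobel_edges`, `box_mask`, `build_condition`, `psnr`), the `(x0, y0, x1, y1)` box format, and the two-channel layout are hypothetical, and the bounding box is taken as given rather than produced by a detector.

```python
import numpy as np


def sobel_edges(gray):
    """Sobel gradient magnitude of a 2-D array, scaled to [0, 1] (assumed edge operator)."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
    ky = kx.T
    h, w = gray.shape
    padded = np.pad(gray.astype(np.float64), 1, mode="edge")
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    # Accumulate the 3x3 weighted sums with shifted views (cross-correlation;
    # the sign flip relative to convolution does not change the magnitude).
    for i in range(3):
        for j in range(3):
            window = padded[i:i + h, j:j + w]
            gx += kx[i, j] * window
            gy += ky[i, j] * window
    mag = np.hypot(gx, gy)
    peak = mag.max()
    return mag / peak if peak > 0 else mag


def box_mask(shape, box):
    """Binary mask that is 1 inside the box (x0, y0, x1, y1), end-exclusive."""
    x0, y0, x1, y1 = box
    mask = np.zeros(shape, dtype=np.float64)
    mask[y0:y1, x0:x1] = 1.0
    return mask


def build_condition(gray, box):
    """Stack box-restricted edges and the box mask as a (2, H, W) array.

    This channel layout is an assumption made for illustration only.
    """
    mask = box_mask(gray.shape, box)
    edges = sobel_edges(gray) * mask
    return np.stack([edges, mask], axis=0)


def psnr(reference, generated, peak=255.0):
    """Peak signal-to-noise ratio in dB (assumed quality metric)."""
    diff = reference.astype(np.float64) - generated.astype(np.float64)
    mse = np.mean(diff ** 2)
    return float("inf") if mse == 0 else float(10.0 * np.log10(peak ** 2 / mse))


if __name__ == "__main__":
    image = np.zeros((8, 8))
    image[2:6, 2:6] = 1.0          # a synthetic square "object"
    cond = build_condition(image, (1, 1, 7, 7))
    print(cond.shape)              # two channels: masked edges and box mask
```

A generated image could then be compared against its source with `psnr(source, generated)`; higher values indicate closer agreement with the source.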
