The Effect of Training Patch Size and ConvNeXt Application on the Accuracy of CycleGAN-based Satellite Image Simulation

  • Won, Taeyeon (Dept. of Advanced Technology Fusion, Konkuk University, Realtimevisual Inc.) ;
  • Jo, Su Min (Dept. of Technology Fusion Engineering, Konkuk University) ;
  • Eo, Yang Dam (Dept. of Civil and Environmental Engineering, Konkuk University)
  • Received : 2022.06.04
  • Accepted : 2022.06.25
  • Published : 2022.06.30

Abstract

This study proposes a deep learning method for restoring occluded areas in high-resolution optical satellite images by referring to images acquired with the same type of sensor. To preserve the pixel distribution of the original image as much as possible in the patch-divided imagery, and to ensure natural continuity between the simulated occlusion region and its surroundings, a CycleGAN (Cycle Generative Adversarial Network) with ConvNeXt blocks was used, and the results were analyzed for three experimental regions. In addition, results from a training patch size of 512×512 pixels were compared with those from a doubled size of 1024×1024 pixels. Across the three regions with different characteristics, the ConvNeXt CycleGAN achieved improved R² values over both the conventional CycleGAN result and the histogram-matched image. In the patch-size experiment, the 1024×1024-pixel patches yielded an R² of about 0.98, and a band-by-band comparison of pixel distributions showed that the simulation trained with the larger patch size produced histograms closer to those of the original image. Therefore, the ConvNeXt CycleGAN, an improvement over the conventional CycleGAN and histogram matching, can produce simulation results similar to the original image and perform the simulation successfully.
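To make the two building blocks named above concrete, the sketch below shows a ConvNeXt-style residual block as it might be inserted into a CycleGAN generator, together with the R² (coefficient of determination) used to compare simulated and original pixel values. This is a minimal PyTorch illustration, not the authors' released code: the class and function names (ConvNeXtBlock, r2_score), the feature dimensions, and the placement inside the generator are all assumptions.

```python
import torch
import torch.nn as nn

class ConvNeXtBlock(nn.Module):
    """ConvNeXt-style block: 7x7 depthwise conv -> LayerNorm -> inverted
    bottleneck MLP (1x1 expand, GELU, 1x1 project) with a residual
    connection. Layer scale and stochastic depth are omitted for brevity."""
    def __init__(self, dim: int, expansion: int = 4):
        super().__init__()
        self.dwconv = nn.Conv2d(dim, dim, kernel_size=7, padding=3, groups=dim)
        self.norm = nn.LayerNorm(dim)          # normalizes the channel axis
        self.pwconv1 = nn.Linear(dim, expansion * dim)
        self.act = nn.GELU()
        self.pwconv2 = nn.Linear(expansion * dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        residual = x
        x = self.dwconv(x)                     # (N, C, H, W)
        x = x.permute(0, 2, 3, 1)              # channels-last for LayerNorm/Linear
        x = self.pwconv2(self.act(self.pwconv1(self.norm(x))))
        x = x.permute(0, 3, 1, 2)              # back to channels-first
        return residual + x

def r2_score(original: torch.Tensor, simulated: torch.Tensor) -> float:
    """Coefficient of determination between original and simulated pixels."""
    ss_res = torch.sum((original - simulated) ** 2)
    ss_tot = torch.sum((original - original.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Example: one block over feature maps derived from a training patch.
features = torch.randn(1, 64, 256, 256)        # hypothetical patch features
out = ConvNeXtBlock(dim=64)(features)          # output has the same shape
```

The block-level structure follows the public ConvNeXt design; where and how many such blocks the authors placed inside the CycleGAN generator is not specified in the abstract.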

Funding

This research was supported by the National Research Foundation of Korea (NRF) grant funded by the Korean government (Ministry of Science and ICT) (No. 2019R1A2C1085618).
