The Effect of Training Patch Size and ConvNeXt Application on the Accuracy of CycleGAN-based Satellite Image Simulation

  • Won, Taeyeon (Dept. of Advanced Technology Fusion, Konkuk University, Realtimevisual Inc.) ;
  • Jo, Su Min (Dept. of Technology Fusion Engineering, Konkuk University) ;
  • Eo, Yang Dam (Dept. of Civil and Environmental Engineering, Konkuk University)
  • Received : 2022.06.04
  • Accepted : 2022.06.25
  • Published : 2022.06.30

Abstract

This study proposes a deep learning method for restoring occluded areas in high-resolution optical satellite images by referring to images acquired with the same type of sensor. To preserve the pixel distribution of the original image as much as possible in the patch-divided imagery, and to ensure natural continuity between the simulated occlusion region and its surroundings, a CycleGAN (Cycle Generative Adversarial Network) with ConvNeXt blocks was used, and the results were analyzed for three experimental regions. In addition, results from a training patch size of 512×512 pixels were compared with those from a doubled size of 1024×1024 pixels. Across the three regions with different characteristics, the ConvNeXt CycleGAN achieved improved R² values over both the conventional CycleGAN result and the histogram-matched image. In the patch-size experiment, the 1024×1024-pixel patches yielded an R² of about 0.98, and a band-by-band comparison of pixel distributions showed that the simulation trained with the larger patch size produced histograms closer to those of the original image. Therefore, the ConvNeXt CycleGAN, an improvement over the conventional CycleGAN and histogram matching, can produce simulation results similar to the original image and perform the simulation successfully.
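To make the two building blocks named above concrete, the sketch below shows a ConvNeXt-style residual block as it might be inserted into a CycleGAN generator, together with the R² (coefficient of determination) used to compare simulated and original pixel values. This is a minimal PyTorch illustration, not the authors' released code: the class and function names (ConvNeXtBlock, r2_score), the feature dimensions, and the placement inside the generator are all assumptions.

```python
import torch
import torch.nn as nn

class ConvNeXtBlock(nn.Module):
    """ConvNeXt-style block: 7x7 depthwise conv -> LayerNorm -> inverted
    bottleneck MLP (1x1 expand, GELU, 1x1 project) with a residual
    connection. Layer scale and stochastic depth are omitted for brevity."""
    def __init__(self, dim: int, expansion: int = 4):
        super().__init__()
        self.dwconv = nn.Conv2d(dim, dim, kernel_size=7, padding=3, groups=dim)
        self.norm = nn.LayerNorm(dim)          # normalizes the channel axis
        self.pwconv1 = nn.Linear(dim, expansion * dim)
        self.act = nn.GELU()
        self.pwconv2 = nn.Linear(expansion * dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        residual = x
        x = self.dwconv(x)                     # (N, C, H, W)
        x = x.permute(0, 2, 3, 1)              # channels-last for LayerNorm/Linear
        x = self.pwconv2(self.act(self.pwconv1(self.norm(x))))
        x = x.permute(0, 3, 1, 2)              # back to channels-first
        return residual + x

def r2_score(original: torch.Tensor, simulated: torch.Tensor) -> float:
    """Coefficient of determination between original and simulated pixels."""
    ss_res = torch.sum((original - simulated) ** 2)
    ss_tot = torch.sum((original - original.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Example: one block over feature maps derived from a training patch.
features = torch.randn(1, 64, 256, 256)        # hypothetical patch features
out = ConvNeXtBlock(dim=64)(features)          # output has the same shape
```

The block-level structure follows the public ConvNeXt design; where and how many such blocks the authors placed inside the CycleGAN generator is not specified in the abstract.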

Funding

This research was supported by the National Research Foundation of Korea (NRF) grant funded by the Korean government (Ministry of Science and ICT) (No. 2019R1A2C1085618).
