Parallel Processing of k-Means Clustering Algorithm for Unsupervised Classification of Large Satellite Images: A Hybrid Method Using Multicores and a PC-Cluster

  • Received : 2019.10.31
  • Accepted : 2019.11.28
  • Published : 2019.12.31

Abstract

In this study, parallel processing codes for the k-means clustering algorithm were developed and run on a PC-cluster for unsupervised classification of large satellite images. We implemented an intra-node code that uses the multiple cores of a CPU (Central Processing Unit) through OpenMP (Open Multi-Processing), an inter-node code that uses the PC-cluster through the Message Passing Interface (MPI), and a hybrid code that combines both. The PC-cluster consists of one master node and eight slave nodes, each equipped with eight cores. Two operating systems, Microsoft Windows and Canonical Ubuntu, were installed on the PC-cluster in turn to compare parallel processing performance. Two multispectral satellite images were tested: a medium-sized LANDSAT 8 OLI (Operational Land Imager) image and a large Sentinel 2A image. Parallel performance was evaluated by measuring speedup and efficiency. Overall, the speedup exceeded N/2 and the efficiency exceeded 0.5. In the comparison of the two operating systems, Ubuntu was two to three times faster than Windows. To confirm that the results of sequential and parallel processing agree with each other, the per-band center values of each class and the number of classified pixels were compared, and the result images were compared pixel by pixel. It was found that care should be taken to avoid false sharing in the OpenMP-based intra-node implementation. To process large satellite images on a PC-cluster, code and hardware should be designed to reduce the performance degradation caused by file I/O. Performance was also found to differ depending on the operating system installed on the PC-cluster.
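
The data-parallel pattern behind the intra-node and inter-node codes can be illustrated with a minimal Python sketch. It is not the authors' implementation, which is based on OpenMP and MPI: here the "workers" are simply chunks processed in a loop, and all function names, the toy pixel values, and the timings are hypothetical. Each worker computes per-class band sums and pixel counts for its own chunk, and the partial results are then combined, which is the role a reduction would play in a parallel code. The sketch also checks that the chunked result matches the sequential one, and computes speedup and efficiency under the assumption that N in the abstract denotes the number of workers (the abstract does not define N). Under that assumption, efficiency is speedup divided by N, so "speedup over N/2" and "efficiency over 0.5" are the same condition.

```python
def nearest(pixel, centers):
    """Return the index of the center closest to pixel (squared Euclidean)."""
    best, best_d = 0, float("inf")
    for k, c in enumerate(centers):
        d = sum((p - q) ** 2 for p, q in zip(pixel, c))
        if d < best_d:
            best, best_d = k, d
    return best


def partial_sums(chunk, centers):
    """Work of one (hypothetical) worker: per-class band sums and counts."""
    bands = len(centers[0])
    sums = [[0.0] * bands for _ in centers]
    counts = [0] * len(centers)
    for pixel in chunk:
        k = nearest(pixel, centers)
        counts[k] += 1
        for b in range(bands):
            sums[k][b] += pixel[b]
    return sums, counts


def update_centers(pixels, centers, workers=1):
    """One k-means iteration with the pixels split into `workers` chunks."""
    size = -(-len(pixels) // workers)  # ceiling division
    chunks = [pixels[i:i + size] for i in range(0, len(pixels), size)]
    bands = len(centers[0])
    total = [[0.0] * bands for _ in centers]
    counts = [0] * len(centers)
    for chunk in chunks:  # in a real code each chunk runs on its own core or node
        s, c = partial_sums(chunk, centers)
        for k in range(len(centers)):  # combine step (the "reduction")
            counts[k] += c[k]
            for b in range(bands):
                total[k][b] += s[k][b]
    new_centers = [
        [total[k][b] / counts[k] for b in range(bands)] if counts[k]
        else list(centers[k])  # keep the old center for an empty class
        for k in range(len(centers))
    ]
    return new_centers, counts


def speedup(t_sequential, t_parallel):
    """Sequential run time divided by parallel run time."""
    return t_sequential / t_parallel


def efficiency(t_sequential, t_parallel, n_workers):
    """Speedup per worker (assumes N = number of workers)."""
    return speedup(t_sequential, t_parallel) / n_workers


# Toy two-band "image" and two initial class centers (made-up values).
pixels = [[10, 20], [12, 22], [200, 180], [198, 182], [11, 21], [201, 181]]
centers = [[0, 0], [255, 255]]

seq_centers, seq_counts = update_centers(pixels, centers, workers=1)
par_centers, par_counts = update_centers(pixels, centers, workers=3)

# Same kind of check as in the abstract: class centers and pixel counts agree.
print(seq_centers == par_centers, seq_counts == par_counts)

# Hypothetical timings in seconds, for illustration only.
print(speedup(120.0, 20.0), efficiency(120.0, 20.0, 8))
```

The toy pixel values are integers, so the sums are exact and the chunked and sequential results agree bit for bit. With real floating-point image data, the order of summation can change the last digits, so a comparison with a small tolerance may be more appropriate.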
