The Efficient Method of Parallel Genetic Algorithm using MapReduce of Big Data

  • Received: 2013.08.15
  • Reviewed: 2013.09.15
  • Published: 2013.10.25

Abstract

Big data refers to data sets too large to be collected, stored, processed, searched, or analyzed with commonly used data management systems. Running a genetic algorithm (GA) on MapReduce, a big data technology, in the Hadoop distributed processing environment makes it easy to parallelize the GA. Earlier MapReduce-based GAs adapted the GA to the MapReduce model, but frequent data input and output delayed execution, so they did not perform well. To improve on them, this paper proposes MRPGA (MapReduce Parallel Genetic Algorithm), a new method that refines the map and reduce steps and exploits the parallel processing characteristics of MapReduce. MRPGA applies the topology, migration, and local search techniques of existing parallel genetic algorithms (PGAs) to find the optimal solution. The proposed method converges 1.5 times faster than the existing MapReduce SGA, and how quickly it finds the optimal solution depends on the number of sub-generation iterations. MRPGA can also be used to improve the processing and analysis performance of big data technology.
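The abstract gives no implementation details, so the following is only a minimal sketch of the general idea it describes: each map task evolves one sub-population for several sub-generations in memory, and a reduce step then migrates individuals between sub-populations. Everything specific here is an assumption made for illustration and is not taken from the paper: the OneMax fitness function, the ring topology, the single bit-flip local search, the parameter values, and all function names. The map and reduce phases are simulated in a single Python process rather than run on Hadoop.

```python
import random

GENOME_LEN = 32  # assumed toy problem size (OneMax: maximize the number of 1 bits)


def fitness(ind):
    """Assumed toy fitness: count of 1 bits."""
    return sum(ind)


def local_search(ind, rng):
    """Assumed local search: flip one random bit, keep the result if it is not worse."""
    cand = ind[:]
    i = rng.randrange(len(cand))
    cand[i] ^= 1
    return cand if fitness(cand) >= fitness(ind) else ind


def evolve(pop, rng, mutation_rate=0.02):
    """One generation: elitism, binary tournament selection, uniform crossover, bit-flip mutation."""
    children = [max(pop, key=fitness)[:]]  # keep the current best unchanged
    while len(children) < len(pop):
        p1 = max(rng.sample(pop, 2), key=fitness)
        p2 = max(rng.sample(pop, 2), key=fitness)
        child = [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]
        child = [bit ^ 1 if rng.random() < mutation_rate else bit for bit in child]
        children.append(child)
    return children


def map_phase(island_id, pop, sub_generations, seed):
    """Simulated mapper: run several sub-generations in memory before emitting anything."""
    rng = random.Random(seed)
    for _ in range(sub_generations):
        pop = evolve(pop, rng)
    pop = [local_search(ind, rng) for ind in pop]
    return island_id, pop


def reduce_phase(islands):
    """Simulated reducer: ring migration, each island's best replaces the next island's worst."""
    n = len(islands)
    bests = [max(islands[i], key=fitness)[:] for i in range(n)]
    for i in range(n):
        target = islands[(i + 1) % n]
        worst = min(range(len(target)), key=lambda k: fitness(target[k]))
        target[worst] = bests[i]
    return islands


def mrpga(num_islands=4, pop_size=20, sub_generations=5, rounds=10, seed=0):
    """Alternate simulated map and reduce rounds and return the best individual found."""
    rng = random.Random(seed)
    islands = {
        i: [[rng.randint(0, 1) for _ in range(GENOME_LEN)] for _ in range(pop_size)]
        for i in range(num_islands)
    }
    for r in range(rounds):
        mapped = dict(
            map_phase(i, pop, sub_generations, seed + 1000 * r + i + 1)
            for i, pop in islands.items()
        )
        islands = reduce_phase(mapped)
    return max((ind for pop in islands.values() for ind in pop), key=fitness)


if __name__ == "__main__":
    best = mrpga()
    print(fitness(best), best)
```

In this sketch, raising `sub_generations` means more evolution happens inside each map call and fewer map/reduce rounds are needed for the same total number of generations. That is one plausible way to reduce the frequent data input and output that the abstract identifies as the bottleneck of earlier MapReduce GAs.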

