Dynamic Block Reassignment for Load Balancing of Block-Centric Graph Processing Systems

  • Received : 2017.12.15
  • Reviewed : 2018.03.13
  • Published : 2018.05.31

Abstract

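Recent advances in ICT such as the web, social network services, mobile devices, and the Internet of Things have rapidly increased the scale of graph data that must be processed and analyzed. Because such large-scale graph data is difficult to process on a single machine, it must be partitioned across multiple machines and processed in a distributed and parallel manner. Existing graph processing algorithms were studied for a single-memory environment and are hard to apply to distributed and parallel processing environments. For more effective distributed and parallel processing of large-scale graphs, vertex-centric graph processing systems emerged, followed by block-centric graph processing systems that compensate for the drawbacks of the vertex-centric approach. In these systems, the initial graph partitioning has a considerable effect on overall processing performance. Since partitioning a graph optimally in a single pass is a very hard problem, several load balancing techniques that incrementally improve the partitioning during graph processing have been studied. However, most existing techniques target vertex-centric graph processing systems and are difficult to apply to block-centric ones. This paper proposes a load balancing technique applicable to block-centric graph processing systems. The proposed technique dynamically reassigns blocks to incrementally improve the graph partitioning, and it also presents a block split strategy for escaping local optima during the search for a solution.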

The scale of graph data has increased rapidly because of the growth of mobile Internet applications and the proliferation of social network services. Since such large-scale graphs easily exceed the capacity of a single machine, efficient distributed and parallel graph processing has become an urgent necessity. Currently, there are two popular parallel graph processing approaches: vertex-centric processing and block-centric processing. The vertex-centric approach can easily be applied to parallel processing systems, and the block-centric approach was proposed to compensate for its drawbacks. In these systems, the quality of the initial graph partition significantly affects overall performance. However, dividing the graph into an optimal partition in the initial phase is a very difficult problem. Thus, several dynamic load balancing techniques have been studied that progressively refine the partition during graph processing. Most of these techniques focus on vertex-centric systems; in this paper, we present a load balancing algorithm for the block-centric graph processing approach. The proposed algorithm improves the quality of the graph partition by dynamically reassigning blocks at runtime, and it includes a block split strategy for escaping local optima.
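
The abstract does not give the algorithm itself, so the following Python sketch only illustrates the general idea it describes: blocks are moved between workers to reduce load imbalance, and a block is split when no move helps. Everything in it is an assumption made for illustration, not the authors' method: the scalar block weights, the greedy move rule, the halving split, and the names `imbalance`, `rebalance`, and `min_block`.

```python
from typing import Dict, List


def imbalance(assign: Dict[int, List[float]]) -> float:
    """Gap between the heaviest and the lightest worker load."""
    loads = [sum(blocks) for blocks in assign.values()]
    return max(loads) - min(loads)


def rebalance(
    assign: Dict[int, List[float]],
    max_steps: int = 100,
    min_block: float = 1.0,
) -> Dict[int, List[float]]:
    """Hypothetical greedy rebalancing of block weights across workers.

    Each step moves one block from the heaviest worker to the lightest one.
    If no block can be moved without making that pair worse (a local
    optimum), the largest block on the heaviest worker is split in half.
    """
    assign = {w: list(b) for w, b in assign.items()}  # do not mutate the input
    for _ in range(max_steps):
        loads = {w: sum(b) for w, b in assign.items()}
        heavy = max(loads, key=loads.get)
        light = min(loads, key=loads.get)
        gap = loads[heavy] - loads[light]
        if gap == 0 or not assign[heavy]:
            break
        # Moving a block of weight x turns the pair's gap into |gap - 2x|,
        # so only blocks with 0 < x < gap shrink it.
        movable = [x for x in assign[heavy] if 0 < x < gap]
        if movable:
            best = min(movable, key=lambda x: abs(gap - 2 * x))
            assign[heavy].remove(best)
            assign[light].append(best)
        else:
            # Local optimum: split the largest block (assumed divisible).
            biggest = max(assign[heavy])
            if biggest <= min_block:
                break  # blocks are already as small as we allow
            assign[heavy].remove(biggest)
            assign[heavy].extend([biggest / 2, biggest / 2])
    return assign
```

For example, with a single block of weight 8 on worker 0 and an empty worker 1, no move can help, so the sketch first splits the block into two halves and then moves one of them, ending with a load of 4 on each worker. A real system would also have to account for edges cut between blocks and for migration cost, which this toy model ignores.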
