Parallelization and Performance Optimization of the Boyer-Moore Algorithm on GPU

Parallel Optimization on GPU for the Boyer-Moore Algorithm

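  • 정요상 (Department of Computer Engineering, Myongji University) ;
  • 쟌느앗-프엉 (Department of Computer Engineering, Myongji University) ;
  • 이명호 (Department of Computer Engineering, Myongji University) ;
  • 남덕윤 (Supercomputing Technology Development Office, Korea Institute of Science and Technology Information) ;
  • 김직수 (Supercomputing Technology Development Office, Korea Institute of Science and Technology Information) ;
  • 황순욱 (Supercomputing Technology Development Office, Korea Institute of Science and Technology Information)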
  • Received : 2014.09.11
  • Accepted : 2014.12.01
  • Published : 2015.02.15

Abstract

The Boyer-Moore (BM) algorithm is a single-pattern string matching algorithm that is widely used in applications such as computer and Internet security and bioinformatics. The algorithm is computationally demanding and requires high-performance parallel processing. In this paper, we propose a parallelization and performance optimization methodology for the BM algorithm on a GPU. The methodology adopts an algorithmic cascading technique, which significantly reduces the mapping overhead for the threads participating in the parallel string matching. It also makes efficient use of the GPU's multithreading capability, which improves load balancing among threads. Our experimental results show that this approach achieves a speedup of up to 45 times over serial execution.

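The Boyer-Moore algorithm is a pattern matching algorithm widely used in applications such as computer and Internet security and bioinformatics. Because it must search vast amounts of input data for a single specific pattern in real time, its computational demand is high, which makes parallel processing and performance optimization essential. In this paper, we propose a methodology for parallelizing and optimizing the BM algorithm on a GPU. Following this methodology, we apply an algorithmic cascading technique to minimize the mapping overhead incurred at run time and maximize the effect of multithreading to improve load balancing among threads, obtaining a performance improvement of up to 45 times over sequential execution.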
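The abstract does not reproduce the authors' kernels, so the following is only a minimal sketch of the general idea, under several assumptions. It uses the simplified Horspool bad-character shift rather than the full Boyer-Moore rules, it emulates GPU threads with an ordinary Python loop, and all names (`bad_char_table`, `search_chunk`, `cascaded_search`, `num_threads`, `chunk_size`) are hypothetical. Cascading is interpreted here as letting a fixed number of threads each process several chunks sequentially, instead of mapping one thread to every chunk.

```python
def bad_char_table(pattern):
    """Horspool shift table: distance from a character's last occurrence
    (excluding the final pattern position) to the end of the pattern."""
    m = len(pattern)
    return {c: m - 1 - i for i, c in enumerate(pattern[:-1])}


def search_chunk(text, pattern, start, end, shift):
    """Return match positions whose start index lies in [start, end).
    The scan may read up to len(pattern) - 1 characters past `end`, so a
    match straddling a chunk boundary is found by exactly one chunk."""
    m = len(pattern)
    hits = []
    limit = min(end, len(text) - m + 1)
    pos = start
    while pos < limit:
        # Compare right to left, as Boyer-Moore does.
        j = m - 1
        while j >= 0 and text[pos + j] == pattern[j]:
            j -= 1
        if j < 0:
            hits.append(pos)
        # Shift by the character aligned with the last pattern position.
        pos += shift.get(text[pos + m - 1], m)
    return hits


def cascaded_search(text, pattern, num_threads=4, chunk_size=8):
    """Emulate `num_threads` GPU threads, each looping over several chunks
    (thread t takes chunks t, t + num_threads, t + 2 * num_threads, ...)."""
    if not pattern:
        raise ValueError("pattern must be non-empty")
    shift = bad_char_table(pattern)
    num_chunks = (len(text) + chunk_size - 1) // chunk_size
    all_hits = []
    for tid in range(num_threads):  # these iterations would run concurrently on a GPU
        for chunk in range(tid, num_chunks, num_threads):
            start = chunk * chunk_size
            all_hits += search_chunk(text, pattern, start, start + chunk_size, shift)
    return sorted(all_hits)
```

For example, `cascaded_search("abracadabra abracadabra", "abra")` returns `[0, 7, 12, 19]`. Because each chunk owns the match positions that start inside it, the result does not depend on `num_threads` or `chunk_size`; those parameters only change how the work is distributed.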

Acknowledgement

Grant : Development of an ultra-high-performance platform for improving the user environment
