http://dx.doi.org/10.4218/etrij.15.2314.0144

Toward High Utilization of Heterogeneous Computing Resources in SNP Detection  

Lim, Myungeun (IT Convergence Technology Research Laboratory, ETRI)
Kim, Minho (IT Convergence Technology Research Laboratory, ETRI)
Jung, Ho-Youl (IT Convergence Technology Research Laboratory, ETRI)
Kim, Dae-Hee (IT Convergence Technology Research Laboratory, ETRI)
Choi, Jae-Hun (IT Convergence Technology Research Laboratory, ETRI)
Choi, Wan (IT Convergence Technology Research Laboratory, ETRI)
Lee, Kyu-Chul (Department of Computer Engineering, Chungnam National University)
Publication Information
ETRI Journal / v.37, no.2, 2015, pp. 212-221
Abstract
As the volume of re-sequencing genome data grows, the execution time of each analysis must be minimized. To this end, recent computing systems combine host processors with high-performance coprocessors. However, few applications use these heterogeneous computing resources efficiently. The same problem applies to single nucleotide polymorphism (SNP) detection, which is one of the bottlenecks in genome data processing. In this paper, we propose a method for speeding up SNP detection by improving the utilization of the heterogeneous computing resources commonly found in recent high-performance computing systems. By measuring the workload in the detection procedure, we divide SNP detection into several task groups, each suited to a particular computing resource. These task groups are scheduled using a window overlapping method. As a result, we improve on the speedup achieved by previous open source applications by a factor of 10.
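The window overlapping idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the task groups or the windowing details, so everything below is an assumption made for illustration: `host_stage` and `device_stage` are hypothetical stand-ins for a host-side and a coprocessor-side task group, and plain Python threads stand in for the CPU and GPU. The sketch shows only the scheduling structure, in which the host prepares window i+1 while the other stage processes window i.

```python
import queue
import threading


def split_into_windows(n_sites, window_size):
    """Yield half-open (start, end) ranges that cover n_sites positions."""
    for start in range(0, n_sites, window_size):
        yield start, min(start + window_size, n_sites)


def host_stage(window):
    # Hypothetical host-side task group (assumption): stands in for
    # preparing the sites of one window, e.g. reading and counting bases.
    start, end = window
    return list(range(start, end))


def device_stage(sites):
    # Hypothetical coprocessor-side task group (assumption): stands in
    # for the compute-heavy step; here it just sums the site indices.
    return sum(sites)


def run_overlapped(n_sites, window_size):
    """Process windows in two stages that overlap in time."""
    handoff = queue.Queue(maxsize=1)  # at most one prepared window waiting
    results = []

    def producer():
        for window in split_into_windows(n_sites, window_size):
            handoff.put(host_stage(window))
        handoff.put(None)  # sentinel: no more windows

    def consumer():
        while True:
            sites = handoff.get()
            if sites is None:
                break
            results.append(device_stage(sites))

    threads = [threading.Thread(target=producer),
               threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_overlapped(10, 4))
```

The bounded queue keeps the two stages in lockstep: the producer can run at most one window ahead of the consumer, which is the overlap pattern. In a real system the second stage would launch coprocessor kernels rather than run a Python function, and the window size would be chosen from measured workloads.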
Keywords
SNP detection; overlapped window; heterogeneous computing resources; CPU-GPU overlapping; multithreading