CNVR Detection Reflecting the Properties of the Reference Sequence in HLA Region  

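Lee, Jong-Keun (Department of Computer Engineering, Hallym University)
Hong, Dong-Wan (Department of Computer Engineering, Hallym University)
Yoon, Jee-Hee (Department of Computer Engineering, Hallym University)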
Abstract
In this paper, we propose a novel shape-based approach to detecting copy number variation regions (CNVRs) by analyzing the coverage graph obtained by aligning giga-sequencing data to the human reference sequence. The proposed algorithm proceeds in two steps: a filtering step and a post-processing step. The filtering step takes several shape parameters as input and extracts candidate CNVRs of various depths and widths. The post-processing step revises the candidate regions to compensate for errors that may be present in the reference sequence and the giga-sequencing data, filters out regions with a high ratio of GC content, and returns the final result set. To evaluate our approach, we performed extensive experiments using giga-sequencing data publicly released by the 1000 Genomes Project and verified the accuracy by comparing our results with those registered in the Database of Genomic Variants (DGV). The results show that our approach successfully finds CNVRs of various shapes (gains or losses) in the HLA (Human Leukocyte Antigen) region.
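The two steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the shape parameters, so `min_width`, `loss_ratio`, `gain_ratio`, and `max_gc` are hypothetical names and values chosen for illustration, and the mean coverage is assumed as the baseline depth. The revision of candidate regions for reference-sequence errors is omitted because the abstract gives no detail on it.

```python
from statistics import mean


def find_candidate_regions(coverage, min_width, loss_ratio=0.5, gain_ratio=1.5):
    """Filtering step (sketch): return (start, end, kind) runs of abnormal depth.

    `end` is exclusive. A position counts as a loss or a gain when its depth is
    at most loss_ratio or at least gain_ratio times the mean coverage.
    All thresholds are illustrative assumptions, not values from the paper.
    """
    baseline = mean(coverage)
    regions = []
    start, kind = 0, None
    # The trailing None is a sentinel that closes a run reaching the last position.
    for i, depth in enumerate(list(coverage) + [None]):
        if depth is None:
            label = None
        elif depth <= baseline * loss_ratio:
            label = "loss"
        elif depth >= baseline * gain_ratio:
            label = "gain"
        else:
            label = None
        if label != kind:
            # A run just ended; keep it only if it is wide enough.
            if kind is not None and i - start >= min_width:
                regions.append((start, i, kind))
            start, kind = i, label
    return regions


def gc_ratio(sequence):
    """Fraction of G and C bases in a reference subsequence."""
    if not sequence:
        return 0.0
    upper = sequence.upper()
    return (upper.count("G") + upper.count("C")) / len(upper)


def filter_by_gc(regions, reference, max_gc=0.6):
    """Post-processing step (sketch): drop candidates in GC-rich reference regions."""
    return [r for r in regions if gc_ratio(reference[r[0]:r[1]]) <= max_gc]


# Toy data: one low-coverage run and one high-coverage run.
coverage = [10] * 5 + [2] * 4 + [10] * 5 + [30] * 3 + [10] * 5
reference = "ATATA" + "GCGC" + "ATATA" + "ATG" + "ATATA"
candidates = find_candidate_regions(coverage, min_width=3)
final = filter_by_gc(candidates, reference)
```

On this toy input the filtering step reports one loss and one gain, and the GC filter then discards the loss because its reference subsequence consists entirely of G and C. Real coverage data would need a more robust baseline than the global mean.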
Keywords
Giga-sequencing; Copy Number Variation; Shape-based extraction
Citations & Related Records
Times Cited By KSCI: 1
1 Chiang et al., "High-resolution mapping of copy-number alterations with massively parallel sequencing," Nature Methods, vol.6, no.1, pp.99-103, 2009.
2 Seo, Eul-Ju (서을주), "Analysis methods for copy number variants (CNV)," Korean Society of Medical Biochemistry and Molecular Biology, vol.15, no.3, pp.28-39, 2008.
3 Hong, Sang-Kyoon (홍상균), Hong, Dong-Wan (홍동완), Yoon, Jee-Hee (윤지희), and Kim, Jong-Il (김종일), "CNV region extraction by sequence alignment of short reads," Database Research (데이터베이스연구), vol.24, no.3, pp.1-13, 2008.
4 Lai et al., "Comparative analysis of algorithms for identifying amplifications and deletions in array CGH data," Bioinformatics, vol.21, no.19, pp.3763-3770, 2005.
5 Scherer et al., "Challenges and standards in integrating surveys of structural variation," Nature Genetics, vol.39, no.7, pp.S7-S15, 2007.
6 C. Xie and M. T. Tammi, "CNV-seq, a new method to detect copy number variation using high-throughput sequencing," BMC Bioinformatics, vol.10, no.1, 2009.
7 Park, Jong-Hwa (박종화), "Bioinformatics Tools for Variome Study," Medical Postgraduates, vol.37, no.3, pp.131-133, 2009.
8 Li et al., "SOAP2: an improved ultrafast tool for short read alignment," Bioinformatics, vol.25, no.15, pp.1966-1967, 2009.
9 Redon et al., "Global variation in copy number in the human genome," Nature, vol.444, pp.444-454, 2006.
10 Smith et al., "Rapid whole-genome mutational profiling using next-generation sequencing technologies," Genome Research, vol.18, no.10, pp.1638-1642, 2008.
11 http://projects.tcag.ca/variation/
12 http://www.1000genomes.org/
13 http://www.ncbi.nlm.nih.gov/projects/genome/assembly/grc/