• Title/Abstract/Keywords: Copy number variation

62 search results (processing time: 0.03 seconds)

Highly accurate detection of cancer-specific copy number variations with MapReduce

  • 신재문;홍상균;이은주;윤지희
    • 한국정보과학회:학술대회논문집 / 한국정보과학회 2012년도 한국컴퓨터종합학술대회논문집 Vol.39 No.1(C) / pp.19-21 / 2012
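  • All cancer cells carry somatic mutations. Analyzing cancer genome variation can therefore reveal cancer-causing genes and point to diagnostic and therapeutic methods. This study proposes a new algorithm that uses next-generation sequencing data to identify cancer-specific copy number variation (CNV) types. The proposed method simultaneously analyzes the normal genome and the cancer genome obtained from a cancer patient's normal cells and cancer cells, extracts CNV candidate regions from each, selects cancer-specific CNV candidate regions through statistical significance analysis, and then, in a post-processing step, corrects for erroneous regions in the reference sequence to extract accurate cancer-specific CNV regions. It also proposes a parallel algorithm based on MapReduce for the simultaneous analysis of multiple large-scale genome datasets.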

Detection of Microcystin Synthetic Cyanobacteria and Variation of Intracellular Microcystin Synthesis Using by eDNA and eRNA in Freshwater Ecosystem

  • 김건희;박채홍;조현진;권대률;황순진
    • 생태와환경 / Vol. 56, No. 1 / pp.1-13 / 2023
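  • Focusing on microcystin (MC), which is the most frequently detected in the Bukhan River, we analyzed the relationships among the MC biosynthesis gene (mcyA gene), cyanobacterial cell density, and MC concentration to derive RNA-MC conversion equations and predict the MC concentration present inside cyanobacterial cells. In the Bukhan River, the mcyA gene was found mainly at downstream sites after the confluence of Mukhyeon Stream, where copy numbers were on average higher than at other sites. In the Uiam Lake area of the upper Bukhan River, the mcyA gene copy number increased at the Gongji Stream site, and after September the mcyA gene copy number decreased throughout the Bukhan River. mcyA gene expression differed spatially and temporally between the upstream and downstream areas and was concentrated in a short period in summer. The level of mcyA gene expression was highly correlated with MC concentration and also showed statistically significant correlations with the cell densities of Microcystis aeruginosa and Dolichospermum circinale, which are known to biosynthesize MC. The six conversion equations derived from the RNA-MC relationship were statistically significant (p<0.05) and showed high correlation coefficients (r) of 0.9 or above. The expression level of MC biosynthesis genes in eRNA can be used to assess the synthesis of cyanotoxins in water and to quantify gene activity quickly, and is therefore judged to be well suited for early warning of MC occurrence.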

Identification of copy number variations using high density whole-genome single nucleotide polymorphism markers in Chinese Dongxiang spotted pigs

  • Wang, Chengbin;Chen, Hao;Wang, Xiaopeng;Wu, Zhongping;Liu, Weiwei;Guo, Yuanmei;Ren, Jun;Ding, Nengshui
    • Asian-Australasian Journal of Animal Sciences / Vol. 32, No. 12 / pp.1809-1815 / 2019
  • Objective: Copy number variations (CNVs) are a major source of genetic diversity complementary to single nucleotide polymorphisms (SNPs) in animals. The aim of the study was to perform a comprehensive genomic analysis of CNVs based on high-density whole-genome SNP markers in Chinese Dongxiang spotted pigs. Methods: We used customized Affymetrix Axiom Pig1.4M array plates containing 1.4 million SNPs and the PennCNV algorithm to identify porcine CNVs on autosomes in Chinese Dongxiang spotted pigs. Next-generation sequencing data were then used to confirm the detected CNVs. Next, functional analysis was performed on the gene content of copy number variation regions (CNVRs). In addition, we compared the identified CNVRs with previously reported CNVRs and with quantitative trait loci (QTL) in the pig QTL database. Results: We identified 871 putative CNVs belonging to 2,221 CNVRs on 17 autosomes. We further discarded CNVRs that were detected in only one individual, leaving 166 CNVRs in total. The 166 CNVRs ranged from 2.89 kb to 617.53 kb with a mean value of 93.65 kb and a genome coverage of 15.55 Mb, corresponding to 0.58% of the pig genome. A total of 119 (71.69%) of the identified CNVRs were confirmed by next-generation sequencing data. Moreover, functional annotation showed that these CNVRs are involved in a variety of molecular functions. More than half (56.63%) of the CNVRs (n = 94) have been reported in previous studies, while 72 CNVRs are reported for the first time. In addition, 162 (97.59%) CNVRs were found to overlap with 2,765 previously reported QTLs affecting 378 phenotypic traits. Conclusion: The findings improve the catalog of pig CNVs and provide insights and novel molecular markers for further genetic analyses of Chinese indigenous pigs.

A CNV detection algorithm based on statistical analysis of the aligned reads

  • 홍상균;홍동완;윤지희;김백섭;박상현
    • 정보처리학회논문지D / Vol. 16D, No. 5 / pp.661-672 / 2009
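  • Human genome sequences contain diverse genetic structural variations, including copy number variation (CNV), which are functionally closely related to disease susceptibility, response to treatment, and genetic traits. This paper proposes a new CNV detection method that uses the large volume of short DNA sequence reads produced by giga sequencing. The proposed algorithm aligns the DNA sequence data to a reference sequence to obtain the occurrence frequency of reads at each position of the reference sequence, then analyzes the pattern of these frequencies and extracts statistically significant contiguous regions of 1 kbp or longer as CNV candidate regions. The paper also compares and analyzes sequence alignment methods that can efficiently support the proposed algorithm. Various experiments were conducted to demonstrate the usefulness of the proposed technique. The results show that the proposed technique efficiently detects diverse forms of CNV regions, both duplicated and deleted, using giga sequencing data of relatively low coverage, and that it efficiently detects CNV regions of diverse sizes, from small to large.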
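
The depth-based idea described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' algorithm: the window size, the median/MAD z-score test, the cutoff of 3, and the function name `call_cnv_candidates` are all assumptions made for the example. Only the use of per-position read frequency and the 1 kbp minimum region length come from the abstract.

```python
from statistics import mean, median


def call_cnv_candidates(depth, window=100, z_cut=3.0, min_len=1000):
    """Return (start, end, kind) regions whose windowed read depth deviates
    from the baseline. All parameters are illustrative assumptions."""
    n_win = len(depth) // window
    if n_win == 0:
        return []

    # Mean read depth in each non-overlapping window.
    win_means = [mean(depth[i * window:(i + 1) * window]) for i in range(n_win)]

    # Robust baseline: median and median absolute deviation (MAD).
    base = median(win_means)
    mad = median(abs(m - base) for m in win_means) or 1.0  # avoid a zero scale
    sigma = 1.4826 * mad

    # Label each window as gain, loss, or unremarkable (None).
    labels = []
    for m in win_means:
        z = (m - base) / sigma
        labels.append("gain" if z > z_cut else "loss" if z < -z_cut else None)

    # Merge consecutive windows with the same label and keep long runs only.
    regions, start = [], 0
    for i in range(1, n_win + 1):
        if i == n_win or labels[i] != labels[start]:
            length = (i - start) * window
            if labels[start] is not None and length >= min_len:
                regions.append((start * window, i * window, labels[start]))
            start = i
    return regions


# Synthetic depth profile: a 2 kbp gain, a 1.5 kbp loss, and a 500 bp blip.
depth = ([30] * 5000 + [60] * 2000 + [30] * 3000 + [5] * 1500
         + [30] * 3500 + [60] * 500 + [30] * 4500)
regions = call_cnv_candidates(depth)
```

On this synthetic profile the 500 bp blip is discarded because it is shorter than the 1 kbp minimum, while the two longer runs are reported with their coordinates and type.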

A Genome-Wide Study of Moyamoya-Type Cerebrovascular Disease in the Korean Population

  • Joo, Sung-Pil;Kim, Tae-Sun;Lee, Il-Kwon;Kim, Joon-Tae;Park, Man-Seok;Cho, Ki-Hyun
    • Journal of Korean Neurosurgical Society / Vol. 50, No. 6 / pp.486-491 / 2011
  • Objective: Structural genetic variation, including copy-number variation (CNV), constitutes a substantial fraction of total genetic variability, and the importance of structural variants in modulating susceptibility is increasingly being recognized. CNV can change biological function and contribute to pathophysiological conditions of human disease. Its relationship with common, complex human disease in particular is not fully understood. Here, we searched the human genome to identify copy number variants that predispose to moyamoya-type cerebrovascular disease. Methods: We retrospectively analyzed patients who had unilateral or bilateral steno-occlusive lesions at the cerebral artery from March 2007 to September 2009. The 20 subjects, comprising patients with moyamoya-type pathologies and three normal healthy controls, were divided into 4 groups: typical moyamoya (n=6), unilateral moyamoya (n=9), progression from unilateral to typical moyamoya (n=2), and non-moyamoya (n=3). Fragmented DNA was hybridized on Human610Quad v1.0 DNA analysis BeadChips (Illumina). Data analysis was performed with GenomeStudio v2009.1, Genotyping 1.1.9, and cnvPartition_v2.3.4 software. Overall call rates were more than 99.8%. Results: In total, 1258 CNVs were identified across the whole genome. The average number of CNVs was 45.55 per subject (CNV region was 45.4). The gain/loss of CNV was 52/249, having 4.7 fold higher frequencies in loss calls. The total CNV size was 904,657,868, and average size was 993,038. The largest portion of CNVs (613 calls) were 1M-10M in length. Interestingly, a significant association between unilateral moyamoya disease (MMD) and progression from unilateral to typical moyamoya was observed. Conclusion: A significant association between unilateral MMD and progression from unilateral to typical moyamoya was observed. The finding was confirmed again with clustering analysis. These data demonstrate that certain CNVs are associated with moyamoya-type cerebrovascular disease.

Identification of CNVs and their association with the meat traits of Hanwoo

  • Chan Mi Bang;Khaliunaa Tseveen;Gwang Hyeon Lee;Gil Jong Seo;Hong Sik Kong
    • 한국동물생명공학회지 / Vol. 38, No. 3 / pp.158-166 / 2023
  • Background: Copy number variation (CNV) can be identified using next-generation sequencing and microarray technologies, and research on its association with meat traits in livestock breeding has increased significantly in recent years. Hanwoo is a cattle breed native to the Republic of Korea. It is now considered one of the most economically important breeds and a major food source, used mainly for meat (Hanwoo beef). Methods: In this study, CNVs were identified in Hanwoo steer samples (n = 473) using the Illumina Hanwoo SNP 50K bead chip and bioinformatic tools, and the relationship between the resulting CNV regions (CNVRs) and meat traits was investigated. The PennCNV software was used to identify CNVs, and the CNV Ruler software was then used to define the CNVRs. Furthermore, bioinformatics analysis was performed. Results: We found a total of 2,575 autosomal CNVs (933 losses, 1,642 gains) and 416 CNVRs (289 gains, 111 losses, and 16 mixed), ranging in size from 2,183 bp to 983,333 bp and from 10,004 bp to 381,836 bp, respectively. After restricting the meat trait association analysis to a minor allele frequency > 0.05, 6 CNVRs for carcass weight, 2 CNVRs for marbling score, 3 CNVRs for backfat thickness, and 2 CNVRs for longissimus muscle area were associated with meat traits. In addition, we identified an overlap of 347 CNVRs. Moreover, 3 CNVRs were found to contain a gene that affects meat quality. Conclusions: Our results confirmed the relationship between Hanwoo CNVRs and meat traits and point to possible overlaps with candidate genes, annotations, and quantitative trait loci, contributing to a greater understanding of CNVs in Hanwoo and their role in genetic variation among cattle.

CNVDAT: A Copy Number Variation Detection and Analysis Tool for Next-generation Sequencing Data

  • 강인호;공진화;신재문;이은주;윤지희
    • 한국정보과학회논문지:데이타베이스 / Vol. 41, No. 4 / pp.249-255 / 2014
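  • Copy number variation (CNV) is a type of genetic structural variation known to be closely associated with human diseases, including cancer. To identify cancer genes, researchers must find CNVs by analyzing large-scale genome data from specific cancer patients and, at the same time, jointly analyze large-scale genetic and clinical data. This study proposes CNVDAT, a new analysis tool that extracts CNVs from NGS data and systematically links the extracted CNVs with related genetic and clinical information. The CNV extraction module uses scale-space filtering to extract CNVs and accurately determines the type and location of CNVs even when the read data contain noise. The sequence analysis module is a user-friendly program that supports browsing and mutual comparison of variant regions; it provides simultaneous analysis of variant regions in cancer and normal samples and association analysis of CNV-gene-phenotype mappings based on the refGene and OMIM databases. The source code and a sample program can be downloaded from http://dblab.hallym.ac.kr/CNVDAT/.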
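
As a rough illustration of what scale-space filtering means in general (not CNVDAT's actual implementation, which the abstract does not detail), the sketch below smooths a noisy depth-like signal with Gaussian kernels of increasing width and then looks for the steepest change. The function names, the choice of scales, the edge clamping, and the synthetic signal are all assumptions made for the example.

```python
import math


def gaussian_kernel(sigma):
    """Normalized Gaussian weights truncated at three standard deviations."""
    radius = int(3 * sigma)
    weights = [math.exp(-(k * k) / (2 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(signal, sigma):
    """Convolve the signal with a Gaussian, clamping indices at both ends."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * signal[j]
        out.append(acc)
    return out


def scale_space(signal, sigmas=(1, 2, 4)):
    """Map each scale (sigma) to the signal smoothed at that scale."""
    return {s: smooth(signal, s) for s in sigmas}


def strongest_edge(values):
    """Index i with the largest absolute jump between values[i] and values[i + 1]."""
    diffs = [values[i + 1] - values[i] for i in range(len(values) - 1)]
    return max(range(len(diffs)), key=lambda i: abs(diffs[i]))


# Hypothetical signal: a copy-number step at position 50 hidden under
# alternating noise whose jumps are larger than the step itself.
signal = [(10 if i < 50 else 20) + (3 if i % 2 else -3) for i in range(100)]
space = scale_space(signal)
raw_edge = strongest_edge(signal)
coarse_edge = strongest_edge(space[4])
```

In the raw signal the largest jump is a noise artifact, whereas at the coarser scale the noise is averaged out and the steepest change sits at the true breakpoint. Tracking such features across scales is the general idea behind scale-space methods.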

Identification of a Copy Number Variation on Chromosome 20q13.12 Associated with Osteoporotic Fractures in the Korean Population

  • Park, Tae-Joon;Hwang, Mi Yeong;Moon, Sanghoon;Hwang, Joo-Yeon;Go, Min Jin;Kim, Bong-Jo
    • Genomics & Informatics / Vol. 14, No. 4 / pp.216-221 / 2016
  • Osteoporotic fractures (OFs) are critical hard outcomes of osteoporosis and are characterized by decreased bone strength induced by low bone density and microarchitectural deterioration in bone tissue. Most OFs cause acute pain, hospitalization, immobilization, and slow recovery in patients and are associated with increased mortality. A variety of genetic studies have suggested associations of genetic variants with the risk of OF. Genome-wide association studies have reported various single-nucleotide polymorphisms and copy number variations (CNVs) in European and Asian populations. To identify CNV regions associated with OF risk, we conducted a genome-wide CNV study in a Korean population. We performed logistic regression analyses in 1,537 Korean subjects (299 OF cases and 1,238 healthy controls) and identified a total of 8 CNV regions significantly associated with OF (p < 0.05). Then, one CNV region located on chromosome 20q13.12 was selected for experimental validation. The selected CNV region was experimentally validated by quantitative polymerase chain reaction. The CNV region of chromosome 20q13.12 is positioned upstream of a family of long non-coding RNAs, LINC01260. Our findings could provide new information on the genetic factors associated with the risk of OF.

Comparison of Normalization Methods for Defining Copy Number Variation Using Whole-genome SNP Genotyping Data

  • Kim, Ji-Hong;Yim, Seon-Hee;Jeong, Yong-Bok;Jung, Seong-Hyun;Xu, Hai-Dong;Shin, Seung-Hun;Chung, Yeun-Jun
    • Genomics & Informatics / Vol. 6, No. 4 / pp.231-234 / 2008
  • Precise and reliable identification of CNVs is still important for fully understanding the effect of CNVs on genetic diversity and the background of complex diseases. SNP markers have been used frequently to detect CNVs, but the analysis of SNP chip data for identifying CNVs has not been well established. We compared various normalization methods for CNV analysis and suggest an optimal normalization procedure for reliable CNV calling. Four normal Koreans and NA10851 HapMap male samples were genotyped using the Affymetrix Genome-Wide Human SNP array 5.0. We evaluated the effect of median and quantile normalization to find the optimal normalization for CNV detection based on SNP array data. We also explored the effect of Robust Multichip Average (RMA) background correction for each normalization process. In total, the following 4 combinations of normalization were tried: 1) median normalization without RMA background correction, 2) quantile normalization without RMA background correction, 3) median normalization with RMA background correction, and 4) quantile normalization with RMA background correction. CNVs were called using the SW-ARRAY algorithm. We applied the 4 combinations of normalization and compared their effects using intensity ratio profiles, box plots, and MA plots. When we applied median and quantile normalization without RMA background correction, both methods showed a similar normalization effect, and the final CNV calls were also similar in number and size. In both median and quantile normalization, RMA background correction widened the range of the intensity ratio distribution, which suggests that RMA background correction may help to detect more CNVs compared to no correction.
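
The two normalization schemes compared in this abstract can be illustrated with a small sketch. It is a generic textbook version, not the authors' pipeline: the function names, the choice of the mean of the sample medians as the common target, the assumption of positive intensities, and the toy data are all hypothetical, and RMA background correction is not shown.

```python
from statistics import mean, median


def median_normalize(samples):
    """Scale each sample so that its median equals the mean of all medians.
    Assumes positive intensity values (nonzero medians)."""
    medians = [median(s) for s in samples]
    target = mean(medians)
    return [[x * target / m for x in s] for s, m in zip(samples, medians)]


def quantile_normalize(samples):
    """Give every sample the same distribution: the rank-wise mean across
    samples. Ties are broken by original position (a simplification)."""
    n = len(samples[0])
    sorted_samples = [sorted(s) for s in samples]
    rank_means = [mean(s[r] for s in sorted_samples) for r in range(n)]
    out = []
    for s in samples:
        order = sorted(range(n), key=lambda i: s[i])  # smallest to largest
        normalized = [0.0] * n
        for rank, idx in enumerate(order):
            normalized[idx] = rank_means[rank]
        out.append(normalized)
    return out


# Toy probe intensities: three samples (rows) by four probes (columns).
samples = [
    [5.0, 2.0, 3.0, 4.0],
    [4.0, 1.0, 4.5, 2.0],
    [3.0, 4.0, 6.0, 8.0],
]
by_median = median_normalize(samples)
by_quantile = quantile_normalize(samples)
```

Median normalization only aligns the centers of the samples, while quantile normalization forces the whole distributions to match. That difference is one reason the two can yield different intensity ratio profiles.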

Plant genome analysis using flow cytometry

  • 이재헌;김기영;정대수;정원복;권오창
    • 한국작물학회:학술대회논문집 / 한국작물학회 1999년도 춘계 학술대회지 / pp.162-163 / 1999
  • The goal of this research was (1) to describe the conditions and parameters required for cell cycle synchronization and the accumulation of large numbers of metaphase cells in maize and other cereal root tips, (2) to isolate intact metaphase chromosomes from root tips suitable for characterization by flow cytometry, and (3) to construct chromosome-specific libraries from maize. Plant metaphase chromosomes have been successfully synchronized and isolated from many cereal root tips. A DNA synthesis inhibitor (hydroxyurea) was used to synchronize the cell cycle, followed by treatment with trifluralin to accumulate metaphase chromosomes. Maize flow karyotypes show substantial variation among inbred lines. This variation should be useful in isolating individual chromosome types. In addition, flow cytometry is a useful method to measure the DNA content of individual chromosomes in a genotype and to detect chromosomal variations. Individual chromosome peaks have been sorted from the maize hybrid B73/Mo17. Libraries were generated from the DOP-PCR amplification product from each peak. To date, we have analyzed clones from a library constructed from the maize chromosome 1 peak. Hybridization of labeled genomic DNA to clone inserts indicated that 24%, 18%, and 58% of the clones were highly repetitive, medium repetitive, and low copy, respectively. Fifty percent of the putative low copy clones showed single bands on inbred screening blots, and the remaining 50% were low copy repeats. Single copy clones showing polymorphism will be mapped using recombinant inbred mapping populations. Repetitive clones are being characterized by Southern blot analysis and will be screened by in situ hybridization for their potential utility as chromosome-specific markers.
