• Title/Abstract/Keywords: Genome Re-sequencing

Search results: 28 items; processing time: 0.044 s

Toward High Utilization of Heterogeneous Computing Resources in SNP Detection

  • Lim, Myungeun;Kim, Minho;Jung, Ho-Youl;Kim, Dae-Hee;Choi, Jae-Hun;Choi, Wan;Lee, Kyu-Chul
    • ETRI Journal / Vol. 37, No. 2 / pp. 212-221 / 2015
  • As the amount of re-sequencing genome data grows, minimizing the execution time of an analysis becomes necessary. For this purpose, recent computing systems have been adopting both high-performance coprocessors and host processors. However, few applications utilize these heterogeneous computing resources efficiently. The same problem applies to single nucleotide polymorphism (SNP) detection, which is one of the bottlenecks in genome data processing. In this paper, we propose a method for speeding up SNP detection by enhancing the utilization of the heterogeneous computing resources often found in recent high-performance computing systems. By measuring the workload in the detection procedure, we divide SNP detection into several task groups, each suited to a computing resource. These task groups are scheduled using a window overlapping method. As a result, we improved upon the speedup achieved by previous open source applications by a factor of 10.

Prediction of Genes Related to Positive Selection Using Whole-Genome Resequencing in Three Commercial Pig Breeds

  • Kim, HyoYoung;Caetano-Anolles, Kelsey;Seo, Minseok;Kwon, Young-jun;Cho, Seoae;Seo, Kangseok;Kim, Heebal
    • Genomics & Informatics / Vol. 13, No. 4 / pp. 137-145 / 2015
  • Selective sweeps can cause genetic differentiation across populations, which allows for the identification of possible causative regions/genes underlying important traits. The pig has experienced a long history of allele frequency changes through artificial selection in the domestication process. We obtained an average of 329,482,871 sequence reads for 24 pigs from three pig breeds: Yorkshire (n = 5), Landrace (n = 13), and Duroc (n = 6). An average read depth of 11.7 was obtained using whole-genome resequencing on an Illumina HiSeq2000 platform. In this study, cross-population extended haplotype homozygosity and cross-population composite likelihood ratio tests were implemented to detect genes experiencing positive selection in the genome-wide resequencing data generated from three commercial pig breeds. In our results, 26, 7, and 14 genes from Yorkshire, Landrace, and Duroc, respectively, were detected by both statistical tests. Significant evidence for positive selection was identified on genes ST6GALNAC2 and EPHX1 in Yorkshire, PARK2 in Landrace, and BMP6, SLA-DQA1, and PRKG1 in Duroc. These genes are reportedly relevant to lactation, reproduction, meat quality, and growth traits. To understand how the single nucleotide polymorphisms (SNPs) related to positive selection affect protein function, we analyzed the effect of non-synonymous SNPs. Three SNPs (rs324509622, rs80931851, and rs80937718) in the SLA-DQA1 gene were significant in the enrichment tests, indicating strong evidence for positive selection in Duroc. Our analyses identified genes under positive selection for lactation, reproduction, and meat-quality and growth traits in Yorkshire, Landrace, and Duroc, respectively.

Genome-wide analysis of sequence variations in eight inbred watermelon lines

  • 김윤성;고찬섭;양희범;강순철
    • Journal of Plant Biotechnology / Vol. 43, No. 2 / pp. 164-173 / 2016
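  • To investigate the genetic basis of morphological variation in watermelon, we re-sequenced eight inbred lines. The number of sequence variations differed by chromosome. Only about 12.9% of the detected SNPs were located within genes; the rest were found in promoters or intergenic regions. Analysis of SNP density showed that variations were concentrated at the end of chromosome 6. We also found well-conserved regions on chromosomes 10 and 11. Pathway analysis identified DIMBOA (a kind of antibiotic)-glucoside degradation as the pathway that differed most among the lines, which suggests that the lines may differ in disease resistance. Analysis of variations in genes related to sugar metabolism showed that the alpha-galactosidase gene carried the most variations. These results should help in understanding breeding at the molecular level.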

A protein interactions map of multiple organ systems associated with COVID-19 disease

  • Bharne, Dhammapal
    • Genomics & Informatics / Vol. 19, No. 2 / pp. 14.1-14.6 / 2021
  • Coronavirus disease 2019 (COVID-19) is an ongoing pandemic disease infecting millions of people across the globe. Recent reports of reduced antibody levels and re-emergence of the disease in recovered patients call for an understanding of the pandemic at the core level. Cases of multiple organ failure emphasize the need to consider different organ systems while managing the disease. The present study employed RNA sequencing data to determine the disease-associated differentially regulated genes and their related protein interactions in several organ systems. It underscored the importance of early diagnosis and treatment of the disease. A map of protein interactions across multiple organ systems was built, revealing CAV1 and CTNNB1 as the highest-degree nodes. A core interactions sub-network was analyzed to identify different modules of functional significance. The AR, CTNNB1, CAV1, and PIK3R1 proteins emerged as bridging nodes interconnecting different modules for information flow across several pathways. The present study also highlighted some druggable targets to analyze in drug re-purposing strategies against the COVID-19 pandemic. The protein interactions map and the modular interactions of the differentially regulated genes in multiple organ systems should therefore encourage researchers to investigate novel therapeutics for COVID-19 expeditiously.

Bioinformatics services for analyzing massive genomic datasets

  • Ko, Gunhwan;Kim, Pan-Gyu;Cho, Youngbum;Jeong, Seongmun;Kim, Jae-Yoon;Kim, Kyoung Hyoun;Lee, Ho-Yeon;Han, Jiyeon;Yu, Namhee;Ham, Seokjin;Jang, Insoon;Kang, Byunghee;Shin, Sunguk;Kim, Lian;Lee, Seung-Won;Nam, Dougu;Kim, Jihyun F.;Kim, Namshin;Kim, Seon-Young;Lee, Sanghyuk;Roh, Tae-Young;Lee, Byungwook
    • Genomics & Informatics / Vol. 18, No. 1 / pp. 8.1-8.10 / 2020
  • The explosive growth of next-generation sequencing data has resulted in ultra-large-scale datasets and ensuing computational problems. In Korea, the amount of genomic data has been increasing rapidly in recent years. Leveraging these big data requires researchers to use large-scale computational resources and analysis pipelines. A promising solution for addressing this computational challenge is cloud computing, where CPUs, memory, storage, and programs are accessible in the form of virtual machines. Here, we present a cloud computing-based system, Bio-Express, that provides user-friendly, cost-effective analysis of massive genomic datasets. Bio-Express is loaded with predefined multi-omics data analysis pipelines, which are divided into genome, transcriptome, epigenome, and metagenome pipelines. Users can employ predefined pipelines or create a new pipeline for analyzing their own omics data. We also developed several web-based services for facilitating downstream analysis of genome data. The Bio-Express web service is freely available at https://www.bioexpress.re.kr/.

MYLK Polymorphism Associated with Blood Eosinophil Level among Asthmatic Patients in a Korean Population

  • Lee, Soo Ok;Cheong, Hyun Sub;Park, Byung Lae;Bae, Joon Seol;Sim, Won Chul;Chun, Ji-Yong;Isbat, Mohammad;Uh, Soo-Taek;Kim, Yong Hooun;Jang, An-Soo;Park, Choon-Sik;Shin, Hyoung Doo
    • Molecules and Cells / Vol. 27, No. 2 / pp. 175-181 / 2009
  • The myosin light chain kinase (MYLK) gene encodes both smooth muscle and nonmuscle cell isoforms. Recently, polymorphisms in MYLK have been reported to be associated with several diseases. To examine the genetic effects of polymorphisms on the risk of asthma and related phenotypes, we scrutinized MYLK by re-sequencing/genotyping and statistical analysis in a Korean population (n = 1,015). Seventeen common polymorphisms located in or near exons, having pairwise $r^2$ values less than 0.25, were genotyped. Our statistical analysis did not replicate the associations with the risk of asthma and log-transformed total IgE levels observed among populations of African descent. However, two SNPs in intron 16 (+89872C>G and +92263T>C), which were in tight LD (|D'| = 0.99), showed significant association with log-transformed blood eosinophil level even after correction for multiple testing ($P=0.002/P^{corr}=0.01$ and $P=0.002/P^{corr}=0.01$, respectively). The log-transformed blood eosinophil levels were higher in individuals bearing the minor alleles of +89872C>G and +92263T>C than in those bearing the other allele. In additional subgroup analysis, the genetic effects of both SNPs were much more apparent among asthmatic patients and atopic asthma patients. Among atopic asthma patients, the log-transformed blood eosinophil levels increased in a gene-dose-dependent manner for both +89872C>G and +92263T>C (P = 0.0002 and P = 0.00007, respectively). These findings suggest that MYLK polymorphisms might be among the genetic factors underlying differential increases of blood eosinophil levels among asthmatic patients. Further biological and/or functional studies are needed to confirm our results.

Genome-wide association study of cold stress in rice at early young microspore stage (Oryza sativa L.).

  • Kim, Mijeong;Kim, Taegyu;Lee, Yoonjung;Choi, Jisu;Cho, Giwon;Lee, Joohyun
    • Proceedings of the Korean Society of Crop Science Conference (한국작물학회:학술대회논문집) / Korean Society of Crop Science, 2017, 9th Asian Crop Science Association Conference / p. 313 / 2017
  • Cold stress is one of the factors that most strongly affect rice yield. To identify genes related to cold stress at the fertility stage, a genome-wide association study (GWAS) was conducted. A total of 129 cultivated rice germplasm accessions were moved to a growth chamber at $12^{\circ}C$ and 70% RH (12 h day/12 h night) when the plants were at 10 DBH (days before heading). Control plants were moved to a greenhouse at $28^{\circ}C$ and 70% RH (12 h day/12 h night). After 4 days, the plants were moved to a greenhouse. Fertility was measured after the grains were fully grown. The accessions most tolerant to cold stress were Cheongdo-Hwayang-12 and IR38, with fertility values of 63.1 and 61.8, and the most sensitive were Danyang38 and 8 further accessions, with a fertility of 0. GWAS of the re-sequencing data and post-cold-treatment fertility using the Genome Association and Prediction Integrated Tool (GAPIT) detected 99 single-nucleotide polymorphisms (SNPs) at a significance threshold of -logP > 4.5, determined from the QQ plot. Within the SNP regions, 14 candidate genes responding to cold stress at the fertility stage were identified.


Identification of plasma miRNA biomarkers for pregnancy detection in dairy cattle

  • Lim, Hyun-Joo;Kim, Hyun Jong;Lee, Ji Hwan;Lim, Dong Hyun;Son, Jun Kyu;Kim, Eun-Tae;Jang, Gulwon;Kim, Dong-Hyeon
    • Journal of Animal Reproduction and Biotechnology (한국동물생명공학회지) / Vol. 36, No. 1 / pp. 35-44 / 2021
  • Pregnancy diagnosis is an important standard for controlling livestock reproduction, in particular in dairy cattle. High reproductive performance in dairy animals is an essential condition for achieving high lifetime production. Pregnancy diagnosis is crucial to shortening the calving interval by enabling the farmer to identify open animals so as to treat or re-breed them at the earliest opportunity. MicroRNAs are short RNA molecules that are critically involved in regulating gene expression during both health and disease. This study sought to establish the feasibility of circulating miRNAs as biomarkers of early pregnancy in cattle. We applied Illumina small-RNA sequencing to profile miRNAs in plasma samples collected from 12 non-pregnant cows ("open" cows: samples were collected before insemination (non-pregnant state) and after pregnancy check at the indicated time points) at weeks 0, 4, 8, 12 and 16. Using small RNA sequencing, we identified a total of 115 miRNAs that were differentially expressed at week 16 relative to the non-pregnant state ("open" cows). Weeks 8, 12 and 16 of pregnancy commonly showed a distinct increase in circulating levels of miR-221 and miR-320a. Through genome-wide analyses, we have successfully profiled plasma miRNA populations associated with pregnancy in cattle. Their application in the field of reproductive biology has opened up opportunities for research communities to look for pregnancy biomarker molecules in dairy cattle.

Dynamics of Viral and Host 3D Genome Structure upon Infection

  • Meyer J. Friedman;Haram Lee;Young-Chan Kwon;Soohwan Oh
    • Journal of Microbiology and Biotechnology / Vol. 32, No. 12 / pp. 1515-1526 / 2022
  • Eukaryotic chromatin is highly organized in the 3D nuclear space and dynamically regulated in response to environmental stimuli. This genomic organization is arranged in a hierarchical fashion to support various cellular functions, including transcriptional regulation of gene expression. Like other host cellular mechanisms, viral pathogens utilize and modulate host chromatin architecture and its regulatory machinery to control features of their life cycle, such as lytic versus latent status. Combined with previous research focusing on individual loci, recent global genomic studies employing conformational assays coupled with high-throughput sequencing technology have informed models for host and, in some cases, viral 3D chromosomal structure re-organization during infection and the contribution of these alterations to virus-mediated diseases. Here, we review recent discoveries and progress in host and viral chromatin structural dynamics during infection, focusing on a subset of DNA (human herpesviruses and HPV) as well as RNA (HIV, influenza virus and SARS-CoV-2) viruses. An understanding of how host and viral genomic structure affect gene expression in both contexts and ultimately viral pathogenesis can facilitate the development of novel therapeutic strategies.