• Title/Summary/Keyword: Whole Genome Association

Application of genotyping-by-sequencing (GBS) in plant genome using bioinformatics pipeline

  • Lee, Yun Gyeong;Kang, Chon-Sik;Kim, Changsoo
    • Proceedings of the Korean Society of Crop Science Conference / 2017.06a / pp.58-58 / 2017
  • The advent of next-generation sequencing (NGS) technology has made abundant sequencing data available for agriculturally relevant plant species. For most crop species, however, obtaining whole-genome sequence data with sufficient coverage is still too expensive, so many approaches have been developed to bring down the cost of NGS. Genotyping-by-sequencing (GBS) is a cost-effective genotyping method for complex genetic populations. GBS can be used for genomic selection (GS), genome-wide association studies (GWAS), and the construction of haplotype and genetic linkage maps in a variety of plant species. For handling plant GBS data efficiently, the TASSEL-GBS pipeline is one of the most popular choices among researchers. TASSEL-GBS is a Java-based software package for obtaining genotype data from raw GBS sequences. Here, we describe applications of GBS and the TASSEL-GBS bioinformatics pipeline for analyzing plant genetics data.

Genome-Wide SNP Calling Using Next Generation Sequencing Data in Tomato

  • Kim, Ji-Eun;Oh, Sang-Keun;Lee, Jeong-Hee;Lee, Bo-Mi;Jo, Sung-Hwan
    • Molecules and Cells / v.37 no.1 / pp.36-42 / 2014
  • The tomato (Solanum lycopersicum L.) is a model plant for genome research in the Solanaceae, as well as for studying crop breeding. Genome-wide single nucleotide polymorphisms (SNPs) are a valuable resource in genetic research and breeding. However, most methods for discovering genome-wide SNPs require expensive high-depth sequencing. Here, we describe a method for SNP calling that uses a modified version of SAMtools with improved sensitivity. We analyzed 90 Gb of raw next-generation sequencing data from two resequencing and seven transcriptome data sets covering several tomato accessions. Our study identified 4,812,432 non-redundant SNPs. Moreover, the SNP-calling workflow was improved by aligning the reference genome with its own raw data. Using this approach, 131,785 SNPs were discovered from the transcriptome data of seven accessions. In addition, 4,680,647 SNPs were identified from the genome of S. pimpinellifolium, 60 times more than the 71,637 SNPs from the PI212816 transcriptome. SNP distribution was compared between the whole genome and the transcriptome of S. pimpinellifolium, and we also surveyed the location of SNPs within genic and intergenic regions. Our results indicate that a sufficient number of genome-wide SNP markers and a highly sensitive SNP-calling method enable marker-assisted breeding and genome-wide association studies.

Development and Application of High-density SNP Arrays in Genomic Studies of Domestic Animals

  • Fan, Bin;Du, Zhi-Qiang;Gorbach, Danielle M.;Rothschild, Max F.
    • Asian-Australasian Journal of Animal Sciences / v.23 no.7 / pp.833-847 / 2010
  • In the past decade, there have been many advances in whole-genome sequencing in domestic animals, as well as the development of "next-generation" sequencing technologies and high-throughput genotyping platforms. Consequently, these advances have led to the creation of the high-density SNP array as a state-of-the-art tool for genetics and genomics analyses of domestic animals. The emergence and utilization of SNP arrays will have significant impacts not only on the scale, speed, and expense of SNP genotyping, but also on theoretical and applied studies of quantitative genetics, population genetics and molecular evolution. The most promising applications in agriculture could be genome-wide association studies (GWAS) and genomic selection for the improvement of economically important traits. However, some challenges still face these applications, such as incorporating linkage disequilibrium (LD) information from HapMap projects, data storage, and especially appropriate statistical analyses on the high-dimensional, structured genomics data. More efforts are still needed to make better use of the high-density SNP arrays in both academic studies and industrial applications.

A Whole Genome Association Study to Detect Single Nucleotide Polymorphisms for Carcass Traits in Hanwoo Populations

  • Lee, Y.-M.;Han, C.-M.;Li, Yi;Lee, J.-J.;Kim, L.H.;Kim, J.-H.;Kim, D.-I.;Lee, S.-S.;Park, B.-L.;Shin, H.-D.;Kim, K.-S.;Kim, N.-S.;Kim, Jong-Joo
    • Asian-Australasian Journal of Animal Sciences / v.23 no.4 / pp.417-424 / 2010
  • The purpose of this study was to detect significant SNPs for carcass quality traits in Hanwoo populations using high-density SNP chips. Carcass data for 289 steers sired by 30 Korean proven sires were collected from two regions: the Hanwoo Improvement Center of the National Agricultural Cooperative Federation in Seosan, Chungnam province, and commercial farms in Gyeongbuk province. The steers in Seosan were born between spring and fall of 2006, and those in Gyeongbuk between the falls of 2004 and 2005. The former were slaughtered at approximately 24 months, while the latter were fed six months longer before slaughter. Of the 55,074 SNPs on the Illumina bovine 50K chip, 32,756 usable SNPs were selected for the whole genome association study. After adjusting for the effects of sire, region and slaughter age, phenotypes were regressed on each SNP using a simple linear regression model. The significance threshold for each SNP test was a point-wise p-value of 0.1% from the F distribution. Among the significant SNPs for a trait, the best set of SNP markers was selected using a stepwise regression procedure, with inclusion in and exclusion from the model determined at the p<0.001 level. A total of 118 SNPs were detected: 15, 20, 22, 28, 20, and 13 SNPs for final weight before slaughter, carcass weight, backfat thickness, weight index, longissimus dorsi muscle area, and marbling score, respectively. From these, stepwise regression selected a best set of 44 SNPs, with 7, 9, 6, 9, 7, and 6 SNPs for the respective traits. Each per-trait SNP set explained 20-40% of phenotypic variance. The number of SNPs detected per trait was small, suggesting that additional phenotype and genotype data are required to gain power to detect trait-related SNPs and to estimate SNP effects accurately. These SNP markers could be applied to commercial Hanwoo populations via marker-assisted selection to verify the SNP effects and to improve genetic potential in successive generations.
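
The single-SNP test described in this abstract can be illustrated with a minimal Python sketch. It assumes the phenotypes have already been adjusted for sire, region and slaughter age, and that genotypes are coded as allele counts (0/1/2). The function names, the data layout, and the numerical integration used for the F-distribution tail are illustrative choices, not the authors' implementation.

```python
import math


def f_test_p_value(f_stat, df_resid, steps=2000):
    """Upper-tail p-value for F(1, df_resid).

    With one numerator degree of freedom, F = t^2, so the p-value equals the
    two-sided Student's t tail. Substituting t = sqrt(df) * tan(theta) turns
    that tail into a bounded integral of cos(theta)^(df - 1), evaluated here
    with Simpson's rule (steps must be even).
    """
    if math.isinf(f_stat):
        return 0.0
    theta0 = math.atan(math.sqrt(f_stat / df_resid))
    log_c = (math.lgamma((df_resid + 1) / 2) - math.lgamma(df_resid / 2)
             - 0.5 * math.log(math.pi))

    def g(theta):
        return math.cos(theta) ** (df_resid - 1)

    h = (math.pi / 2 - theta0) / steps
    total = g(theta0) + g(math.pi / 2)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(theta0 + i * h)
    return min(1.0, 2.0 * math.exp(log_c) * total * h / 3)


def snp_f_test(genotypes, phenotypes):
    """Simple linear regression of phenotype on allele count.

    Returns (slope, F statistic, p-value). Phenotypes are assumed to be
    pre-adjusted for fixed effects (an assumption of this sketch).
    """
    n = len(genotypes)
    mx = sum(genotypes) / n
    my = sum(phenotypes) / n
    sxx = sum((x - mx) ** 2 for x in genotypes)
    if sxx == 0:
        raise ValueError("monomorphic SNP: no genotype variation")
    sxy = sum((x - mx) * (y - my) for x, y in zip(genotypes, phenotypes))
    syy = sum((y - my) ** 2 for y in phenotypes)
    slope = sxy / sxx
    ss_reg = slope * sxy
    ss_res = syy - ss_reg
    if ss_res <= 0:
        return slope, math.inf, 0.0
    f_stat = ss_reg / (ss_res / (n - 2))
    return slope, f_stat, f_test_p_value(f_stat, n - 2)


def scan_snps(genotype_table, phenotypes, alpha=0.001):
    """Test every SNP and keep those below the point-wise threshold."""
    hits = []
    for snp_id, genos in genotype_table.items():
        slope, _, p = snp_f_test(genos, phenotypes)
        if p < alpha:
            hits.append((snp_id, slope, p))
    return hits
```

Calling `scan_snps` with a dictionary mapping hypothetical SNP identifiers to genotype lists returns the SNPs that pass the 0.1% point-wise threshold. The stepwise selection step from the abstract is not covered by this sketch.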

Identify Major Gene-Gene Interaction Effects Using SNPHarvester

  • Lee, Jea-Young;Kim, Dong-Chul
    • Communications for Statistical Applications and Methods / v.16 no.6 / pp.915-923 / 2009
  • In genome-wide association (GWA) research, genes related to human disease are searched for among a very large number of genes. However, most current statistical methods for detecting gene-gene interactions in disease association studies cannot easily be applied to genome-wide association studies (GWAS) because of the heavy computation involved. SNPHarvester was therefore developed to find the main gene groups among numerous genes. This study uses SNPHarvester to find, among sets of SNPs, the superior gene groups related to economic traits of Korean beef cattle rather than human disease, and also identifies the superior genotypes within those SNP groups that can enhance various qualities of Korean beef.

Complete Genome Analysis of Hyphantria cunea Nucleopolyhedrovirus Isolated in Korea

  • Choi, Jae-Bang;Kim, Hyun-Soo;Woo, Soo-Dong
    • Korean Journal of Organic Agriculture / v.31 no.4 / pp.395-412 / 2023
  • The morphology and whole genome sequence of Hyphantria cunea nucleopolyhedrovirus W1 (HycuNPV-W1), isolated in Korea, were analyzed with a view to its use as an eco-friendly control agent against H. cunea. HycuNPV-W1 had irregular tetrahedral polyhedra 1.5-2.2 µm in size, similar to those of the previously reported HycuNPV isolated in Korea. Whole viral genome analysis showed that the HycuNPV-W1 genome is 131,353 bp long, 1,606 bp shorter than that of the previously reported HycuNPV. The G+C content was 45% and six homologous repeated regions were found, so there was no significant difference from the previous report. ORF analysis showed that HycuNPV-W1 contains a total of 145 ORFs, three fewer than in the previous report, while two ORFs were found exclusively in HycuNPV-W1. The functions of these ORFs remain unclear, and they are not considered to have a significant influence on the characteristics of HycuNPV. The genome vista analysis showed that the overall sequence identity between HycuNPV-W1 and the previously reported HycuNPV was very high. Although the whole genome of HycuNPV-W1 was found to be similar to that of the previously reported HycuNPV, it is considered a novel resource in Korea as a distinct isolate.

Identification of Causal and/or Rare Genetic Variants for Complex Traits by Targeted Resequencing in Population-based Cohorts

  • Kim, Yun-Kyoung;Hong, Chang-Bum;Cho, Yoon-Shin
    • Genomics & Informatics / v.8 no.3 / pp.131-137 / 2010
  • Genome-wide association studies (GWASs) have greatly contributed to the identification of common variants responsible for numerous complex traits. This approach, however, depends on an LD-based tagging SNP microarray chip and therefore has unavoidable limitations in detecting causal and/or rare variants. In an effort to detect potential causal and/or rare variants for complex traits, such as type 2 diabetes (T2D) and triglycerides (TGs), we conducted targeted resequencing of loci identified by the Korea Association REsource (KARE) GWAS. The target regions for resequencing comprised whole exons, exon-intron boundaries, and regulatory regions of genes located within 1 Mb of the GWA signal boundary. From 124 individuals selected from population-based cohorts, a total of 0.7 Mb of target regions was captured with the NimbleGen sequence capture 385K array. Subsequent sequencing on the Roche 454 Genome Sequencer FLX generated about 110,000 sequence reads per individual. Sequence reads were mapped to the human reference genome using the SSAHA2 program. On average, 62.2% of total reads mapped to targets, with an average of 22-fold coverage. A total of 5,983 SNPs (an average of 846 SNPs per individual) were called and annotated with GATK software, with 96.5% accuracy as estimated by comparison with Affymetrix 5.0 genotype data from the same individuals. About 51% of all SNPs were singletons, which can be considered possible rare variants in the population. Among SNPs located in exons, which account for about 20% of all SNPs, 304 nonsynonymous singletons were tested with Polyphen to predict the protein damage caused by the mutation. In total, we detected 9 and 6 potentially functional rare SNPs for T2D and triglycerides, respectively, which calls for replication genotyping in independent populations to establish their bona fide relevance to the traits.
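
Two of the summary figures in this abstract, the share of singleton SNPs and the accuracy relative to array genotypes, can be illustrated with a short Python sketch. It assumes genotypes are coded as alternate-allele counts (0/1/2, with `None` for a missing call) and defines a singleton as a variant carried by exactly one individual. The coding, the definition, and the function names are assumptions made for illustration, not details taken from the paper.

```python
def find_singletons(calls):
    """Return IDs of variants whose alternate allele is seen in one individual.

    calls: dict mapping a (hypothetical) SNP ID to a list of alternate-allele
    counts per individual; None marks a missing genotype.
    """
    singletons = []
    for snp_id, genos in calls.items():
        # A carrier is any individual with a non-missing, non-zero count.
        carriers = sum(1 for g in genos if g)
        if carriers == 1:
            singletons.append(snp_id)
    return singletons


def concordance(seq_genos, array_genos):
    """Fraction of matching genotypes among sites called on both platforms."""
    compared = 0
    matched = 0
    for s, a in zip(seq_genos, array_genos):
        if s is None or a is None:
            continue  # skip sites missing on either platform
        compared += 1
        if s == a:
            matched += 1
    if compared == 0:
        raise ValueError("no sites called on both platforms")
    return matched / compared
```

Dividing `len(find_singletons(calls))` by `len(calls)` gives the singleton fraction for a call set, and `concordance` applied to one individual's sequencing and array genotypes gives a per-individual accuracy estimate.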

Whole-genome association and genome partitioning revealed variants and explained heritability for total number of teats in a Yorkshire pig population

  • Uzzaman, Md. Rasel;Park, Jong-Eun;Lee, Kyung-Tai;Cho, Eun-Seok;Choi, Bong-Hwan;Kim, Tae-Hun
    • Asian-Australasian Journal of Animal Sciences / v.31 no.4 / pp.473-479 / 2018
  • Objective: The study was designed to perform a genome-wide association (GWA) analysis and genome partitioning using Illumina's PorcineSNP60 BeadChip in order to identify variants and determine the explained heritability for the total number of teats in Yorkshire pigs. Methods: After screening with the criteria minor allele frequency, $\mathrm{MAF} \leq 0.01$, and Hardy-Weinberg equilibrium, $\mathrm{HWE} \leq 0.000001$, a pair-wise genomic relationship matrix was produced using 42,953 single nucleotide polymorphisms (SNPs). A genome-wide mixed linear model-based association analysis (MLMA) was conducted. To estimate the heritability explained by genome- or chromosome-wide SNPs, genetic relatedness estimation through a maximum likelihood approach was used. Results: The MLMA analysis and false discovery rate p-values identified three significant SNPs on two different chromosomes (rs81476910 and rs81405825 on SSC8; rs81332615 on SSC13) for total number of teats. In addition, we estimated that 30% of the variance in the trait could be explained by all of the common SNPs on the autosomes. The largest heritability estimates obtained by partitioning the genome were $0.22 \pm 0.05$, $0.16 \pm 0.05$, $0.10 \pm 0.03$ and $0.08 \pm 0.03$ on SSC7, SSC13, SSC1, and SSC8, respectively. Of these, SSC7 accounted for the largest estimated heritability, together with a SNP (rs80805264) identified by the genome-wide association analysis at an empirical p-value significance level of 2.35E-05. Interestingly, rs80805264 lies near a quantitative trait locus (QTL) on SSC7 for teat number identified in a recent study. Moreover, all other significant SNPs were found within and/or close to QTLs related to ovary weight, total number born alive and age at puberty in pigs. Conclusion: The SNPs we identified unquestionably represent some of the important QTL regions as well as genes of interest in the genome for various physiological functions responsible for reproduction in pigs.
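
The SNP screening step in the Methods can be illustrated with a small Python sketch. It assumes that the stated MAF and HWE values are exclusion thresholds, so a SNP is dropped when its minor allele frequency is at most 0.01 or its Hardy-Weinberg p-value is at most 0.000001. It uses a one-degree-of-freedom chi-square goodness-of-fit test in place of whatever test the authors' software applied. The function names and genotype-count inputs are illustrative.

```python
import math


def minor_allele_frequency(n_aa, n_ab, n_bb):
    """Minor allele frequency from genotype counts (AA, AB, BB)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    return min(p, 1 - p)


def hwe_p_value(n_aa, n_ab, n_bb):
    """Chi-square (1 df) test of Hardy-Weinberg proportions.

    This is a textbook approximation; exact tests are often preferred for
    low-count genotypes.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    if p == 0 or q == 0:
        return 1.0  # monomorphic: no departure can be tested
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


def passes_qc(n_aa, n_ab, n_bb, maf_min=0.01, hwe_min=1e-6):
    """Keep a SNP only if it exceeds both thresholds (assumed exclusion rule)."""
    return (minor_allele_frequency(n_aa, n_ab, n_bb) > maf_min
            and hwe_p_value(n_aa, n_ab, n_bb) > hwe_min)
```

Applying `passes_qc` to the genotype counts of each SNP yields the filtered marker set from which a genomic relationship matrix could then be built.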

Short Reads Phasing to Construct Haplotypes in Genomic Regions That Are Associated with Body Mass Index in Korean Individuals

  • Lee, Kichan;Han, Seonggyun;Tark, Yeonjeong;Kim, Sangsoo
    • Genomics & Informatics / v.12 no.4 / pp.165-170 / 2014
  • Genome-wide association (GWA) studies have found many important genetic variants that affect various traits. Since these studies use linkage disequilibrium (LD) to investigate untyped but causal variants, it would be useful to explore the haplotypes of single-nucleotide polymorphisms (SNPs) within the same LD block as significant associations, based on high-density variants from population references. Here, we built a catalog of haplotypes affecting body mass index (BMI) through an integrative analysis of previously published whole-genome next-generation sequencing (NGS) data from 7 representative Korean individuals and previously known Korean GWA signals. We selected 435 SNPs that were significantly associated with BMI in the GWA analysis and searched 53 LD ranges near those SNPs. With the NGS data, the haplotypes were phased within the LD ranges. A total of 44 possible haplotype blocks for Korean BMI were cataloged. Although the current result is based on little data, this study provides new insights that may help to identify important haplotypes for traits and low variants nearby significant SNPs. Furthermore, a more comprehensive catalog can be built as larger datasets become available.
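
The LD and haplotype concepts this abstract relies on can be illustrated with a brief Python sketch. It assumes phased haplotypes are available as tuples of 0/1 alleles at biallelic sites, computes the standard pairwise r² statistic, and tallies the distinct haplotypes across a set of sites. The data layout and function names are illustrative and do not reproduce the phasing method used in the study.

```python
from collections import Counter


def ld_r2(haplotypes, i, j):
    """Pairwise LD (r^2) between sites i and j from phased haplotypes.

    haplotypes: list of tuples of 0/1 alleles, one tuple per chromosome.
    Returns None when either site is monomorphic, since r^2 is undefined.
    """
    n = len(haplotypes)
    p_a = sum(h[i] for h in haplotypes) / n
    p_b = sum(h[j] for h in haplotypes) / n
    p_ab = sum(h[i] * h[j] for h in haplotypes) / n
    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return None
    return d * d / denom


def haplotype_catalog(haplotypes, sites):
    """Count the distinct haplotypes observed across the given site indices."""
    return Counter(tuple(h[s] for s in sites) for h in haplotypes)
```

With seven diploid individuals there would be 14 phased chromosomes in `haplotypes`. `haplotype_catalog` then lists each observed allele combination in a block together with its count.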

Identification of SNPs Related to 19 Phenotypic Traits Using Genome-wide Association Study (GWAS) Approach in Korean Wheat Mini-core Collection

  • Yuna Kang;Yeonjun Sung;Seonghyeon Kim;Changsoo Kim
    • Proceedings of the Korean Society of Crop Science Conference / 2020.06a / pp.120-120 / 2020
  • A Korean wheat core collection of 616 accessions was established on the basis of simple sequence repeat (SSR) markers. SNP genotyping of the entire genome was then performed on these accessions using a DNA chip array to clarify their whole-genome SNP profiles. As a result, a total of 35,143 SNPs were found, and we re-established a mini-core collection of 247 accessions. Population diversity and phylogenetic analyses revealed the genetic diversity and relationships within the mini-core set. In addition, a genome-wide association study (GWAS) was performed on 19 phenotypic traits: ear type, awn length, culm length, ear length, awn color, seed coat color, culm color, ear color, lodging, leaf length, leaf width, seeding stand, cold damage, weight, auricle, plant type, heading stage, maturation period, upright habit, and degree of flag leaf. The GWAS was performed using the fixed and random model circulating probability unification (FarmCPU) method, which identified 14 to 258 SNP loci related to the 19 phenotypic traits. Our study indicates that this Korean wheat mini-core collection is a set of germplasm useful for basic and applied research aimed at understanding and exploiting the genetic diversity of Korean wheat varieties.