• Title/Summary/Keyword: genome-wide association studies

Application of Structural Equation Models to Genome-wide Association Analysis

  • Kim, Ji-Young;Namkung, Jung-Hyun;Lee, Seung-Mook;Park, Tae-Sung
    • Genomics & Informatics / v.8 no.3 / pp.150-158 / 2010
  • Genome-wide association studies (GWASs) have become popular approaches to identify genetic variants associated with human biological traits. In this study, we applied structural equation models (SEMs) to model complex relationships between genetic networks and traits as risk factors. SEMs allow us to achieve a better understanding of biological mechanisms by identifying greater numbers of genes and pathways that are associated with a set of traits and the relationships among them. For efficient SEM analysis of GWAS data, we developed a procedure comprising four stages. In the first stage, we conducted single-SNP analysis using regression models, in which age, sex, and recruitment area were included as adjusting covariates. In the second stage, Fisher's combination test was conducted for each gene to detect significant genes using p-values obtained from the single-SNP analysis. In the third stage, Fisher's exact test was adopted to determine which biological pathways were enriched with significant SNPs. Finally, based on a pathway that was associated with all four traits, an SEM was fit to model a causal relationship among the genetic factors and traits. We applied our SEM to GWAS data with four central obesity-related traits: suprailiac and subscapular measures for upper body fat, BMI, and hypertension. Study subjects were recruited from two Korean cohort regions. After quality control, 327,872 SNPs for 8,842 individuals were included in the analysis. After comparing two SEMs, we concluded that suprailiac and subscapular measures may indirectly affect hypertension susceptibility by influencing BMI. In conclusion, our analysis demonstrates that SEMs provide a better understanding of biological mechanisms by identifying greater numbers of genes and pathways.
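The second stage of the procedure described above combines single-SNP p-values into one gene-level p-value. A minimal sketch of Fisher's combination test follows, using only the Python standard library. The function name and the example p-values are hypothetical and are not taken from the paper. Note also that Fisher's method assumes independent p-values, which SNPs in linkage disequilibrium within a gene may not satisfy; the abstract does not say how the authors handled this.

```python
import math


def fisher_combined_pvalue(p_values):
    """Combine k p-values into one p-value with Fisher's method.

    Assumes the p-values are independent (an assumption of the method,
    not something the abstract verifies for SNPs within a gene).
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(p <= 0.0 or p > 1.0 for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    k = len(p_values)
    # Fisher's statistic is X = -2 * sum(ln p_i); under the null hypothesis
    # it follows a chi-square distribution with 2k degrees of freedom.
    half_stat = -sum(math.log(p) for p in p_values)  # this is X / 2

    # For even degrees of freedom the chi-square survival function has a
    # closed form: P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


# Hypothetical single-SNP p-values for one gene.
gene_p = fisher_combined_pvalue([0.05, 0.05])
```

With a single p-value the function returns that p-value unchanged, which is a quick sanity check on the closed-form tail probability.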

EvoSNP-DB: A database of genetic diversity in East Asian populations

  • Kim, Young Uk;Kim, Young Jin;Lee, Jong-Young;Park, Kiejung
    • BMB Reports / v.46 no.8 / pp.416-421 / 2013
  • Genome-wide association studies (GWAS) have become popular as an approach for the identification of large numbers of phenotype-associated variants. However, differences in genetic architecture and environmental factors mean that the effect of variants can vary across populations. Understanding population genetic diversity is valuable for the investigation of possible population-specific and independent effects of variants. EvoSNP-DB aims to provide information regarding genetic diversity among East Asian populations, including Chinese, Japanese, and Korean populations. Non-redundant SNPs (1.6 million) were genotyped in 54 Korean trios (162 samples) and were compared with 4 million SNPs from HapMap phase II populations. EvoSNP-DB provides two user interfaces for data query and visualization, and integrates scores of genetic diversity (Fst and VarLD) at the level of SNPs, genes, and chromosome regions. EvoSNP-DB is a web-based application that allows users to navigate and visualize measurements of population genetic differences in an interactive manner, and is available online at [http://biomi.cdc.go.kr/EvoSNP/].
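To illustrate what a per-SNP Fst score measures, here is a minimal sketch of Wright's Fst for one biallelic SNP in two populations, computed from allele frequencies as (H_T - H_S) / H_T. The abstract does not state which Fst estimator EvoSNP-DB uses, so the formula choice, the function name, and the example frequencies are all assumptions for illustration only.

```python
def fst_two_populations(p1, p2, n1=1.0, n2=1.0):
    """Wright's F_ST for one biallelic SNP in two populations.

    p1, p2: frequency of the same allele in each population.
    n1, n2: relative sample sizes, used only as weights (hypothetical
            parameters; equal weighting by default).
    """
    w1 = n1 / (n1 + n2)
    w2 = 1.0 - w1

    # Expected heterozygosity of the pooled population.
    p_bar = w1 * p1 + w2 * p2
    h_total = 2.0 * p_bar * (1.0 - p_bar)

    # Weighted mean of expected heterozygosity within each population.
    h_within = w1 * 2.0 * p1 * (1.0 - p1) + w2 * 2.0 * p2 * (1.0 - p2)

    if h_total == 0.0:
        # The SNP is monomorphic in both populations: no differentiation.
        return 0.0
    return (h_total - h_within) / h_total


# Hypothetical allele frequencies in two populations.
score = fst_two_populations(0.2, 0.8)
```

Identical frequencies give 0 and alternately fixed alleles give 1, so the score ranges from no differentiation to complete differentiation.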

Recapitulation of Candidate Systemic Lupus Erythematosus-Associated Variants in Koreans

  • Kwon, Ki-Sung;Cho, Hye-Young;Chung, Yeun-Jun
    • Genomics & Informatics / v.14 no.3 / pp.85-89 / 2016
  • Systemic lupus erythematosus (SLE) is a chronic autoimmune disease that affects multiple organ systems. Although the etiology of SLE remains unclear, it is widely accepted that genetic factors could be involved in its pathogenesis. A number of genome-wide association studies (GWASs) have identified novel single-nucleotide polymorphisms (SNPs) associated with the risk of SLE in diverse populations. However, not all the SNP candidates identified from non-Asian populations have been validated in Koreans. In this study, we aimed to replicate SNPs that were recently discovered in GWASs and that either have not been validated in Koreans or have been replicated in Koreans only with a sample size insufficient to conclude any association. For this, we selected five SNPs (rs1801274 in FCGR2A, rs2286672 in PLD2, rs887369 in CXorf21, rs9782955 in LYST, and rs3794060 in NADSYN1). Through the replication study with 656 cases and 622 controls, rs1801274 in FCGR2A was found to be significantly associated with SLE in Koreans (odds ratio, 1.26; 95% confidence interval, 1.06 to 1.50; p = 0.01 in the allelic model). This association was also significant in two other models (dominant and recessive). The other four SNPs did not show a significant association. Our data support the view that FCGR polymorphisms play important roles in susceptibility to SLE in diverse populations, including Koreans.
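The allelic-model odds ratio and confidence interval reported above are computed from a 2x2 table of allele counts. The sketch below shows the standard calculation, with the interval obtained on the log scale (Woolf's method). The abstract does not report the underlying allele counts or the exact interval method, so the function name, the choice of Woolf's method, and the counts in the example are hypothetical.

```python
import math


def allelic_odds_ratio(case_risk, case_other, ctrl_risk, ctrl_other, z=1.96):
    """Odds ratio and approximate 95% CI from allele counts.

    case_risk / case_other: risk and non-risk allele counts in cases.
    ctrl_risk / ctrl_other: risk and non-risk allele counts in controls.
    z: normal quantile for the interval (1.96 gives about 95%).
    All four counts must be positive.
    """
    odds_ratio = (case_risk * ctrl_other) / (case_other * ctrl_risk)

    # Standard error of ln(OR) from the four cell counts (Woolf's method).
    se_log_or = math.sqrt(
        1.0 / case_risk + 1.0 / case_other + 1.0 / ctrl_risk + 1.0 / ctrl_other
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical allele counts; NOT the counts from the study.
or_value, ci_low, ci_high = allelic_odds_ratio(20, 80, 10, 90)
```

An interval that excludes 1, as in the reported 1.06 to 1.50, corresponds to a significant association at the matching alpha level.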

Sample Size and Statistical Power Calculation in Genetic Association Studies

  • Hong, Eun-Pyo;Park, Ji-Wan
    • Genomics & Informatics / v.10 no.2 / pp.117-122 / 2012
  • A sample size with sufficient statistical power is critical to the success of genetic association studies that aim to detect causal genes of human complex diseases. Genome-wide association studies require much larger sample sizes to achieve adequate statistical power. We estimated the statistical power with increasing numbers of markers analyzed and compared the sample sizes that were required in case-control studies and case-parent studies. We computed the effective sample size and statistical power using Genetic Power Calculator. An analysis using a larger number of markers requires a larger sample size. Testing one single-nucleotide polymorphism (SNP) marker requires 248 cases, while testing 500,000 SNPs and 1 million markers requires 1,206 cases and 1,255 cases, respectively, under the assumption of an odds ratio of 2, 5% disease prevalence, 5% minor allele frequency, complete linkage disequilibrium (LD), a 1:1 case/control ratio, and a 5% error rate in an allelic test. Under a dominant model, a smaller sample size is required to achieve 80% power than under other genetic models. We found that a much smaller sample size was required with a strong effect size, a common SNP, and increased LD. In addition, studying a common disease in a case-control study with a 1:4 case-control ratio is one way to achieve higher statistical power. We also found that case-parent studies require more samples than case-control studies. Although we have not covered all plausible cases in study design, the estimates of sample size and statistical power computed under various assumptions in this study may be useful for determining the sample size when designing a population-based genetic association study.
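The dependence of sample size on the number of markers can be sketched with a textbook normal approximation for comparing two allele frequencies, combined with a Bonferroni-corrected significance level. This is not the Genetic Power Calculator algorithm and is not expected to reproduce the figures quoted above. The function name, the reading of the 5% error rate as a family-wise alpha, and the treatment of 5% as the control allele frequency are all assumptions.

```python
import math
from statistics import NormalDist


def cases_needed(maf_controls, odds_ratio, n_tests=1,
                 family_alpha=0.05, power=0.80):
    """Approximate number of cases for an allelic test, 1:1 case/control.

    Uses the two-proportion normal approximation on allele counts and a
    Bonferroni correction (alpha = family_alpha / n_tests). A rough sketch,
    not a substitute for a dedicated power calculator.
    """
    alpha = family_alpha / n_tests
    p0 = maf_controls
    # Risk-allele frequency in cases implied by the allelic odds ratio.
    p1 = odds_ratio * p0 / (1.0 + p0 * (odds_ratio - 1.0))
    p_bar = (p0 + p1) / 2.0

    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # two-sided threshold
    z_beta = NormalDist().inv_cdf(power)

    numerator = (
        z_alpha * math.sqrt(2.0 * p_bar * (1.0 - p_bar))
        + z_beta * math.sqrt(p0 * (1.0 - p0) + p1 * (1.0 - p1))
    ) ** 2
    alleles_per_group = numerator / (p1 - p0) ** 2
    # Each individual contributes two alleles.
    return math.ceil(alleles_per_group / 2.0)


single = cases_needed(0.05, 2.0)
half_million = cases_needed(0.05, 2.0, n_tests=500_000)
one_million = cases_needed(0.05, 2.0, n_tests=1_000_000)
```

Even in this simplified form, the qualitative findings of the abstract hold: more markers require more cases, while a larger odds ratio or a more common SNP requires fewer.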

Evaluation of Single Nucleotide Polymorphisms (SNPs) Genotyped by the Illumina Bovine SNP50K in Cattle Focusing on Hanwoo Breed

  • Dadi, Hailu;Kim, Jong-Joo;Yoon, Du-Hak;Kim, Kwan-Suk
    • Asian-Australasian Journal of Animal Sciences / v.25 no.1 / pp.28-32 / 2012
  • In the present study, we evaluated the informativeness of SNPs genotyped by the Illumina Bovine SNP50K assay in different cattle breeds. To investigate these on a genome-wide scale, we considered 52,678 SNPs spanning all autosomes and the X chromosome in cattle. Our study samples consist of six different cattle breeds. Across the breeds, approximately 72% and 6% of SNPs were found to be polymorphic and fixed or close to fixation in all the breeds, respectively. The average minor allele frequency (MAF) differed significantly between the breeds studied. The average MAF observed in Hanwoo was significantly lower than in the other breeds. The Hanwoo breed also displayed the lowest number of polymorphic SNPs across all the chromosomes. More importantly, this study indicated that the Bovine SNP50K assay will have reduced power for genome-wide association studies in Hanwoo compared with other cattle breeds. Overall, the Bovine SNP50K assay described in this study offers a useful genotyping platform for mapping quantitative trait loci (QTLs) in the cattle breeds. The assay data represent a vast and generally untapped resource to assist the investigation of complex production traits and the development of marker-assisted selection programs.
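The MAF-based comparison described above can be sketched as follows: compute the minor allele frequency of a SNP from genotype counts in each breed, then label the SNP by whether it is polymorphic or near fixation in every breed. The abstract does not give the cut-off used for "fixed or close to fixation", so the `near_fixed_threshold` value, the function names, and the example data are hypothetical.

```python
def minor_allele_frequency(n_aa, n_ab, n_bb):
    """MAF of a biallelic SNP from genotype counts (AA, AB, BB)."""
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    if total_alleles == 0:
        raise ValueError("no genotyped animals")
    freq_a = (2 * n_aa + n_ab) / total_alleles
    # The minor allele is whichever of A or B is rarer.
    return min(freq_a, 1.0 - freq_a)


def classify_across_breeds(maf_by_breed, near_fixed_threshold=0.01):
    """Label a SNP from its per-breed MAFs.

    near_fixed_threshold is an assumed cut-off, not taken from the paper.
    """
    mafs = list(maf_by_breed.values())
    if all(m > near_fixed_threshold for m in mafs):
        return "polymorphic in all breeds"
    if all(m <= near_fixed_threshold for m in mafs):
        return "fixed or near fixation in all breeds"
    return "mixed"


# Hypothetical genotype counts for one SNP in two breeds.
mafs = {
    "Hanwoo": minor_allele_frequency(90, 9, 1),
    "Other breed": minor_allele_frequency(40, 45, 15),
}
label = classify_across_breeds(mafs)
```

Averaging the per-SNP MAF values within a breed would give the breed-level average MAF that the abstract compares.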

Gene Expression Analysis and Polymorphism Discovery to Investigate Drought Responsive System in Tropical Maize

  • Song, Kitae;Kim, Hyo Chul;Kim, Kyung-Hee;Moon, Jun-Cheol;Kim, Jae Yoon;Lee, Sang-Kyu;Lee, Byung-Moo
    • Plant Breeding and Biotechnology / v.6 no.4 / pp.354-362 / 2018
  • Maize has high food and industrial value but is difficult to study because of its large and complex genome. A nested association mapping (NAM) population was constructed to better understand maize genetics. However, most studies were conducted using the reference genome B73, and only a few studies were conducted on tropical maize. Ki3, one of the founder lines of the NAM population, is a tropical maize. We analyzed the genetic characteristics of Ki3 by using RNA sequencing and bioinformatics tools for various genetic studies. As a result, a total of 30,526 genes were found to be expressed, and expression profiles were constructed. A total of 1,558 genes were differentially expressed in response to drought stress, of which 513 were contigs from de novo assemblies. In addition, high-density polymorphisms, including 464,930 single nucleotide polymorphisms (SNPs), 21,872 multiple nucleotide polymorphisms (MNPs), and 93,313 insertions and deletions (InDels), were found relative to the reference genome. Among them, 15.0% of the polymorphisms (87,838) passed the non-synonymous test, meaning that they could alter amino acid sequences. These variants comprised 66,550 SNPs, 5,853 MNPs, and 14,801 InDels, and the proportion of homozygous variants was higher than that of heterozygous variants. These variants were found in a total of 15,643 genes. Of these genes, 637 were differentially expressed genes (DEGs) under drought stress. Our results provide a genome-wide analysis of differentially expressed genes and information on variants in expressed genes of tropical maize under drought stress. Further characterization of these changes in genetic regulation and genetic traits will be of great value for the improvement of maize genetics.

Identification of growth trait related genes in a Yorkshire purebred pig population by genome-wide association studies

  • Meng, Qingli;Wang, Kejun;Liu, Xiaolei;Zhou, Haishen;Xu, Li;Wang, Zhaojun;Fang, Meiying
    • Asian-Australasian Journal of Animal Sciences / v.30 no.4 / pp.462-469 / 2017
  • Objective: The aim of this study is to identify genomic regions or genes controlling growth traits in pigs. Methods: Using a panel of 54,148 single nucleotide polymorphisms (SNPs), we performed a genome-wide association (GWA) study in 562 pure Yorkshire pigs with four growth traits: average daily gain from 30 kg to 100 kg or 115 kg, and days to 100 kg or 115 kg. The Fixed and random model Circulating Probability Unification (FarmCPU) method was used to identify the associations between the 54,148 SNPs and these four traits. SNP annotations were performed through the Sus scrofa data set from Ensembl. Bioinformatics analysis, including gene ontology analysis, pathway analysis, and network analysis, was used to identify the candidate genes. Results: We detected 6 significant and 12 suggestive SNPs, and identified 9 candidate genes in close proximity to them (suppressor of glucose by autophagy [SOGA1], R-Spondin 2 [RSPO2], mitogen activated protein kinase kinase 6 [MAP2K6], phospholipase C beta 1 [PLCB1], Rho GTPase activating protein 24 [ARHGAP24], cytoplasmic polyadenylation element binding protein 4 [CPEB4], GLI family zinc finger 2 [GLI2], neuronal tyrosine-phosphorylated phosphoinositide-3-kinase adaptor 2 [NYAP2], and zinc finger protein multitype 2 [ZFPM2]). Gene ontology analysis and literature mining indicated that the candidate genes are involved in bone, muscle, fat, and lung development. Pathway analysis revealed that PLCB1 and MAP2K6 participate in the gonadotropin signaling pathway, which suggests that these two genes contribute to growth at the onset of puberty. Conclusion: Our results provide new clues for understanding the genetic mechanisms underlying growth traits, and may help improve these traits in future breeding programs.

Comparison of Genome-wide Association Study (GWAS) Algorithms for Detecting Genetic Variants Associated with Growth Traits in Olive Flounder Paralichthys olivaceus

  • Sangwon Yoon;Heegun Lee;Jong-Won Park;Minhwan Jeong;Dain Lee;Hyo Sun Jung;Julan Kim;Hye-Rim Yang;Seung Hwan Lee;Jeong-Ho Lee
    • Korean Journal of Fisheries and Aquatic Sciences / v.56 no.4 / pp.411-418 / 2023
  • Genome-wide association studies (GWAS) identify genetic loci associated with quantitative traits in genomic selection. Although several studies have compared the performance of various algorithms, none has compared them in olive flounder Paralichthys olivaceus. This study compared the GWAS results of four mixed linear model (MLM) algorithms and one Fixed and random model Circulating Probability Unification (FarmCPU) algorithm in olive flounder. With gender and genetic association matrices considered as fixed and random effects, respectively, the MLM algorithms performed stably, with no inflation in the genomic inflation factor (λGC) of -log10P. The FarmCPU algorithm gave generally appropriate λGC values of -log10P, and an upward tail was identified in quantile-quantile plots. Therefore, the models were suitable for detecting genetic variants associated with olive flounder growth traits. Moreover, significant genotypes appeared several times on chromosome 22, around which quantitative trait loci are expected to exist. Finally, in both models, some of the most significant genetic variants were found in genes related to growth traits, confirming their reliability. These results will be helpful when applied to the genomic selection of olive flounder growth traits in the future.
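The genomic inflation factor λGC mentioned above is conventionally the median of the observed association test statistics divided by the median expected under the null hypothesis. The sketch below uses the common one-degree-of-freedom chi-square form, converting p-values back to chi-square statistics with the standard library. The abstract does not specify how the authors computed λGC, so this form, the function name, and the example p-values are assumptions.

```python
from statistics import NormalDist, median


def genomic_inflation_factor(p_values):
    """Estimate lambda_GC from GWAS p-values (each in (0, 1]).

    Assumes each p-value comes from a 1-degree-of-freedom chi-square test.
    """
    nd = NormalDist()
    # A 1-df chi-square statistic is a squared standard normal, so the
    # statistic for a two-sided p-value p is (Phi^-1(p / 2))^2.
    chi2_stats = [nd.inv_cdf(p / 2.0) ** 2 for p in p_values]

    # Median of the 1-df chi-square distribution (about 0.455).
    expected_median = nd.inv_cdf(0.75) ** 2
    return median(chi2_stats) / expected_median


# Hypothetical p-values spread evenly over (0, 1): no inflation expected.
uniform_p = [i / 1000 for i in range(1, 1000)]
lam = genomic_inflation_factor(uniform_p)
```

Values near 1 indicate no inflation, while values well above 1 suggest confounding such as population structure, which is the behavior the abstract checks alongside the quantile-quantile plots.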

Replication of the Association between Copy Number Variation on 8p23.1 and Autism by Using ASD-specific BAC Array

  • Woo, Jung-Hoon;Yang, Song-Ju;Yim, Seon-Hee;Hu, Hae-Jin;Shin, Myung-Ju;Oh, Eun-Hee;Kang, Hyun-Woong;Park, Seon-Yang;Chung, Yeun-Jun
    • Genomics & Informatics / v.8 no.1 / pp.19-27 / 2010
  • To discover genetic markers for autism spectrum disorder (ASD), we previously applied genome-wide BAC array comparative genomic hybridization (array-CGH) to 28 autistic patients and 62 normal controls in a Korean population, and found that chromosomal losses on 8p23.1 and on 17p11.2 are significantly associated with autism. In this study, we developed an 8.5K ASD-specific BAC array covering 27 previously reported ASD-associated CNV loci, including ours, and examined whether the associations would be replicated in 8 ASD patient cell lines from four different ethnic groups and 10 Korean normal controls. As a result, a CNV loss on 8p23.1 was found to be significantly more frequent in patients regardless of ethnicity (p<0.0001). This CNV region contains two coding genes, DEFA1 and DEFA3, which are members of the DEFENSIN gene family. Two other CNVs, on 17p11.2 and Xp22.31, were also distributed differently between ASD patients and controls, but the differences were not significant (p=0.069 and 0.092, respectively). None of the other loci showed a significant association. Considering this evidence, the association between ASD and CNV of the DEFENSIN genes seems worthy of further exploration to elucidate the pathogenesis of ASD. Validation studies with a larger sample size will be required to verify its biological implication.