• Title/Summary/Keyword: Genotype Imputation

A New Method for Imputation of Missing Genotype using Linkage Disequilibrium and Haplotype Information (결측치가 존재하는 유전형 자료에서의 연관불균형과 일배체형을 사용한 결측치 대치 방법)

  • Park Yun-Ju;Kim Young-Jin;Park Jung-Sun;Kim Kuchan;Koh Insong;Jung Ho-Youl
    • Journal of KIISE: Software and Applications / v.32 no.2 / pp.99-107 / 2005
  • In this paper, we propose new missing-value imputation methods, one based on linkage disequilibrium and one based on haplotypes, that minimize the loss of information by estimating missing values from the specific characteristics of single nucleotide polymorphism (SNP) genotype data. Imputation is needed to minimize the loss of information caused by experimentally missing data. In general, missing values in biological data have been imputed with the major allele imputation method, but this approach is not optimal: it has high error rates because it does not take the specific structure of genotype data into account. We present a comparative evaluation of our methods and the major allele imputation method for the estimation of missing values.
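
For orientation, the major allele baseline that this abstract compares against can be sketched as follows. This is a minimal illustration and not the authors' code: the genotype encoding (two-character strings such as "AG", with None for a missing call) and the function name `major_allele_impute` are assumptions made for the example.

```python
from collections import Counter

MISSING = None  # assumed marker for a missing genotype call


def major_allele_impute(genotypes):
    """Fill missing calls at one SNP with the homozygous major-allele genotype.

    `genotypes` is a list of two-character strings such as "AA" or "AG",
    with None standing in for a missing call (an assumed encoding).
    """
    allele_counts = Counter()
    for g in genotypes:
        if g is not MISSING:
            allele_counts.update(g)  # counts each allele character
    if not allele_counts:
        return list(genotypes)  # nothing observed, so nothing to impute from
    major = allele_counts.most_common(1)[0][0]
    fill = major * 2
    return [fill if g is MISSING else g for g in genotypes]


# Toy data: allele A is observed 5 times, allele G 3 times.
snp = ["AA", "AG", None, "AA", "GG", None]
imputed = major_allele_impute(snp)
print(imputed)
```

Because every missing call receives the same genotype regardless of neighbouring markers, this baseline ignores linkage disequilibrium and haplotype structure, which is the weakness the abstract points out.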

Accuracy of genotype imputation based on reference population size and marker density in Hanwoo cattle

  • Lee, DooHo;Kim, Yeongkuk;Chung, Yoonji;Lee, Dongjae;Seo, Dongwon;Choi, Tae Jeong;Lim, Dajeong;Yoon, Duhak;Lee, Seung Hwan
    • Journal of Animal Science and Technology / v.63 no.6 / pp.1232-1246 / 2021
  • The cattle genome sequence has recently been completed, followed by the development of commercial single nucleotide polymorphism (SNP) chip panels in the animal genome industry. To increase the statistical power for detecting quantitative trait loci (QTL), a large number of animals should be genotyped. However, using a high-density chip for many animals increases the genotyping cost. Therefore, statistical inference of genotypes by imputation (from a low-density chip to high density) will be useful in the animal industry. The purpose of this study was to investigate the effect of reference population size and marker density on imputation accuracy and to suggest an appropriate reference population size for imputation in Hanwoo cattle. A total of 3,821 Hanwoo cattle were divided into reference and validation populations. The reference sets consisted of 50k (38,916) marker data and different population sizes (500, 1,000, 1,500, 2,000, and 3,600). The validation population consisted of four validation sets (889 animals in total) with different marker densities (5k [5,000], 10k [10,000], and 15k [15,000]). The accuracy of imputation was calculated by direct comparison of the true genotype and the imputed genotype. When the lowest marker density (5k) was used in the validation set, the imputation accuracy ranged from 0.793 to 0.929, depending on the reference population size. When the highest marker density (15k) was used, the imputation accuracy ranged from 0.904 to 0.967, depending on the reference population size. Moreover, the reference population size should be more than 1,000 to obtain at least 88% imputation accuracy in Hanwoo cattle.
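
The accuracy measure named in this abstract, direct comparison of true and imputed genotypes, can be sketched as a simple concordance rate. The 0/1/2 allele-count coding, the list of masked marker indices, and the function name `imputation_accuracy` are assumptions for illustration; the study's actual pipeline is not described here.

```python
def imputation_accuracy(true_genotypes, imputed_genotypes, masked_positions):
    """Fraction of masked markers whose imputed genotype equals the true one.

    Genotypes are assumed to be coded as 0, 1, or 2 copies of one allele.
    `masked_positions` lists the marker indices that were hidden before
    imputation, so only those markers are scored.
    """
    if not masked_positions:
        raise ValueError("no masked positions to evaluate")
    correct = sum(
        1 for i in masked_positions if true_genotypes[i] == imputed_genotypes[i]
    )
    return correct / len(masked_positions)


# Toy animal with eight markers; markers 1, 3, 5, and 7 were masked.
true_g = [0, 1, 2, 1, 0, 2, 1, 1]
imputed_g = [0, 1, 2, 2, 0, 2, 1, 0]
accuracy = imputation_accuracy(true_g, imputed_g, [1, 3, 5, 7])
print(accuracy)
```

Scoring only the masked markers matters: markers that were genotyped on the low-density panel are trivially correct and would inflate the figure.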

Imputation Accuracy from Low to Moderate Density Single Nucleotide Polymorphism Chips in a Thai Multibreed Dairy Cattle Population

  • Jattawa, Danai;Elzo, Mauricio A.;Koonawootrittriron, Skorn;Suwanasopee, Thanathip
    • Asian-Australasian Journal of Animal Sciences / v.29 no.4 / pp.464-470 / 2016
  • The objective of this study was to investigate the accuracy of imputation from low density (LDC) to moderate density SNP chips (MDC) in a Thai Holstein-Other multibreed dairy cattle population. Dairy cattle with complete pedigree information (n = 1,244) from 145 dairy farms were genotyped with GeneSeek GGP20K (n = 570), GGP26K (n = 540) and GGP80K (n = 134) chips. After checking for single nucleotide polymorphism (SNP) quality, 17,779 SNP markers in common between the GGP20K, GGP26K, and GGP80K were used to represent MDC. Animals were divided into two groups, a reference group (n = 912) and a test group (n = 332). The SNP markers chosen for the test group were those located in positions corresponding to GeneSeek GGP9K (n = 7,652). The LDC to MDC genotype imputation was carried out using three different software packages, namely Beagle 3.3 (population-based algorithm), FImpute 2.2 (combined family- and population-based algorithms) and Findhap 4 (combined family- and population-based algorithms). Imputation accuracies within and across chromosomes were calculated as ratios of correctly imputed SNP markers to overall imputed SNP markers. Imputation accuracy for the three software packages ranged from 76.79% to 93.94%. FImpute had higher imputation accuracy (93.94%) than Findhap (84.64%) and Beagle (76.79%). Imputation accuracies were similar and consistent across chromosomes for FImpute, but not for Findhap and Beagle. Most chromosomes that showed either high (73%) or low (80%) imputation accuracies were the same chromosomes that had above and below average linkage disequilibrium (LD; defined here as the correlation between pairs of adjacent SNP within chromosomes less than or equal to 1 Mb apart). Results indicated that FImpute was more suitable than Findhap and Beagle for genotype imputation in this Thai multibreed population. Perhaps additional increments in imputation accuracy could be achieved by increasing the completeness of pedigree information.
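
The LD definition used in this abstract (correlation between pairs of adjacent SNPs no more than 1 Mb apart) can be sketched as below. The abstract does not say whether the signed correlation, its absolute value, or its square was averaged; the absolute value is used here as an assumption, as are the 0/1/2 genotype coding and the function names.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation of two equal-length genotype vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None  # monomorphic SNP: correlation is undefined
    return sxy / sqrt(sxx * syy)


def mean_adjacent_ld(positions, genotypes, max_distance=1_000_000):
    """Mean |r| over adjacent SNP pairs at most `max_distance` bp apart.

    `positions` are sorted base-pair positions on one chromosome and
    `genotypes[i]` holds the 0/1/2 codes of all animals at SNP i.
    """
    values = []
    for i in range(len(positions) - 1):
        if positions[i + 1] - positions[i] <= max_distance:
            r = pearson_r(genotypes[i], genotypes[i + 1])
            if r is not None:
                values.append(abs(r))
    return sum(values) / len(values) if values else None


# Toy chromosome: the first pair is 0.5 Mb apart, the second 1.9 Mb apart.
positions = [100_000, 600_000, 2_500_000]
genotypes = [[0, 1, 2, 1], [0, 1, 2, 1], [2, 1, 0, 1]]
chromosome_ld = mean_adjacent_ld(positions, genotypes)
print(chromosome_ld)
```

In the toy data only the first pair falls within the 1 Mb window, so the second pair does not contribute to the chromosome average.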

Comparison of three boosting methods in parent-offspring trios for genotype imputation using simulation study

  • Mikhchi, Abbas;Honarvar, Mahmood;Kashan, Nasser Emam Jomeh;Zerehdaran, Saeed;Aminafshar, Mehdi
    • Journal of Animal Science and Technology / v.58 no.1 / pp.1.1-1.6 / 2016
  • Background: Genotype imputation is an important process for predicting unknown genotypes; it uses a reference population with dense genotypes to predict missing genotypes for both human and animal genetic variation at low cost. Machine learning methods, especially boosting methods, have been used in genetic studies to explore the underlying genetic profile of disease and to build models capable of predicting missing values of a marker. Methods: In this study, strategies and factors affecting the accuracy of imputation in parent-offspring trios from a lower-density SNP panel (5 K) to a higher-density SNP panel (10 K) were compared using three boosting methods, namely TotalBoost (TB), LogitBoost (LB), and AdaBoost (AB). The methods were applied to simulated data to impute the untyped SNPs in parent-offspring trios. Four datasets were simulated: G1 (100 trios with 5 K SNPs), G2 (100 trios with 10 K SNPs), G3 (500 trios with 5 K SNPs), and G4 (500 trios with 10 K SNPs). In all four datasets, the parents were genotyped completely and the offspring were genotyped with a lower-density panel. Results: Comparison of the three methods showed that LB outperformed AB and TB in imputation accuracy. Computation time differed between methods, and AB was the fastest algorithm. Higher SNP densities increased the accuracy of imputation. A larger number of trios (i.e., 500) improved the performance of LB and TB. Conclusions: The three methods perform well in terms of imputation accuracy, and the dense chip is recommended for imputation in parent-offspring trios.

Association of HLA Genotype and Fulminant Type 1 Diabetes in Koreans

  • Kwak, Soo Heon;Kim, Yoon Ji;Chae, Jeesoo;Lee, Cue Hyunkyu;Han, Buhm;Kim, Jong-Il;Jung, Hye Seung;Cho, Young Min;Park, Kyong Soo
    • Genomics & Informatics / v.13 no.4 / pp.126-131 / 2015
  • Fulminant type 1 diabetes (T1DM) is a distinct subtype of T1DM that is characterized by rapid onset hyperglycemia, ketoacidosis, absolute insulin deficiency, and near normal levels of glycated hemoglobin at initial presentation. Although it has been reported that class II human leukocyte antigen (HLA) genotype is associated with fulminant T1DM, the genetic predisposition is not fully understood. In this study we investigated the HLA genotype and haplotype in 11 Korean cases of fulminant T1DM using imputation of whole exome sequencing data and compared its frequencies with 413 participants of the Korean Reference Panel. The $HLA-DRB1^*04:05-HLA-DQB1^*04:01$ haplotype was significantly associated with increased risk of fulminant T1DM in Fisher's exact test (odds ratio [OR], 4.11; 95% confidence interval [CI], 1.56 to 10.86; p = 0.009). A histidine residue at $HLA-DR{\beta}1$ position 13 was marginally associated with increased risk of fulminant T1DM (OR, 2.45; 95% CI, 1.01 to 5.94; p = 0.054). Although we had limited statistical power, we provide evidence that HLA haplotype and amino acid change can be a genetic risk factor of fulminant T1DM in Koreans. Further large-scale research is required to confirm these findings.

Accuracy of Imputation of Microsatellite Markers from BovineSNP50 and BovineHD BeadChip in Hanwoo Population of Korea

  • Sharma, Aditi;Park, Jong-Eun;Park, Byungho;Park, Mi-Na;Roh, Seung-Hee;Jung, Woo-Young;Lee, Seung-Hwan;Chai, Han-Ha;Chang, Gul-Won;Cho, Yong-Min;Lim, Dajeong
    • Genomics & Informatics / v.16 no.1 / pp.10-13 / 2018
  • Until now, microsatellites (MS) have been a popular choice of marker for parentage verification. Recently, many countries have moved, or are in the process of moving, from MS markers to single nucleotide polymorphism (SNP) markers for parentage testing. FAO-ISAG has also proposed a panel of 200 SNPs to replace MS markers in parentage verification. However, in many countries most animals have so far been genotyped with MS markers, and a sudden shift to SNP markers would render the data of those animals useless. As the National Institute of Animal Science in South Korea plans to move from the standard ISAG-recommended MS markers to SNPs, it faces the dilemma of excluding older animals that were genotyped with MS markers. This study was performed to facilitate the shift from MS to SNPs so that existing animals with MS data can still be used for parentage verification. We imputed MS markers from the SNPs located within 500 kb on either side of each MS marker. This method provides an easy option for laboratories to combine data from old and current sets of animals, and it is a cost-efficient alternative to genotyping the additional markers. We used 1,480 Hanwoo animals with both MS and SNP data to impute MS markers in the validation animals. We also compared the imputation accuracy between the BovineSNP50 and BovineHD BeadChips. Genotype concordances of 40% and 43% were observed for the BovineSNP50 and BovineHD BeadChips, respectively.

A Study on the Correlation between SLC25A26 Polymorphism and Gastritis and Gastric Ulcers in Koreans (한국인의 SLC25A26 유전자 다형성과 위염, 위궤양과의 상관성에 관한 연구)

  • Soyeun PARK;Dahyun HWANG
    • Korean Journal of Clinical Laboratory Science / v.55 no.4 / pp.291-297 / 2023
  • Gastritis is an inflammation of the gastric mucosa, and a gastric ulcer is a break in the mucosa of the stomach lining. Past research on gastritis and gastric ulcers has mainly been conducted from the perspective that environmental factors are the primary cause of these gastric diseases. Recently, however, the importance of genetic factors has been emphasized owing to developments in genetic research. The SLC25A26 gene is believed to be associated with the accumulation of reactive oxygen species. Oxidative stress promotes an inflammatory response, which increases the production of free radicals and causes cellular damage, leading to the development of gastric diseases. In this study, the correlation between SLC25A26 and gastric diseases was analyzed. Polymorphisms in SLC25A26 were analyzed in 1,369 Korean gastric disease patients and 7,471 healthy controls. As a result, 11 genotyped single nucleotide polymorphisms (SNPs) and 13 imputed SNPs showed statistical significance (P<0.05) and a high relative risk of gastric diseases. Among them, the rs13874 allele of SLC25A26 showed a highly significant association with gastric diseases. In the genotype-based mRNA expression analysis, the minor allele (C) group showed increased mRNA expression, which could increase oxidative stress. In conclusion, SLC25A26 polymorphisms are associated with gastric diseases. These results may provide a basis for new guidelines for gastric disease management in the Korean population.

Association Study of NDFIP2 Genetic Polymorphism with Asthma in the Korean Population (한국인에서 NDFIP2 유전적 다형성과 천식의 상관 연구)

  • Choi, Eun Hye;Hwang, Dahyun
    • Korean Journal of Clinical Laboratory Science / v.53 no.3 / pp.249-256 / 2021
  • Asthma is a chronic inflammatory airway disease influenced by many factors, both genetic and environmental. The mitogen-activated protein kinase (MAPK) pathway is involved in maintaining the balance between T helper cells 1 and 2 (Th1/Th2) and plays an important role in the development of asthma. In this study, the correlation between asthma and the NDFIP2 gene, which regulates the MAPK pathway, was analyzed. Genetic polymorphisms of the NDFIP2 gene were compared between 193 asthma patients and 3,228 healthy controls in Korea. As a result, 4 single nucleotide polymorphisms (SNPs) showed a significant correlation (P<0.05) and high relative risk with asthma. Among them, rs2783122 of NDFIP2 showed a statistically significant association with asthma (P-value = $9.76{\times}10^{-6}$, odds ratio (OR) = 1.67, 95% confidence interval (CI) = 1.33 to 2.10). SNP imputation of NDFIP2 identified 16 SNPs, all of which showed a significant correlation with asthma and a high odds ratio. The genotype-based mRNA expression analysis revealed that the minor allele group of rs1408049 showed increased mRNA expression. Increased NDFIP2 expression activates the MAPK pathway, which may influence the development of asthma. In conclusion, polymorphisms of NDFIP2 are associated with asthma development, and this can provide a basis for new guidelines for the management of asthma in the Korean population.
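
To show how an odds ratio and its 95% confidence interval of the kind quoted in this abstract are formed, here is a minimal sketch using a 2x2 allele-count table and a Wald interval on the log scale. The abstract does not state which test or software produced its figures, so this is not the authors' method, and the counts below are invented purely for illustration.

```python
from math import exp, log, sqrt


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: minor alleles in cases      b: major alleles in cases
    c: minor alleles in controls   d: major alleles in controls
    All four counts must be positive. z = 1.96 gives a 95% interval.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, not taken from the study.
or_value, ci_low, ci_high = odds_ratio_wald_ci(40, 60, 20, 80)
print(round(or_value, 2), round(ci_low, 2), round(ci_high, 2))
```

An interval that excludes 1, as the one reported for rs2783122 does, is what supports calling the association statistically significant at the 5% level.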

Whole-genome sequence association study identifies cyclin dependent kinase 8 as a key gene for the number of mummified piglets

  • Pingxian, Wu;Dejuan, Chen;Kai, Wang;Shujie, Wang;Yihui, Liu;Anan, Jiang;Weihang, Xiao;Yanzhi, Jiang;Li, Zhu;Xu, Xu;Xiaotian, Qiu;Xuewei, Li;Guoqing, Tang
    • Animal Bioscience / v.36 no.1 / pp.29-42 / 2023
  • Objective: Pigs, an ideal biomedical model for human diseases, suffer from about 50% early embryonic and fetal death, a major cause of fertility loss worldwide. However, identifying the causal variant remains a huge challenge. This study aimed to detect single nucleotide polymorphisms (SNPs) and candidate genes for the number of mummified (NM) piglets using imputed whole-genome sequence (WGS) data and to validate the potential candidate genes. Methods: The imputed WGS was obtained from genotyping-by-sequencing (GBS) data using a multi-breed reference population. We performed genome-wide association studies (GWAS) for NM piglets at birth in a Landrace pig population. A total of 300 Landrace pigs were genotyped by GBS. The whole-genome variants were imputed, and 4,252,858 SNPs were obtained. Various molecular experiments were conducted to determine how the genes affected NM in pigs. Results: A strong GWAS peak located on SSC11: 0.10 to 7.11 Mbp (top SNP, SSC11:1,889,658 bp; p = 9.98E-13) was identified in the cyclin dependent kinase 8 (CDK8) gene, which plays a crucial role in embryonic retardation and lethality. Based on the molecular experiments, we found that Y-box binding protein 1 (YBX1) was a crucial transcription factor for CDK8, which mediated the effect of CDK8 on the proliferation of porcine ovarian granulosa cells via the transforming growth factor beta/small mother against decapentaplegic signaling pathway and, as a consequence, affected embryo quality, indicating that this pathway may contribute to fetal mummification in pigs. Conclusion: A powerful imputation-based association study was performed to identify genes associated with NM in pigs. CDK8 was suggested as a functional gene for the proliferation of porcine ovarian granulosa cells, but further studies are required to determine causative mutations and the effect of loci on NM in pigs.

Association of Genetic Polymorphism of IL-2 Receptor Subunit and Tuberculosis Case

  • Lee, Sang-In;Jin, Hyun-Seok;Park, Sangjung
    • Biomedical Science Letters / v.24 no.2 / pp.94-101 / 2018
  • Tuberculosis (TB) is an infectious disease caused by Mycobacterium tuberculosis (MTB) infection. It is known to be influenced not only by the properties of the microorganism but also by the genetic susceptibility of the infected patient. Interleukin 2 (IL-2) is a cytokine associated with type 1 T helper (Th1) activity. Upon MTB infection, IL-2 binds the IL-2 receptor, promotes T cell replication, and is involved in granuloma formation. The aim of this study was to investigate genetic polymorphisms of the IL-2 receptor genes in tuberculosis patients and normal individuals. We analyzed 22 SNPs in three genes for their correlation with tuberculosis, using the genotype data of 443 tuberculosis cases and 3,228 healthy controls from the Korea Association Resource. The IL2RA, IL2RB, and IL2RG genes contributed 16, 4, and 2 genotyped SNPs, respectively. Among the three genes, only IL2RA polymorphisms showed a statistically significant association with tuberculosis. Six SNPs with high significance were identified in the IL2RA gene. In addition, the linkage disequilibrium (LD) structure of the IL2RA gene was confirmed. SNP imputation of the IL2RA gene confirmed that additional SNPs differed significantly between cases and controls. These results suggest that genetic polymorphisms in the gene encoding $IL-2R{\alpha}$ may regulate the expression level of $IL-2R{\alpha}$ and alter the immune response in which $IL-2R{\alpha}$ is involved. This study suggests that genetic polymorphisms affecting host immunity may influence susceptibility to tuberculosis.