• Title/Abstract/Keywords: Genotype Imputation

Search results: 11 items; processing time: 0.031 s

A New Method for Imputation of Missing Genotype using Linkage Disequilibrium and Haplotype Information

  • 박윤주;김영진;박정선;김규찬;고인송;정호열
    • Journal of KIISE: Software and Applications
    • /
    • Vol. 32, No. 2
    • /
    • pp.99-107
    • /
    • 2005
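  • This paper presents two imputation methods for missing values in genotype data such as single nucleotide polymorphisms (SNPs): linkage disequilibrium-based imputation and haplotype-based imputation. Both take the specific characteristics of genotype data into account in order to minimize the loss of the original information. Such imputation is needed because it minimizes the loss of important information caused by missing values that arise during experiments. Until now, missing values in biological data have mostly been handled by major allele imputation, but when this method is applied to genotype data, the characteristics of the data can lead to a high error rate for the imputed values and thus reduce the reliability of the data. Using simulations of SNP genotype data, this paper compares the existing major allele imputation with the proposed linkage disequilibrium-based and haplotype-based imputation methods and reports the results.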
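The contrast between the two kinds of method can be illustrated with a minimal Python sketch. It is not the algorithm from the paper: the 0/1/2 genotype coding, the reading of major allele imputation as "fill with the most frequent observed genotype", the choice of r² as the LD measure, and all function names are assumptions made for illustration.

```python
from collections import Counter

def most_common(values):
    """Return the most frequent value; ties go to the smallest value."""
    counts = Counter(values)
    return min(counts, key=lambda g: (-counts[g], g))

def major_allele_impute(column):
    """Fill missing entries (None) with the most frequent observed genotype."""
    fill = most_common([g for g in column if g is not None])
    return [fill if g is None else g for g in column]

def r_squared(x, y):
    """Squared Pearson correlation over individuals observed at both SNPs."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        return 0.0
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy * sxy / (sxx * syy)

def ld_impute(target, others):
    """Fill missing target genotypes from the SNP in highest LD with the target."""
    best = max(others, key=lambda col: r_squared(target, col))
    fallback = most_common([g for g in target if g is not None])
    result = []
    for i, g in enumerate(target):
        if g is not None:
            result.append(g)
            continue
        # Target genotypes of individuals who share this individual's
        # genotype at the best-linked SNP.
        matches = [t for t, b in zip(target, best)
                   if t is not None and b is not None and b == best[i]]
        result.append(most_common(matches) if matches else fallback)
    return result

# Hypothetical toy data: genotypes coded as minor-allele counts, None = missing.
snp_a = [0, 0, 0, 2, 2, 1, 0, None]
snp_b = [0, 0, 0, 2, 2, 1, 0, 2]   # in perfect LD with snp_a
snp_c = [1, 0, 2, 0, 1, 1, 0, 0]   # weakly related to snp_a

print(major_allele_impute(snp_a)[-1])        # 0
print(ld_impute(snp_a, [snp_b, snp_c])[-1])  # 2
```

In this toy example the major allele rule fills the gap with the common genotype 0, whereas the LD-based rule uses the linked SNP and fills in 2, which is the kind of difference the simulation in the paper is meant to quantify.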

Accuracy of genotype imputation based on reference population size and marker density in Hanwoo cattle

  • Lee, DooHo;Kim, Yeongkuk;Chung, Yoonji;Lee, Dongjae;Seo, Dongwon;Choi, Tae Jeong;Lim, Dajeong;Yoon, Duhak;Lee, Seung Hwan
    • Journal of Animal Science and Technology
    • /
    • Vol. 63, No. 6
    • /
    • pp.1232-1246
    • /
    • 2021
  • Recently, the cattle genome sequence has been completed, followed by the development of commercial single nucleotide polymorphism (SNP) chip panels in the animal genome industry. To increase the statistical power for detecting quantitative trait loci (QTL), a large number of animals should be genotyped. However, using a high-density chip for many animals increases the genotyping cost. Therefore, genotype imputation, the statistical inference of high-density genotypes from a low-density chip, will be useful in the animal industry. The purpose of this study was to investigate the effect of reference population size and marker density on imputation accuracy and to suggest an appropriate reference population size for imputation in Hanwoo cattle. A total of 3,821 Hanwoo cattle were divided into reference and validation populations. The reference sets consisted of 50k (38,916) marker data and different population sizes (500, 1,000, 1,500, 2,000, and 3,600). The validation population comprised four validation sets (889 in total) with different marker densities (5k [5,000], 10k [10,000], and 15k [15,000]). The accuracy of imputation was calculated by direct comparison of the true genotype and the imputed genotype. When the lowest marker density (5k) was used in the validation set, the imputation accuracy ranged from 0.793 to 0.929 depending on the reference population size. With the highest marker density (15k), it ranged from 0.904 to 0.967. Moreover, the reference population size should be more than 1,000 to obtain at least 88% imputation accuracy in Hanwoo cattle.
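The accuracy measure described above, a direct comparison of true and imputed genotypes, can be sketched as a simple concordance rate. This is a minimal Python illustration; the function name, the 0/1/2 genotype coding, and the toy data are assumptions, not details from the study.

```python
def concordance(true_genotypes, imputed_genotypes):
    """Fraction of genotype calls where the imputed value equals the true value."""
    if len(true_genotypes) != len(imputed_genotypes):
        raise ValueError("genotype lists must have the same length")
    if not true_genotypes:
        raise ValueError("at least one genotype is required")
    matches = sum(t == i for t, i in zip(true_genotypes, imputed_genotypes))
    return matches / len(true_genotypes)

# Hypothetical masked markers for one validation animal (minor-allele counts).
true_calls    = [0, 1, 2, 1, 0, 2, 1, 0, 0, 1]
imputed_calls = [0, 1, 2, 0, 0, 2, 1, 0, 1, 1]

print(concordance(true_calls, imputed_calls))  # 0.8
```

In a validation design like the one in the abstract, the compared positions would be the markers that were masked in the validation animals and then imputed from the reference set.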

Imputation Accuracy from Low to Moderate Density Single Nucleotide Polymorphism Chips in a Thai Multibreed Dairy Cattle Population

  • Jattawa, Danai;Elzo, Mauricio A.;Koonawootrittriron, Skorn;Suwanasopee, Thanathip
    • Asian-Australasian Journal of Animal Sciences
    • /
    • Vol. 29, No. 4
    • /
    • pp.464-470
    • /
    • 2016
  • The objective of this study was to investigate the accuracy of imputation from low-density (LDC) to moderate-density SNP chips (MDC) in a Thai Holstein-Other multibreed dairy cattle population. Dairy cattle with complete pedigree information (n = 1,244) from 145 dairy farms were genotyped with GeneSeek GGP20K (n = 570), GGP26K (n = 540), and GGP80K (n = 134) chips. After checking for single nucleotide polymorphism (SNP) quality, 17,779 SNP markers in common between the GGP20K, GGP26K, and GGP80K were used to represent MDC. Animals were divided into two groups, a reference group (n = 912) and a test group (n = 332). The SNP markers chosen for the test group were those located in positions corresponding to GeneSeek GGP9K (n = 7,652). The LDC to MDC genotype imputation was carried out using three software packages, namely Beagle 3.3 (population-based algorithm), FImpute 2.2 (combined family- and population-based algorithms), and Findhap 4 (combined family- and population-based algorithms). Imputation accuracies within and across chromosomes were calculated as ratios of correctly imputed SNP markers to overall imputed SNP markers. Imputation accuracy for the three software packages ranged from 76.79% to 93.94%. FImpute had higher imputation accuracy (93.94%) than Findhap (84.64%) and Beagle (76.79%). Imputation accuracies were similar and consistent across chromosomes for FImpute, but not for Findhap and Beagle. Most chromosomes that showed either high (73%) or low (80%) imputation accuracies were the same chromosomes that had above- and below-average linkage disequilibrium, respectively (LD; defined here as the correlation between pairs of adjacent SNPs within chromosomes less than or equal to 1 Mb apart). Results indicated that FImpute was more suitable than Findhap and Beagle for genotype imputation in this Thai multibreed population. Additional gains in imputation accuracy could perhaps be achieved by increasing the completeness of pedigree information.
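The LD summary used in this abstract, the correlation between adjacent SNP pairs no more than 1 Mb apart, could be computed per chromosome roughly as follows. This is a hedged Python sketch: the abstract says only "correlation", so the use of the squared Pearson correlation (r²) is an assumption, as are the function names and the toy data.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation of two equal-length genotype vectors (0.0 if constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def mean_adjacent_ld(positions, genotypes, max_gap=1_000_000):
    """Mean r^2 over adjacent SNP pairs at most `max_gap` bp apart.

    positions: base-pair positions on one chromosome, sorted ascending.
    genotypes: one genotype vector (minor-allele counts) per SNP, same order.
    """
    values = []
    for k in range(len(positions) - 1):
        if positions[k + 1] - positions[k] <= max_gap:
            r = pearson_r(genotypes[k], genotypes[k + 1])
            values.append(r * r)
    return sum(values) / len(values) if values else float("nan")

# Hypothetical chromosome with four SNPs genotyped in six animals.
positions = [100_000, 400_000, 2_000_000, 2_300_000]
genotypes = [
    [0, 1, 2, 1, 0, 2],
    [0, 1, 2, 1, 0, 2],
    [2, 1, 0, 1, 2, 0],
    [0, 1, 2, 1, 0, 2],
]
print(mean_adjacent_ld(positions, genotypes))  # 1.0
```

The middle pair is skipped because its SNPs are 1.6 Mb apart; chromosomes could then be compared against the genome-wide average of this value.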

Comparison of three boosting methods in parent-offspring trios for genotype imputation using simulation study

  • Mikhchi, Abbas;Honarvar, Mahmood;Kashan, Nasser Emam Jomeh;Zerehdaran, Saeed;Aminafshar, Mehdi
    • Journal of Animal Science and Technology
    • /
    • Vol. 58, No. 1
    • /
    • pp.1.1-1.6
    • /
    • 2016
  • Background: Genotype imputation is an important process for predicting unknown genotypes; it uses a reference population with dense genotypes to predict missing genotypes for both human and animal genetic variations at low cost. Machine learning methods, especially boosting methods, have been used in genetic studies to explore the underlying genetic profile of disease and to build models capable of predicting missing values of a marker. Methods: In this study, strategies and factors affecting imputation accuracy in parent-offspring trios were compared when imputing from a lower-density SNP panel (5 K) to a high-density SNP panel (10 K) using three boosting methods, namely TotalBoost (TB), LogitBoost (LB), and AdaBoost (AB). The methods were applied to simulated data to impute the untyped SNPs in parent-offspring trios. Four datasets were simulated: G1 (100 trios with 5 K SNPs), G2 (100 trios with 10 K SNPs), G3 (500 trios with 5 K SNPs), and G4 (500 trios with 10 K SNPs). In all four datasets, the parents were genotyped completely and the offspring were genotyped with a lower-density panel. Results: Comparison of the three methods showed that LB outperformed AB and TB in imputation accuracy. Computation time differed between methods, and AB was the fastest algorithm. Higher SNP densities increased the accuracy of imputation. A larger number of trios (i.e., 500) improved the performance of LB and TB. Conclusions: The three methods perform well in terms of imputation accuracy, and the dense chip is recommended for imputation of parent-offspring trios.

Association of HLA Genotype and Fulminant Type 1 Diabetes in Koreans

  • Kwak, Soo Heon;Kim, Yoon Ji;Chae, Jeesoo;Lee, Cue Hyunkyu;Han, Buhm;Kim, Jong-Il;Jung, Hye Seung;Cho, Young Min;Park, Kyong Soo
    • Genomics & Informatics
    • /
    • Vol. 13, No. 4
    • /
    • pp.126-131
    • /
    • 2015
  • Fulminant type 1 diabetes (T1DM) is a distinct subtype of T1DM that is characterized by rapid onset hyperglycemia, ketoacidosis, absolute insulin deficiency, and near normal levels of glycated hemoglobin at initial presentation. Although it has been reported that class II human leukocyte antigen (HLA) genotype is associated with fulminant T1DM, the genetic predisposition is not fully understood. In this study we investigated the HLA genotype and haplotype in 11 Korean cases of fulminant T1DM using imputation of whole exome sequencing data and compared its frequencies with 413 participants of the Korean Reference Panel. The HLA-DRB1*04:05-HLA-DQB1*04:01 haplotype was significantly associated with increased risk of fulminant T1DM in Fisher's exact test (odds ratio [OR], 4.11; 95% confidence interval [CI], 1.56 to 10.86; p = 0.009). A histidine residue at HLA-DRβ1 position 13 was marginally associated with increased risk of fulminant T1DM (OR, 2.45; 95% CI, 1.01 to 5.94; p = 0.054). Although we had limited statistical power, we provide evidence that HLA haplotype and amino acid change can be a genetic risk factor of fulminant T1DM in Koreans. Further large-scale research is required to confirm these findings.
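For readers unfamiliar with the test named above, a two-sided Fisher's exact test on a 2x2 table can be sketched in plain Python. The abstract does not report the underlying haplotype counts, so the table below is hypothetical and is not the study data; the function name is likewise an assumption.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def weight(k):
        # Number of ways to place k of the col1 items in row 1.
        return comb(row1, k) * comb(row2, col1 - k)

    observed = weight(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    extreme = sum(w for w in map(weight, range(lowest, highest + 1))
                  if w <= observed)
    return extreme / comb(n, col1)

# Hypothetical counts: rows = cases/controls, columns = haplotype present/absent.
print(fisher_exact_two_sided(3, 1, 1, 3))
```

Comparing integer weights rather than floating-point probabilities avoids tolerance issues when deciding which tables count as at least as extreme as the observed one.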

Accuracy of Imputation of Microsatellite Markers from BovineSNP50 and BovineHD BeadChip in Hanwoo Population of Korea

  • Sharma, Aditi;Park, Jong-Eun;Park, Byungho;Park, Mi-Na;Roh, Seung-Hee;Jung, Woo-Young;Lee, Seung-Hwan;Chai, Han-Ha;Chang, Gul-Won;Cho, Yong-Min;Lim, Dajeong
    • Genomics & Informatics
    • /
    • Vol. 16, No. 1
    • /
    • pp.10-13
    • /
    • 2018
  • Until now, microsatellites (MS) have been a popular choice of marker for parentage verification. Recently, many countries have moved or are in the process of moving from MS markers to single nucleotide polymorphism (SNP) markers for parentage testing. FAO-ISAG has also proposed a panel of 200 SNPs to replace MS markers in parentage verification. However, in many countries most animals have so far been genotyped with MS markers, and a sudden shift to SNP markers would render the data of those animals useless. As the National Institute of Animal Science in South Korea plans to move from the standard ISAG-recommended MS markers to SNPs, it faces the dilemma of excluding older animals that were genotyped with MS markers. This study was performed to facilitate the shift from MS to SNPs such that existing animals with MS data can still be used for parentage verification. We imputed MS markers from the SNPs located within 500 kb on either side of each MS marker. This method offers laboratories an easy way to combine data from the old and current sets of animals and is a cost-efficient replacement for genotyping additional markers. We used 1,480 Hanwoo animals with both MS and SNP data to impute MS markers in the validation animals. We also compared the imputation accuracy between the BovineSNP50 and BovineHD BeadChips. Genotype concordance of 40% and 43% was observed for the BovineSNP50 and BovineHD BeadChip, respectively.
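The marker selection step described above, taking the SNPs within 500 kb on either side of a microsatellite, can be sketched as a simple filter. This is a minimal Python illustration; the SNP names, positions, tuple layout, and function name are hypothetical and not taken from the study.

```python
def flanking_snps(snps, chrom, ms_position, window=500_000):
    """Return names of SNPs on `chrom` within `window` bp of the MS marker.

    snps: iterable of (name, chromosome, position_bp) tuples.
    """
    return [name for name, snp_chrom, pos in snps
            if snp_chrom == chrom and abs(pos - ms_position) <= window]

# Hypothetical SNP map and a microsatellite at 1.5 Mb on chromosome 1.
snp_map = [
    ("snp1", 1, 1_000_000),
    ("snp2", 1, 1_450_000),
    ("snp3", 1, 2_100_000),
    ("snp4", 2, 1_500_000),
]
print(flanking_snps(snp_map, chrom=1, ms_position=1_500_000))  # ['snp1', 'snp2']
```

The selected SNPs would then serve as the predictors from which the microsatellite allele is imputed.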

A Study on the Correlation between SLC25A26 Polymorphism and Gastritis and Gastric Ulcers in Koreans

  • 박소연;황다현
    • Korean Journal of Clinical Laboratory Science
    • /
    • Vol. 55, No. 4
    • /
    • pp.291-297
    • /
    • 2023
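  • Gastritis and gastric ulcers are conditions in which the gastric mucosa becomes inflamed and injured. Earlier studies mainly treated environmental factors as the principal cause of gastric disease, but recent advances in genetic research have emphasized the importance of genetic factors. SLC25A26 is a gene associated with the accumulation of reactive oxygen species. Because oxidative stress promotes inflammatory responses, increases reactive oxygen species, and causes cell damage, it is presumed to be related to the development of gastric disease. This study analyzed the association between SLC25A26 and gastric disease. Polymorphisms within SLC25A26 were analyzed in 1,369 Korean patients with gastric disease and 7,471 healthy controls. As a result, 11 genotyped single nucleotide polymorphisms (SNPs) and 13 imputed SNPs showed statistical significance (P<0.05) and a high relative risk for gastric disease. Among them, rs13874 in SLC25A26 was highly associated with gastric disease. According to genotype-based mRNA expression analysis, carrying the minor allele in SLC25A26 increases mRNA expression, which may increase oxidative stress. In conclusion, SLC25A26 polymorphisms are associated with gastric disease and may provide a basis for new guidelines for managing gastric disease in the Korean population.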

Association Study of NDFIP2 Genetic Polymorphism with Asthma in the Korean Population

  • 최은혜;황다현
    • Korean Journal of Clinical Laboratory Science
    • /
    • Vol. 53, No. 3
    • /
    • pp.249-256
    • /
    • 2021
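  • Asthma is a chronic inflammatory obstructive airway disease. Its causes are diverse; in particular, genetic and environmental factors are presumed to influence asthma onset. The MAPK (mitogen-activated protein kinase) pathway regulates the Th1/Th2 balance and is known to play an important role in the development of asthma. This study analyzed the correlation between the NDFIP2 gene, which regulates the MAPK pathway, and asthma onset. Genotype data from 193 asthma cases and 3,228 normal controls were used. As a result, four SNPs within NDFIP2 showed a significant association with asthma and a high relative risk. In particular, rs2783122 in NDFIP2 showed the most statistically significant association with asthma (P-value = 9.76×10⁻⁶, OR = 1.67, 95% CI = 1.33 to 2.10). SNP imputation for the NDFIP2 gene identified 16 additional SNPs, all of which showed a significant association and a high relative risk. Genotype-based mRNA expression analysis showed that gene expression increases when rs1408049 carries the minor allele. Increased NDFIP2 expression may activate the MAPK pathway and thereby affect asthma onset. In conclusion, NDFIP2 polymorphisms are associated with asthma onset and may provide new guidance for asthma management in the Korean population.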
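An odds ratio with a 95% confidence interval, as reported above, is commonly obtained from a 2x2 table with the Woolf (log-odds Wald) method. The Python sketch below illustrates that calculation only; the abstract does not state how its interval was computed, and the counts used here are hypothetical rather than the study data.

```python
from math import exp, log, sqrt

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf confidence interval for the table [[a, b], [c, d]].

    a, b: cases with / without the risk allele
    c, d: controls with / without the risk allele
    z:    normal quantile (1.96 for an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of log(OR)
    lower = exp(log(odds_ratio) - z * se)
    upper = exp(log(odds_ratio) + z * se)
    return odds_ratio, lower, upper

# Hypothetical allele counts, for illustration only.
or_value, ci_low, ci_high = odds_ratio_ci(10, 20, 5, 40)
print(f"OR = {or_value:.2f}, 95% CI = {ci_low:.2f} to {ci_high:.2f}")
```

An interval that excludes 1, as in the reported 1.33 to 2.10, indicates an association that is significant at roughly the 5% level.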

Whole-genome sequence association study identifies cyclin dependent kinase 8 as a key gene for the number of mummified piglets

  • Pingxian, Wu;Dejuan, Chen;Kai, Wang;Shujie, Wang;Yihui, Liu;Anan, Jiang;Weihang, Xiao;Yanzhi, Jiang;Li, Zhu;Xu, Xu;Xiaotian, Qiu;Xuewei, Li;Guoqing, Tang
    • Animal Bioscience
    • /
    • Vol. 36, No. 1
    • /
    • pp.29-42
    • /
    • 2023
  • Objective: Pigs, an ideal biomedical model for human diseases, suffer from about 50% early embryonic and fetal death, a major cause of fertility loss worldwide. However, identifying the causal variant remains a huge challenge. This study aimed to detect single nucleotide polymorphisms (SNPs) and candidate genes for the number of mummified (NM) piglets using imputed whole-genome sequence (WGS) data and to validate the potential candidate genes. Methods: The WGS was imputed from genotyping-by-sequencing (GBS) data using a multi-breed reference population. We performed genome-wide association studies (GWAS) for NM piglets at birth in a Landrace pig population. A total of 300 Landrace pigs were genotyped by GBS. The whole-genome variants were imputed, and 4,252,858 SNPs were obtained. Various molecular experiments were conducted to determine how the genes affected NM in pigs. Results: A strong GWAS peak located on SSC11: 0.10 to 7.11 Mbp (top SNP, SSC11:1,889,658 bp; p = 9.98E-13) was identified in the cyclin dependent kinase 8 (CDK8) gene, which plays a crucial role in embryonic retardation and lethality. Based on the molecular experiments, we found that Y-box binding protein 1 (YBX1) was a crucial transcription factor for CDK8, which mediated the effect of CDK8 on the proliferation of porcine ovarian granulosa cells via the transforming growth factor beta/small mother against decapentaplegic signaling pathway and, as a consequence, affected embryo quality, indicating that this pathway may contribute to mummified fetuses in pigs. Conclusion: A powerful imputation-based association study was performed to identify genes associated with NM in pigs. CDK8 was suggested as a functional gene for the proliferation of porcine ovarian granulosa cells, but further studies are required to determine causative mutations and the effect of loci on NM in pigs.

Association of Genetic Polymorphism of IL-2 Receptor Subunit and Tuberculosis Case

  • Lee, Sang-In;Jin, Hyun-Seok;Park, Sangjung
    • Biomedical Science Letters
    • /
    • Vol. 24, No. 2
    • /
    • pp.94-101
    • /
    • 2018
  • Tuberculosis (TB) is an infectious disease caused by Mycobacterium tuberculosis (MTB) infection. The disease is known to be influenced not only by the properties of the microorganism but also by the genetic susceptibility of infected patients. Interleukin 2 (IL-2) is a cytokine associated with type 1 T helper (Th1) activity. In addition, upon MTB infection, IL-2 binds the IL-2 receptor, promotes T cell replication, and is involved in granuloma formation. The aim of this study was to investigate genetic polymorphisms of the IL-2 receptor genes in tuberculosis patients and normal individuals. Using genotype data of 443 tuberculosis cases and 3,228 healthy controls from the Korea Association Resource, we analyzed 22 SNPs in three genes for their association with tuberculosis. Sixteen, four, and two SNPs were genotyped in the IL2RA, IL2RB, and IL2RG genes, respectively. Among the three genes, only IL2RA polymorphisms showed a statistically significant association with tuberculosis; six highly significant SNPs were identified in IL2RA. In addition, the linkage disequilibrium (LD) structure of the IL2RA gene was confirmed. When SNP imputation of the IL2RA gene was performed, additional SNPs were found to differ significantly between cases and controls. These results suggest that genetic polymorphisms in the gene encoding IL-2Rα may regulate the expression level of IL-2Rα and thereby alter the immune responses in which IL-2Rα is involved. This study suggests that genetic polymorphisms affecting host immunity may control susceptibility to tuberculosis.