MPI-GWAS: a supercomputing-aided permutation approach for genome-wide association studies

  • Paik, Hyojung (Division of Supercomputing, Center for Supercomputing Application and Research, Korea Institute of Science and Technology Information (KISTI)) ;
  • Cho, Yongseong (Division of Supercomputing, Center for Supercomputing Application and Research, Korea Institute of Science and Technology Information (KISTI)) ;
  • Cho, Seong Beom (Department of Bio-Medical Informatics, Gachon University College of Medicine) ;
  • Kwon, Oh-Kyoung (Division of Supercomputing, Center for Supercomputing Application and Research, Korea Institute of Science and Technology Information (KISTI))
  • Received : 2022.01.03
  • Accepted : 2022.02.10
  • Published : 2022.03.31

Abstract

Permutation testing is a robust and popular approach to significance testing in genomic research, with the advantage of reducing inflated type 1 error rates; however, its computational cost is notoriously high in genome-wide association studies (GWAS). Here, we developed a supercomputing-aided approach to accelerate permutation testing for GWAS, based on the message-passing interface (MPI) on a parallel computing architecture. Our application, called MPI-GWAS, conducts MPI-based permutation testing in parallel on our supercomputing system, Nurion (8,305 compute nodes and 563,740 central processing units [CPUs]). For 10^7 permutations of one locus, MPI-GWAS completed the calculation in 600 s using 2,720 CPU cores. For 10^7 permutations of ~30,000-50,000 loci in over 7,000 subjects, the total elapsed time was ~4 days on the Nurion supercomputer. Thus, MPI-GWAS makes permutation-based GWAS feasible within a reasonable time by harnessing the power of parallel computing resources.
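The general idea of dividing a permutation test among parallel workers can be sketched as follows. This is a minimal illustration in plain Python, not the MPI-GWAS implementation: the function names, the mean-difference test statistic, and the per-rank seeding scheme are all assumptions made for the example, and the "ranks" are simulated in a loop rather than run as MPI processes. Each rank shuffles the case/control labels for its share of the permutations and counts how often the permuted statistic reaches the observed one; the counts are then summed, which stands in for an MPI reduce.

```python
import random


def split_permutations(total, n_ranks):
    """Divide `total` permutations as evenly as possible among `n_ranks` workers."""
    base, extra = divmod(total, n_ranks)
    return [base + (1 if r < extra else 0) for r in range(n_ranks)]


def mean_difference(genotypes, is_case):
    """Illustrative statistic: |mean genotype in cases - mean genotype in controls|."""
    cases = [g for g, c in zip(genotypes, is_case) if c]
    controls = [g for g, c in zip(genotypes, is_case) if not c]
    return abs(sum(cases) / len(cases) - sum(controls) / len(controls))


def rank_exceedances(genotypes, is_case, observed, n_perm, seed):
    """Work of one (simulated) rank: count permuted statistics >= observed."""
    rng = random.Random(seed)  # hypothetical scheme: one independent seed per rank
    labels = list(is_case)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # permute phenotype labels, keep genotypes fixed
        if mean_difference(genotypes, labels) >= observed:
            hits += 1
    return hits


def permutation_p_value(genotypes, is_case, total_perm, n_ranks, base_seed=0):
    """Empirical p-value for one locus, with permutations split across ranks."""
    observed = mean_difference(genotypes, is_case)
    shares = split_permutations(total_perm, n_ranks)
    # In an MPI program the ranks would run concurrently; this sum plays the role of a reduce.
    hits = sum(
        rank_exceedances(genotypes, is_case, observed, n, base_seed + r)
        for r, n in enumerate(shares)
    )
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (total_perm + 1)


if __name__ == "__main__":
    genotypes = [2] * 10 + [0] * 10          # toy minor-allele counts
    is_case = [True] * 10 + [False] * 10     # toy phenotype labels
    print(permutation_p_value(genotypes, is_case, total_perm=2000, n_ranks=4))
```

Because the permutations are independent of one another, the only communication needed in this scheme is the final sum of counts, which is why this kind of workload tends to scale well with the number of cores.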

Acknowledgement

This work was supported by the Korea Institute of Science and Technology Information (KISTI) (K-21-L02-C10, K-20-L02-C10-S01, K-21-L02-C10-S01) and the Program of the National Research Foundation (NRF) funded by the Korean government (MSIT) (2021M3H9A203052011). HP and YC were also supported by the Ministry of Science and ICT (N-21-NM-CA08-S01). This work was also supported by the National Supercomputing Center with supercomputing resources including technical support (TS-2021-RG-0006).
