Tracing the breeding farm of domesticated pig using feature selection (Sus scrofa)

  • Kwon, Taehyung (Department of Agricultural Biotechnology and Research Institute of Agriculture and Life Sciences, Seoul National University)
  • Yoon, Joon (Interdisciplinary Program in Bioinformatics Department of Natural Science, Seoul National University)
  • Heo, Jaeyoung (International Agricultural Development and Cooperation Center, Chonbuk National University)
  • Lee, Wonseok (Department of Agricultural Biotechnology and Research Institute of Agriculture and Life Sciences, Seoul National University)
  • Kim, Heebal (Department of Agricultural Biotechnology and Research Institute of Agriculture and Life Sciences, Seoul National University)
  • Received : 2017.07.28
  • Accepted : 2017.10.09
  • Published : 2017.11.01

Abstract

Objective: Increasing food safety demands in the animal product market have created a need for a system that traces the food distribution process from the manufacturer to the retailer, and genetic traceability is an effective method for tracing the origin of animal products. In this study, we traced the breeding farms of 6,018 multi-breed pigs using single nucleotide polymorphism (SNP) markers strictly selected through least absolute shrinkage and selection operator (LASSO) feature selection.

Methods: We performed farm tracing of domesticated pigs (Sus scrofa) using SNP markers and selected the most relevant features for accurate prediction. Because our data contained multiple breeds, we performed feature selection using LASSO penalization on 4,002 SNPs that are shared between breeds, which also included 179 SNPs with small between-breed differences. The 100 highest-scoring features were extracted from iterative simulations and then evaluated using machine-learning-based classifiers.

Results: We selected 1,341 SNPs from over 45,000 SNPs through iterative LASSO feature selection to minimize between-breed differences. We subsequently selected the 100 highest-scoring SNPs from iterative scoring and, using only these SNPs, observed high performance measures in cross-validated classification of breeding farms.

Conclusion: This study represents a successful application of LASSO feature selection to multi-breed pig SNP data for tracing farm information, and it provides a valuable method and a basis for further research on genetic traceability.
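The iterative LASSO scoring described in the Methods can be illustrated with a minimal sketch. It is not the authors' pipeline: the synthetic genotypes, the two-farm labels, the penalty `lam`, the subsampling scheme, and every function name below are assumptions made for illustration only. Each round fits a LASSO model on a random subsample by cyclic coordinate descent, and a SNP's score is the number of rounds in which its coefficient is nonzero.

```python
import random

def soft_threshold(z, gamma):
    """Soft-thresholding operator used in the LASSO coordinate update."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0

def lasso_cd(X, y, lam, n_sweeps=20):
    """Cyclic coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1.
    X is a list of rows with centered columns; y is a centered response."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - Xb, with b = 0 at the start
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # constant column carries no information
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta

def center(values):
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def selection_scores(genotypes, labels, lam, n_rounds, frac, rng):
    """Count, per SNP, the subsampling rounds in which LASSO keeps it."""
    n, p = len(genotypes), len(genotypes[0])
    scores = [0] * p
    for _ in range(n_rounds):
        idx = rng.sample(range(n), int(frac * n))
        cols = [center([genotypes[i][j] for i in idx]) for j in range(p)]
        X = [[cols[j][k] for j in range(p)] for k in range(len(idx))]
        y = center([labels[i] for i in idx])
        beta = lasso_cd(X, y, lam)
        for j in range(p):
            if beta[j] != 0.0:
                scores[j] += 1
    return scores

# Synthetic data (assumption): 200 animals from two farms, 30 SNPs coded 0/1/2.
# SNPs 0-2 differ in allele frequency between farms; the rest do not.
rng = random.Random(0)
N_ANIMALS, N_SNPS, INFORMATIVE = 200, 30, (0, 1, 2)
labels = [1.0 if i % 2 == 0 else -1.0 for i in range(N_ANIMALS)]
genotypes = []
for i in range(N_ANIMALS):
    row = []
    for j in range(N_SNPS):
        if j in INFORMATIVE:
            freq = 0.9 if labels[i] > 0 else 0.1
        else:
            freq = 0.5
        row.append(sum(rng.random() < freq for _ in range(2)))
    genotypes.append(row)

scores = selection_scores(genotypes, labels, lam=0.15, n_rounds=10, frac=0.75, rng=rng)
top_snps = sorted(range(N_SNPS), key=lambda j: (-scores[j], j))[:3]
print(top_snps)
```

In the study the top 100 SNPs were kept and then evaluated with classifiers under cross-validation; the sketch stops at the ranking step and treats farm membership as a binary response, whereas tracing many farms would call for a multi-class model.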
