Reconsideration of F1 Score as a Performance Measure in Mass Spectrometry-based Metabolomics

  • Jeong, Jaesik (Department of Statistics, Chonnam National University)
  • Kim, Han Sol (Department of Statistics, Chonnam National University)
  • Kim, Shin June (Department of Statistics, Chonnam National University)
  • Received : 2018.06.11
  • Accepted : 2018.09.25
  • Published : 2018.09.30

Abstract

Over the past decade, mass spectrometry-based metabolomics, especially two-dimensional gas chromatography mass spectrometry (GCxGC/TOF-MS), has become a key analytical tool in metabolomics because of its sensitivity and its ability to analyze complex biological or biochemical samples. However, the need to reduce variation within and between experiments has been reported, and developing methods to overcome this problem has long been a critical issue. Alongside these methodological developments, reasonable performance measures have also been studied. The following four numerical measures have typically been used for comparison: sensitivity, specificity, receiver operating characteristic (ROC) curves, and positive predictive value (PPV). More recently, however, these measures have been replaced by the F1 score in many fields, including metabolomics, without careful consideration of its validity. We therefore investigate the validity of the F1 score on two examples, with the goal of raising awareness of the need to choose an appropriate performance measure. We found that the F1 score alone is not adequate as a performance measure. Accordingly, we suggest that the F1 score be supplemented with another performance measure, such as specificity, to improve its validity.
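One way to see why the F1 score may need a companion measure is that, under its standard definition as the harmonic mean of PPV and sensitivity, it never uses the count of true negatives. The sketch below is only an illustration under that standard definition: the confusion-matrix counts are hypothetical and are not taken from the two examples studied in the paper, and the function name `summarize` is introduced here for convenience.

```python
def summarize(tp, fp, fn, tn):
    """Return standard performance measures from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    ppv = tp / (tp + fp)           # positive predictive value (precision)
    # F1 is the harmonic mean of PPV and sensitivity; tn does not appear.
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "f1": f1,
    }


# Hypothetical counts: identical tp, fp, fn, but very different tn.
result_a = summarize(tp=80, fp=20, fn=20, tn=80)
result_b = summarize(tp=80, fp=20, fn=20, tn=880)

for name, result in (("A", result_a), ("B", result_b)):
    print(name, {key: round(value, 3) for key, value in result.items()})
```

In this made-up case the two results share the same F1 score even though their specificity differs, which is the kind of gap that reporting specificity alongside F1 would expose.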
