A Wavelet based Feature Selection Method to Improve Classification of Large Signal-type Data

A Wavelet-Based Feature Extraction Method for Large Signal-Type Data

  • Jang, Woosung (Department of Industrial Engineering, Seoul National University)
  • Chang, Woojin (Department of Industrial Engineering, Seoul National University)
  • Published : 2006.06.30

Abstract

Large signal-type data sets are difficult to classify, especially when they are non-stationary. In this paper, large, non-stationary signal-type data sets are wavelet transformed so that their distinctive features are extracted in the wavelet domain rather than the time domain. A small number of wavelet coefficients that represent class properties are then used as inputs to statistical classification methods such as Linear Discriminant Analysis, Quadratic Discriminant Analysis, and neural networks. Applying this wavelet-based feature selection method to a mass spectrometry data set for ovarian cancer diagnosis resulted in 100% classification accuracy.
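The pipeline the abstract outlines (wavelet transform, keep a few class-discriminating coefficients, classify) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the abstract does not name the wavelet family, the selection criterion, or any parameters, so the Haar wavelet, the Fisher-score ranking, the nearest-centroid classifier (a simple stand-in for LDA/QDA), the synthetic signals, and every function name here are assumptions made for the example.

```python
import math
import random


def haar_dwt(signal):
    """Full multilevel orthonormal Haar transform (assumed wavelet family).

    Returns [approximation] + detail coefficients, coarsest level first.
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    approx, details = list(signal), []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        details = [(a - b) / math.sqrt(2) for a, b in pairs] + details
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    return approx + details


def fisher_scores(X, y):
    """Two-class Fisher score per coefficient (assumed selection criterion)."""
    scores = []
    for j in range(len(X[0])):
        g0 = [row[j] for row, label in zip(X, y) if label == 0]
        g1 = [row[j] for row, label in zip(X, y) if label == 1]
        m0, m1 = sum(g0) / len(g0), sum(g1) / len(g1)
        v0 = sum((v - m0) ** 2 for v in g0) / len(g0)
        v1 = sum((v - m1) ** 2 for v in g1) / len(g1)
        scores.append((m0 - m1) ** 2 / (v0 + v1 + 1e-12))
    return scores


def select_top_k(scores, k):
    """Indices of the k highest-scoring wavelet coefficients."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


def fit_centroids(X, y, idx):
    """Class means over the selected coefficients only."""
    centroids = {}
    for label in set(y):
        rows = [[row[j] for j in idx] for row, l in zip(X, y) if l == label]
        centroids[label] = [sum(col) / len(col) for col in zip(*rows)]
    return centroids


def predict(centroids, coeffs, idx):
    """Nearest-centroid rule, used here in place of LDA/QDA."""
    feats = [coeffs[j] for j in idx]
    return min(
        centroids,
        key=lambda c: sum((f - m) ** 2 for f, m in zip(feats, centroids[c])),
    )


def make_signal(label, rng, n=64):
    """Synthetic noisy signal; class 1 carries a short local bump."""
    s = [rng.gauss(0.0, 0.3) for _ in range(n)]
    if label == 1:
        for t in range(20, 24):
            s[t] += 2.0
    return s


def run_demo(seed=0, n_train=40, n_test=20, k=3):
    """Train on synthetic data and return (selected indices, test accuracy)."""
    rng = random.Random(seed)

    def sample(n_per_class):
        y = [0] * n_per_class + [1] * n_per_class
        X = [haar_dwt(make_signal(label, rng)) for label in y]
        return X, y

    X_train, y_train = sample(n_train)
    X_test, y_test = sample(n_test)
    idx = select_top_k(fisher_scores(X_train, y_train), k)
    centroids = fit_centroids(X_train, y_train, idx)
    hits = sum(predict(centroids, x, idx) == t for x, t in zip(X_test, y_test))
    return idx, hits / len(y_test)


if __name__ == "__main__":
    selected, accuracy = run_demo()
    print("selected coefficient indices:", selected)
    print("test accuracy:", accuracy)
```

The point of the sketch is the dimensionality reduction: each 64-sample signal is classified from only three wavelet coefficients, which is the kind of compact representation the abstract argues for. Results on these synthetic signals say nothing about the mass spectrometry data set reported in the paper.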
