Separating Signals and Noises Using Mixture Model and Multiple Testing

Classification of Signals and Noise Using a Mixture Model and Multiple Hypothesis Testing

  • Park, Hae-Sang (Department of Industrial and Management Engineering, POSTECH) ;
  • Yoo, Si-Won (Customer Satisfaction Improvement Team, NHN) ;
  • Jun, Chi-Hyuck (Department of Industrial and Management Engineering, POSTECH)
  • 박해상 (포항공과대학교 산업경영공학과) ;
  • 유시원 (NHN 고객만족추진팀) ;
  • 전치혁 (포항공과대학교 산업경영공학과)
  • Published : 2009.08.31

Abstract

This paper considers the problem of separating signals from noise when the two are randomly mixed in the observations. The noise is assumed to follow a Gaussian distribution and the signal a Gamma distribution, so each observation follows a mixture of Gaussian and Gamma distributions. The parameters of the mixture model are estimated with the EM algorithm. Signals and noise are then classified with a fixed-threshold approach based on multiple testing, using the positive false discovery rate and the Bayes error. The proposed method is applied to real optical emission spectroscopy data for the quantitative analysis of inclusions. A simulation study compares its performance with that of the existing method based on the 3-sigma rule.
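As a reading aid, the model and the two error criteria named in the abstract can be written out as follows. This is one plausible formalization, not the paper's own notation: the symbols $\pi_0$, $\pi_1$, $\mu$, $\sigma$, $\alpha$, $\beta$ and the one-sided rule "declare a signal when $x > t$" are assumptions made here, motivated by the fact that a Gamma-distributed signal takes only positive values.

$$
f(x) = \pi_0\,\phi(x;\mu,\sigma^2) + \pi_1\,g(x;\alpha,\beta), \qquad \pi_0 + \pi_1 = 1
$$

$$
\mathrm{pFDR}(t) = P(\text{noise} \mid X > t)
= \frac{\pi_0\left[1 - \Phi\!\left(\frac{t-\mu}{\sigma}\right)\right]}
       {\pi_0\left[1 - \Phi\!\left(\frac{t-\mu}{\sigma}\right)\right] + \pi_1\left[1 - G(t;\alpha,\beta)\right]}
$$

$$
\mathrm{BE}(t) = \pi_0\left[1 - \Phi\!\left(\tfrac{t-\mu}{\sigma}\right)\right] + \pi_1\,G(t;\alpha,\beta)
$$

Here $\phi$ is the Gaussian density, $\Phi$ the standard normal distribution function, and $g$ and $G$ the Gamma density and distribution function. The first term of $\mathrm{BE}(t)$ is the probability of a false alarm (noise above the threshold) and the second the probability of a missed signal (signal at or below it). The pFDR expression uses the Bayesian reading of the positive false discovery rate as the posterior probability of noise given that an observation is declared a signal.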

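This paper addresses the problem of classifying signal observations from observations in which signals and noise are mixed. Assuming that the noise follows a Gaussian distribution and the signal follows a Gamma distribution, the observations follow a mixture of Gaussian and Gamma distributions. The parameters of the mixture model are estimated with the EM algorithm, and the classification of signals and noise is treated as a multiple hypothesis testing problem in which the classification threshold is set on the basis of the Bayes error. The proposed method is applied to the problem of detecting the presence of inclusions in steel products from spectroscopic data, and its superior performance is demonstrated on separate simulation data.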
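The procedure outlined in the abstract (fit a Gaussian-Gamma mixture by EM, then choose one fixed threshold) can be sketched in standard-library Python as below. This is a minimal sketch under stated assumptions, not the authors' implementation. The function names, the simulated data (80% N(0, 1) noise, 20% Gamma(shape 4, scale 2) signal), and the initialization are all hypothetical. The Gamma M-step uses a closed-form approximation to the maximum-likelihood shape rather than an exact solve. The Bayes error and pFDR are estimated by averaging fitted posterior probabilities over the observations.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def gamma_pdf(x, shape, scale):
    if x <= 0.0:                      # the Gamma signal has positive support only
        return 0.0
    log_pdf = ((shape - 1.0) * math.log(x) - x / scale
               - math.lgamma(shape) - shape * math.log(scale))
    return math.exp(log_pdf)


def signal_posteriors(xs, p):
    """E-step: P(signal | x) under p = (pi1, mu, sigma, shape, scale)."""
    pi1, mu, sigma, shape, scale = p
    out = []
    for x in xs:
        f0 = (1.0 - pi1) * normal_pdf(x, mu, sigma)
        f1 = pi1 * gamma_pdf(x, shape, scale)
        out.append(f1 / (f0 + f1) if f0 + f1 > 0.0 else 0.0)
    return out


def fit_mixture(xs, n_iter=100):
    """EM for the Gaussian (noise) + Gamma (signal) mixture."""
    n = len(xs)
    srt = sorted(xs)
    # Assumed initialisation: noise is the majority, so the median and lower
    # quartile describe it; points beyond mu + 3*sigma seed the signal part.
    mu = srt[n // 2]
    sigma = (mu - srt[n // 4]) / 0.6745
    tail = [x for x in xs if x > mu + 3.0 * sigma and x > 0.0]  # assumed non-empty
    m = sum(tail) / len(tail)
    v = sum((x - m) ** 2 for x in tail) / len(tail)
    p = (len(tail) / n, mu, sigma, m * m / v, v / m)
    for _ in range(n_iter):
        r = signal_posteriors(xs, p)
        w1 = sum(r)
        w0 = n - w1
        # Weighted Gaussian updates for the noise component.
        mu = sum((1.0 - ri) * x for ri, x in zip(r, xs)) / w0
        sigma = math.sqrt(
            sum((1.0 - ri) * (x - mu) ** 2 for ri, x in zip(r, xs)) / w0)
        # Weighted Gamma update with an approximate ML shape estimate.
        m = sum(ri * x for ri, x in zip(r, xs)) / w1
        mlog = sum(ri * math.log(x) for ri, x in zip(r, xs) if ri > 0.0) / w1
        s = math.log(m) - mlog
        shape = (3.0 - s + math.sqrt((s - 3.0) ** 2 + 24.0 * s)) / (12.0 * s)
        p = (w1 / n, mu, sigma, shape, m / shape)
    return p


def choose_threshold(xs, post):
    """Scan cut-offs t (signal when x > t); keep the smallest estimated Bayes error."""
    pairs = sorted(zip(xs, post))
    n = len(pairs)
    missed = 0.0                                   # sum of P(signal|x) over x <= t
    false_alarm = sum(1.0 - r for _, r in pairs)   # sum of P(noise|x) over x > t
    best = None
    for k, (x, r) in enumerate(pairs):
        missed += r
        false_alarm -= 1.0 - r
        n_reject = n - k - 1
        bayes_err = (missed + false_alarm) / n
        pfdr = false_alarm / n_reject if n_reject > 0 else 0.0
        if best is None or bayes_err < best[1]:
            best = (x, bayes_err, pfdr)
    return best


# Simulated data (assumed parameters, for illustration only).
random.seed(0)
labels = [random.random() < 0.2 for _ in range(2000)]
xs = [random.gammavariate(4.0, 2.0) if is_signal else random.gauss(0.0, 1.0)
      for is_signal in labels]

params = fit_mixture(xs)
post = signal_posteriors(xs, params)
threshold, bayes_err, pfdr = choose_threshold(xs, post)
accuracy = sum((x > threshold) == lab for x, lab in zip(xs, labels)) / len(xs)
print("pi1, mu, sigma, shape, scale:", [round(v, 3) for v in params])
print("threshold:", round(threshold, 3), "accuracy:", round(accuracy, 4))
print("estimated Bayes error:", round(bayes_err, 4), "estimated pFDR:", round(pfdr, 4))
```

Because the threshold comes from the fitted mixture, it adapts to how much the signal and noise distributions overlap, whereas a 3-sigma rule depends only on the noise parameters. In this sketch the 3-sigma cut is used only to seed the EM iterations.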
