Outlier Detection of the Coastal Water Temperature Monitoring Data Using the Approximate and Detail Components

  • Cho, Hong-Yeon (Marine Environment & Conservation Research Department, KORDI) ;
  • Oh, Ji-Hee (Marine Environment & Conservation Research Department, KORDI)
  • Received : 2012.01.19
  • Reviewed : 2012.04.10
  • Published : 2012.05.25

Abstract

As coastal environmental monitoring programs expand and large volumes of monitoring data accumulate, detecting and treating the outliers that frequently occur in these data is a necessary first step before statistical analysis. This study proposes an outlier detection method that uses the approximate and detail (or residual) components of the raw data. The two components can be separated by a variety of filtering and smoothing methods; here the decomposition is carried out by harmonic analysis with periodic functions and by local regression curve estimation, respectively. The Grubbs' test and the modified z-score method, both widely used for outlier detection, are then applied to the detail components of the water temperature data. A new data set is reconstructed after the detected outliers are removed. When applied to the continuous coastal water temperature monitoring data provided by the Real-time Information System for Aquaculture Environment of the National Fisheries Research and Development Institute (NFRDI), the proposed procedure was found to remove the outliers successfully.
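The workflow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an annual period of 365.25 days and two harmonics, it uses only a least-squares harmonic fit as the approximate component (the local regression step is omitted), and it applies only the modified z-score with the commonly used cutoff of 3.5. The function names `harmonic_fit`, `modified_z_scores`, and `flag_outliers` are hypothetical.

```python
import numpy as np

def harmonic_fit(t_days, y, period_days=365.25, n_harmonics=2):
    """Least-squares fit of a mean plus a few sinusoids (approximate component).

    period_days and n_harmonics are illustrative assumptions, not values
    taken from the paper.
    """
    t = np.asarray(t_days, dtype=float)
    cols = [np.ones_like(t)]
    for k in range(1, n_harmonics + 1):
        w = 2.0 * np.pi * k / period_days
        cols += [np.cos(w * t), np.sin(w * t)]
    A = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(A, np.asarray(y, dtype=float), rcond=None)
    return A @ coef

def modified_z_scores(x):
    """Modified z-score: 0.6745 * (x - median) / MAD."""
    x = np.asarray(x, dtype=float)
    med = np.median(x)
    mad = np.median(np.abs(x - med))
    if mad == 0.0:
        # Degenerate case: more than half the values are identical.
        return np.zeros_like(x)
    return 0.6745 * (x - med) / mad

def flag_outliers(t_days, y, threshold=3.5):
    """Return a boolean mask that is True where the detail component is extreme."""
    approx = harmonic_fit(t_days, y)          # approximate component
    detail = np.asarray(y, dtype=float) - approx  # detail (residual) component
    return np.abs(modified_z_scores(detail)) > threshold
```

A cleaned series can then be obtained with `y[~flag_outliers(t, y)]`. Because the harmonic fit here is an ordinary least-squares fit, very large or numerous outliers can distort the approximate component itself; a robust or iterated fit would be a natural refinement.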

References

  1. National Fisheries Research and Development Institute (NFRDI), 2012, Real-time Information System for Aquaculture Environment. http://portal.nfrdi.re.kr/risa/.
  2. Agresti, A. and Franklin, C., 2007, Statistics, The Art and Science of Learning from Data, Pearson Education, Inc. pp.693.
  3. Barnett, V. and Lewis, T., 1994, Outliers in Statistical Data, Third Edition, John Wiley & Sons, Ltd., Chichester, UK, pp.584.
  4. Cho, H.Y., Suzuki, K. and Nakamura, Y., 2010, Hysteresis loop model for the estimation of the coastal water temperatures, -by using the buoy monitoring data in Mikawa Bay, Japan-, Report of the Port and Airport Research Institute, 49(2), pp.123-153.
  5. Dixon, W.J., 1950, Analysis of Extreme Values, The Annals of Mathematical Statistics, 21(4), pp.488-506. https://doi.org/10.1214/aoms/1177729747
  6. Garcia, F.A.A., 2010, Tests to identify outliers in data series, http://www.se.mathworks.com/matlabcentral/fileexchange/28501, MATLAB Central File Exchange. Retrieved January 19th, 2012.
  7. Grubbs, F.E., 1950, Sample Criteria for Testing Outlying Observations, The Annals of Mathematical Statistics, 21(1), pp.27-58. https://doi.org/10.1214/aoms/1177729885
  8. Hair, J.F. Jr., Black, W.C., Babin, B.J. and Anderson, R.E., 2010, Multivariate Data Analysis, A Global Perspective, Seventh Edition, Chapter 2, Pearson Education, Inc., New Jersey, USA, pp.800.
  9. Martinez, W.L. and Martinez, A.R., 2005, Exploratory Data Analysis with MATLAB, Computer Science and Data Analysis Series, Chapman & Hall/CRC. pp.405.
  10. Rousseeuw, P.J. and Leroy, A.M., 2003, Robust Regression and Outlier Detection, John Wiley & Sons. pp.329.

Cited by

  1. Outlier Detection and Treatment for the Conversion of Chemical Oxygen Demand to Total Organic Carbon, vol.26, no.4, 2014, https://doi.org/10.9765/KSCOE.2014.26.4.207
  2. Efficient Outlier Detection of the Water Temperature Monitoring Data, vol.26, no.5, 2014, https://doi.org/10.9765/KSCOE.2014.26.5.285
  3. The Prediction of Water Temperature at Saemangeum Lake by Neural Network, vol.27, no.1, 2015, https://doi.org/10.9765/KSCOE.2015.27.1.56