• Title/Summary/Keyword: time series cross-validation

The Optimal Hydrologic Forecasting System for Abnormal Storm due to Climate Change in the River Basin (하천유역에서 기후변화에 따른 이상호우시의 최적 수문예측시스템)

  • Kim, Seong-Won;Kim, Hyeong-Su
    • Proceedings of the Korea Water Resources Association Conference / 2008.05a / pp.2193-2196 / 2008
  • In this study, a new methodology based on statistical learning theory, the support vector machine neural network model (SVM-NNM), is introduced to forecast flood stage in the Nakdong River, Republic of Korea. The application of SVM-NNM to hydrologic time series forecasting is relatively new, and forecasting is more problematic than classification. The multilayer perceptron neural network model (MLP-NNM) is introduced as a reference model against which the performance of the SVM-NNM is compared. To assess the performance of the neural network models, the data are divided into training, cross-validation, and testing sets. From this research, we evaluate the impact of the SVM-NNM and the MLP-NNM on the forecasting of hydrologic time series in the Nakdong River. Furthermore, we suggest a new methodology to forecast the flood stage and to construct an optimal forecasting system for the Nakdong River, Republic of Korea.
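The three-way division into training, cross-validation, and testing data mentioned in this abstract can be sketched for a time-ordered series as below. This is a minimal illustration, not the authors' procedure: the function name `chronological_split`, the 60/20/20 fractions, and the sample values are all assumptions, since the abstract does not state how the data were divided.

```python
def chronological_split(series, train_frac=0.6, val_frac=0.2):
    """Split a time-ordered sequence into train / cross-validation / test blocks.

    The blocks are contiguous and keep the original order, so the model is
    never validated or tested on observations older than its training data.
    The default fractions are illustrative assumptions.
    """
    if train_frac <= 0 or val_frac <= 0 or train_frac + val_frac >= 1:
        raise ValueError("fractions must be positive and sum to less than 1")
    n = len(series)
    train_end = round(n * train_frac)
    val_end = train_end + round(n * val_frac)
    return series[:train_end], series[train_end:val_end], series[val_end:]


# Hypothetical flood-stage readings, oldest first.
stages = [float(i) for i in range(10)]
train, val, test = chronological_split(stages)
```

Keeping the blocks contiguous is a common choice for time series, because shuffling before splitting would let future observations leak into the training set.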

Exploring COVID-19 in mainland China during the lockdown of Wuhan via functional data analysis

  • Li, Xing;Zhang, Panpan;Feng, Qunqiang
    • Communications for Statistical Applications and Methods / v.29 no.1 / pp.103-125 / 2022
  • In this paper, we analyze time series data of the case and death counts of COVID-19, which broke out in China in December 2019. The study period covers the lockdown of Wuhan. We use functional data analysis methods to analyze the collected time series data. The analysis is divided into three parts. First, functional principal component analysis is conducted to investigate the modes of variation. Second, we carry out functional canonical correlation analysis to explore the relationship between confirmed cases and deaths. Finally, we use a clustering method based on the Expectation-Maximization (EM) algorithm to run a cluster analysis on the counts of confirmed cases, where the number of clusters is determined via a cross-validation approach. In addition, we compare the clustering results with publicly available migration data.

Reliability Computation of Neuro-Fuzzy Models : A Comparative Study (뉴로-퍼지 모델의 신뢰도 계산 : 비교 연구)

  • 심현정;박래정;왕보현
    • Journal of the Korean Institute of Intelligent Systems / v.11 no.4 / pp.293-301 / 2001
  • This paper reviews three methods to compute a pointwise confidence interval of neuro-fuzzy models and compares their estimation performance through simulations. The computation methods under consideration include stacked generalization using cross-validation, predictive error bars in regressive models, and a local reliability measure for networks employing a local representation scheme. These methods, implemented on the neuro-fuzzy models, are applied to the problems of simple function approximation and chaotic time series prediction. The results of reliability estimation are compared both quantitatively and qualitatively.

The Cross-validation of Satellite OMI and OMPS Total Ozone with Pandora Measurement (지상 Pandora와 위성 OMI와 OMPS 오존관측 자료의 상호검증 방법에 대한 분석 연구)

  • Baek, Kanghyun;Kim, Jae-Hwan;Kim, Jhoon
    • Korean Journal of Remote Sensing / v.36 no.3 / pp.461-474 / 2020
  • Korea launched the Geostationary Environmental Monitoring Satellite (GEMS), a UV/visible spectrometer that measures pollutant gases, on 18 February 2020. Because satellite retrieval is an ill-posed inverse problem, validation against ground-based measurements or other satellite measurements is essential to obtain reliable products. For this purpose, satellite-based OMI and OMPS total column ozone (TCO) and ground-based Pandora TCO in Busan and Seoul were selected for future GEMS validation. The first goal of this study is to validate the ground-based ozone data by exploiting the fact that satellite data provide coherent ozone measurements on a global basis, although satellite data have larger errors than ground-based measurements. In the cross-validation between Pandora and OMI TCO, we found an abnormal deviation in the ozone time series from Pandora #29 observed in Seoul. This shows that it is possible to perform inverse validation of ground data using satellite data. OMPS TCO was then compared with the verified Pandora TCO. The two datasets show a correlation coefficient of 0.97, an RMSE of less than 2 DU, and an OMPS-Pandora relative mean difference of >4%. The results also show that the OMPS-Pandora relative mean difference has no significant dependence on SZA, TCO, cross-track position, or season. In addition, we show that appropriate thresholds depending on the spatial resolution of each satellite sensor are required to eliminate the impact of cloud on Pandora TCO.

Classification of Emotional States of Interest and Neutral Using Features from Pulse Wave Signal

  • Phongsuphap, Sukanya;Sopharak, Akara
    • Institute of Control, Robotics and Systems: Conference Proceedings / 2004.08a / pp.682-685 / 2004
  • This paper investigated a method for classifying emotional states using the pulse wave signal. It focused on finding effective features for emotional state classification. The emotional states considered here were interest and neutral. Classification experiments used 65 samples of the interest state and 60 samples of the neutral state. We investigated 19 features derived from pulse wave signals using both time domain and frequency domain analysis methods, with 2 classifiers: minimum distance (normalized Euclidean distance) and ${\kappa}$-Nearest Neighbour. Leave-one-out cross-validation was used as the evaluation method. Based on the experimental results, the most efficient features were a combination of 4 features consisting of (i) the mean of the first differences of the smoothed pulse rate time series signal, (ii) the mean of the absolute values of the second differences of the normalized interbeat intervals, (iii) the root mean square successive difference, and (iv) the power in the high frequency range in normalized units, which provided 80.8% average accuracy with the ${\kappa}$-Nearest Neighbour classifier.
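The evaluation scheme in this abstract, leave-one-out cross-validation of a nearest-neighbour classifier, can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names, the toy two-feature data, the choice of k, and the pairing of normalized Euclidean distance with the nearest-neighbour vote are all hypothetical, and the per-feature scaling is recomputed inside each fold rather than taken from the paper.

```python
import math
from collections import Counter


def feature_stds(rows):
    """Per-feature population standard deviation (1.0 where a feature is constant)."""
    n = len(rows)
    stds = []
    for j in range(len(rows[0])):
        col = [r[j] for r in rows]
        mean = sum(col) / n
        var = sum((v - mean) ** 2 for v in col) / n
        stds.append(math.sqrt(var) or 1.0)
    return stds


def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k nearest training samples.

    Distance is Euclidean after dividing each feature difference by that
    feature's standard deviation in the training samples.
    """
    stds = feature_stds(train_x)
    dists = []
    for x, y in zip(train_x, train_y):
        d = math.sqrt(sum(((a - b) / s) ** 2 for a, b, s in zip(x, query, stds)))
        dists.append((d, y))
    dists.sort(key=lambda t: t[0])
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(xs, ys, k=3):
    """Hold out each sample once, classify it from all the others, return the hit rate."""
    hits = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        hits += knn_predict(train_x, train_y, xs[i], k) == ys[i]
    return hits / len(xs)


# Hypothetical two-feature toy data, not the pulse-wave features from the paper.
xs = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [2.0, 2.1], [2.2, 1.9], [1.9, 2.3]]
ys = ["neutral", "neutral", "neutral", "interest", "interest", "interest"]
accuracy = leave_one_out_accuracy(xs, ys, k=1)
```

With 125 samples, as in the abstract, the same loop would train 125 classifiers, each on the remaining 124 samples.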

Vulnerability Assessment for Fine Particulate Matter (PM2.5) in the Schools of the Seoul Metropolitan Area, Korea: Part II - Vulnerability Assessment for PM2.5 in the Schools (인공지능을 이용한 수도권 학교 미세먼지 취약성 평가: Part II - 학교 미세먼지 범주화)

  • Son, Sanghun;Kim, Jinsoo
    • Korean Journal of Remote Sensing / v.37 no.6_2 / pp.1891-1900 / 2021
  • Fine particulate matter (FPM; diameter ≤ 2.5 ㎛) is frequently found in metropolitan areas due to activities associated with rapid urbanization and population growth. Many adolescents spend a substantial amount of time at school where, for various reasons, FPM generated outdoors may flow into indoor areas. The aims of this study were to estimate FPM concentrations and categorize types of FPM in schools. Meteorological and chemical variables as well as satellite-based aerosol optical depth were analyzed as input data in a random forest model, which applied 10-fold cross validation and a grid-search method, to estimate school FPM concentrations, with four statistical indicators used to evaluate accuracy. Loose and strict standards were established to categorize types of FPM in schools. Under the former classification scheme, FPM in most schools was classified as type 2 or 3, whereas under strict standards, school FPM was mostly classified as type 3 or 4.
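The combination of 10-fold cross-validation and grid search described in this abstract can be sketched with the standard library alone. This is only an illustration of the procedure: a one-dimensional k-nearest-neighbour regressor stands in for the random forest, and the data, the grid values, and every function name below are assumptions rather than anything reported in the study.

```python
def kfold_indices(n, n_folds):
    """Yield (train_idx, test_idx) for contiguous folds whose sizes differ by at most one."""
    base, extra = divmod(n, n_folds)
    start = 0
    for f in range(n_folds):
        stop = start + base + (1 if f < extra else 0)
        test_idx = list(range(start, stop))
        train_idx = list(range(0, start)) + list(range(stop, n))
        yield train_idx, test_idx
        start = stop


def knn_regress(train, x, k):
    """Average the targets of the k training points closest to x (stand-in model)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def cv_rmse(data, k, n_folds=10):
    """Root-mean-square error pooled over all held-out points."""
    sq_err = 0.0
    for train_idx, test_idx in kfold_indices(len(data), n_folds):
        train = [data[i] for i in train_idx]
        for i in test_idx:
            x, y = data[i]
            sq_err += (knn_regress(train, x, k) - y) ** 2
    return (sq_err / len(data)) ** 0.5


def grid_search(data, grid, n_folds=10):
    """Return the grid value with the lowest cross-validated RMSE, plus all scores."""
    scores = {k: cv_rmse(data, k, n_folds) for k in grid}
    return min(scores, key=scores.get), scores


# Hypothetical data: y = 2x sampled at 40 points (not the study's inputs).
data = [(x / 10, 2 * x / 10) for x in range(40)]
best_k, scores = grid_search(data, grid=[1, 3, 5, 15])
```

Each candidate hyperparameter is scored on the same ten folds, so the comparison between grid values is not affected by how the data happen to be partitioned.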

A Study on the Prediction of Power Consumption in the Air-Conditioning System by Using the Gaussian Process (정규 확률과정을 사용한 공조 시스템의 전력 소모량 예측에 관한 연구)

  • Lee, Chang-Yong;Song, Gensoo;Kim, Jinho
    • Journal of Korean Society of Industrial and Systems Engineering / v.39 no.1 / pp.64-72 / 2016
  • In this paper, we use a Gaussian process to predict the power consumption of an air-conditioning system. Because this power consumption takes the form of a time series, and predicting it is very important for efficient energy management, it is worth investigating a time-series model for the prediction. To this end, we apply the Gaussian process, which assigns a prior probability to every possible function and gives higher probabilities to functions that are more consistent with the empirical data. We also discuss how to estimate the hyper-parameters, which are the parameters of the covariance function of the Gaussian process model. We estimated the hyper-parameters with two different methods (marginal likelihood and leave-one-out cross-validation) and obtained a model that describes the data well, with results that are more or less independent of the hyper-parameter estimation method. We validated the prediction results by error analysis using the mean relative error and the mean absolute error. The mean relative error analysis showed that about 3.4% of the predicted value came from the error, and the mean absolute error analysis confirmed that the error is within the standard deviation of the predicted value. We also adopted the non-parametric Wilcoxon signed-rank test to assess the fit of the proposed model and found that the null hypothesis of uniformity was accepted at the 5% significance level. These results can be applied to more elaborate control of the power consumption of the air-conditioning system.
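The two error measures named in this abstract can be computed as in the sketch below. The abstract does not give their formulas, so the definitions here are assumptions: the mean absolute error is taken as the average of |actual - predicted|, and the mean relative error as the average of that quantity divided by |actual|. The readings are made up for illustration.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between observations and predictions."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_relative_error(actual, predicted):
    """Average of |actual - predicted| / |actual|; assumes no actual value is zero."""
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical power readings and predictions, not data from the paper.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
mae = mean_absolute_error(actual, predicted)  # (10 + 10 + 0) / 3
mre = mean_relative_error(actual, predicted)  # (0.10 + 0.05 + 0) / 3
```

A relative error is unit-free, which is why a figure such as 3.4% can be quoted without reference to the scale of the power readings.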

Deep-learning based In-situ Monitoring and Prediction System for the Organic Light Emitting Diode

  • Park, Il-Hoo;Cho, Hyeran;Kim, Gyu-Tae
    • Journal of the Semiconductor & Display Technology / v.19 no.4 / pp.126-129 / 2020
  • We introduce a lifetime assessment technique that uses a deep learning algorithm with complex electrical parameters, such as resistivity, permittivity, and impedance parameters, as integrated indicators for predicting the degradation of organic molecules. The evaluation system consists of a fully automated in-situ measurement system and a multilayer perceptron learning system with five hidden layers and 1011 perceptrons in each layer. Prediction accuracies are calculated and compared across physical features and learning hyperparameters. 62.5% of the full time-series data are used for training, giving a prediction accuracy (R-squared value) of 0.99. The remaining 37.5% of the data are used for testing, with a prediction accuracy of 0.95. With k-fold cross-validation, the stability against instantaneous changes in the measured data is also improved.
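The 62.5% / 37.5% division and the R-squared score mentioned in this abstract can be illustrated as below. This is a sketch under loud assumptions: a least-squares straight line stands in for the multilayer perceptron, the series is synthetic, and the assumption that the training portion is the earliest part of the series is ours, since the abstract does not say how the split was made.

```python
def fit_line(ts, ys):
    """Ordinary least-squares slope and intercept (stand-in for the MLP)."""
    n = len(ts)
    t_mean = sum(ts) / n
    y_mean = sum(ys) / n
    slope = (sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys))
             / sum((t - t_mean) ** 2 for t in ts))
    return slope, y_mean - slope * t_mean


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Synthetic series: linear trend plus a small alternating perturbation.
times = list(range(16))
values = [3 + 2 * t + (0.5 if t % 2 else -0.5) for t in times]

split = int(len(times) * 0.625)  # first 62.5% for training, the rest for testing
train_t, test_t = times[:split], times[split:]
train_y, test_y = values[:split], values[split:]

slope, intercept = fit_line(train_t, train_y)
r2_train = r_squared(train_y, [slope * t + intercept for t in train_t])
r2_test = r_squared(test_y, [slope * t + intercept for t in test_t])
```

Scoring the held-out portion separately is what lets the training and testing accuracies be reported as two different numbers, as the abstract does.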

QSO Selections Using Time Variability and Machine Learning

  • Kim, Dae-Won;Protopapas, Pavlos;Byun, Yong-Ik;Alcock, Charles;Khardon, Roni
    • The Bulletin of The Korean Astronomical Society / v.36 no.2 / pp.64-64 / 2011
  • We present a new quasi-stellar object (QSO) selection algorithm using a Support Vector Machine, a supervised classification method, on a set of extracted time series features including period, amplitude, color, and autocorrelation value. We train a model that separates QSOs from variable stars, non-variable stars, and microlensing events using 58 known QSOs, 1629 variable stars, and 4288 non-variables in the MAssive Compact Halo Object (MACHO) database as a training set. To estimate the efficiency and the accuracy of the model, we perform a cross-validation test using the training set. The test shows that the model correctly identifies ~80% of known QSOs with a 25% false-positive rate. The majority of the false positives are Be stars. We applied the trained model to the MACHO Large Magellanic Cloud (LMC) data set, which consists of 40 million lightcurves, and found 1620 QSO candidates. During the selection, none of the 33,242 known MACHO variables were misclassified as QSO candidates. In order to estimate the true false-positive rate, we crossmatched the candidates with astronomical catalogs including the Spitzer Surveying the Agents of a Galaxy's Evolution (SAGE) LMC catalog and a few X-ray catalogs. The results further suggest that the majority of the candidates, more than 70%, are QSOs.

Functional clustering for electricity demand data: A case study (시간단위 전력수요자료의 함수적 군집분석: 사례연구)

  • Yoon, Sanghoo;Choi, Youngjean
    • Journal of the Korean Data and Information Science Society / v.26 no.4 / pp.885-894 / 2015
  • Forecasting electricity demand is necessary for reliable and effective operation of the power system. In this study, we categorize functional data, namely the mean curves of the daily power demand pattern over time. The data were collected between January 1, 2009 and December 31, 2011. They were converted to time series data consisting of seasonal components and an error component through log transformation and trend removal. The functional clustering method of Ma et al. (2006) is applied, and the parameters are estimated using the EM algorithm and generalized cross-validation. The number of clusters is determined by classifying days as holidays or weekdays. The mean curves of the daily power demand pattern are described by Monday, weekday (Tuesday to Friday), Saturday, Sunday or holiday, and season.