• Title/Abstract/Keywords: ensemble technique

216 search results (processing time: 0.026 s)

Ensemble Design of Machine Learning Techniques: Experimental Verification by Prediction of Drifter Trajectory (앙상블을 이용한 기계학습 기법의 설계: 뜰개 이동경로 예측을 통한 실험적 검증)

  • Lee, Chan-Jae;Kim, Yong-Hyuk
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology / Vol. 8, No. 3 / pp.57-67 / 2018
  • An ensemble is a unified approach that combines multiple algorithms to obtain better performance in machine learning. In this paper, we introduce boosting and bagging, which are widely used ensemble techniques, and design a method using support vector regression, a radial basis function network, a Gaussian process, and a multilayer perceptron. In addition, our experiment was extended by adding a recurrent neural network and the MOHID numerical model. The drifter data used for experimental verification consist of 683 observations in seven regions. The performance of our ensemble technique is verified by comparison with each of the four algorithms, with mean absolute error adopted as the verification measure. The presented methods are ensemble models based on bagging, boosting, and machine learning. The error rate was calculated by assigning equal weights and different weights to each unit model in the ensemble. The ensemble model using machine learning showed a 61.7% improvement over the average of the four machine learning techniques.
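The equal-weight versus different-weight comparison described in this abstract can be illustrated with a minimal Python sketch. The function names, toy predictions, and weight values below are hypothetical and are not taken from the paper; the sketch only shows how a weighted average of unit-model outputs is scored with mean absolute error.

```python
def mean_absolute_error(y_true, y_pred):
    """Average of |truth - prediction| over all samples."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_ensemble(predictions, weights=None):
    """Combine per-model prediction lists into one prediction list.

    predictions: list of lists, one inner list per unit model.
    weights: one non-negative weight per model; None means equal weights.
    """
    if weights is None:
        weights = [1.0] * len(predictions)
    total = sum(weights)
    return [
        sum(w * preds[i] for w, preds in zip(weights, predictions)) / total
        for i in range(len(predictions[0]))
    ]


# Hypothetical toy data (not from the paper).
y_true = [1.0, 2.0, 3.0]
model_a = [1.5, 2.5, 3.5]  # biased high by 0.5
model_b = [0.5, 1.5, 2.5]  # biased low by 0.5
model_c = [1.0, 2.0, 4.0]  # exact except for the last sample

equal = weighted_ensemble([model_a, model_b, model_c])
weighted = weighted_ensemble([model_a, model_b, model_c], weights=[1, 1, 2])

mae_equal = mean_absolute_error(y_true, equal)
mae_weighted = mean_absolute_error(y_true, weighted)
```

In this toy case the opposite biases of the first two models cancel under equal weights, so giving extra weight to the third model raises the error; which weighting wins depends entirely on the data.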

An Ensemble Cascading Extremely Randomized Trees Framework for Short-Term Traffic Flow Prediction

  • Zhang, Fan;Bai, Jing;Li, Xiaoyu;Pei, Changxing;Havyarimana, Vincent
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 13, No. 4 / pp.1975-1988 / 2019
  • Short-term traffic flow prediction plays an important role in intelligent transportation systems (ITS) in areas such as transportation management, traffic control and guidance. For short-term traffic flow regression predictions, the main challenge stems from the non-stationary property of traffic flow data. In this paper, we design an ensemble cascading prediction framework, called EET, based on extremely randomized trees (extra-trees) and a boosting technique, to predict short-term traffic flow in non-stationary environments. Extra-trees is a tree-based ensemble method. It essentially consists of strongly randomizing both the attribute and cut-point choices while splitting a tree node. This mechanism reduces the variance of the model and is, therefore, more suitable for traffic flow regression prediction in non-stationary environments. Moreover, the extra-trees algorithm uses boosting-based ensemble averaging to improve the predictive accuracy and control overfitting. To the best of our knowledge, this is the first time that extra-trees have been used as fundamental building blocks in boosting committee machines. The proposed approach predicts 5 min in advance using real-time traffic flow data while inherently considering temporal and spatial correlations. Experiments demonstrate that the proposed method achieves higher accuracy, lower variance, and lower computational complexity than existing methods.
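The node-splitting step mentioned in this abstract, randomizing both the attribute and the cut-point, can be sketched in a few lines of Python. This is the simplest possible case, with a single random attribute per node; extra-trees as usually described draws several candidate attributes and keeps the best-scoring random split. The function name and toy rows are hypothetical, and the cascading and boosting parts of the paper's framework are not covered.

```python
import random


def random_split(rows, rng):
    """Pick one attribute and one cut-point uniformly at random.

    rows: list of equal-length numeric feature lists reaching this node.
    Returns (feature_index, threshold, left_rows, right_rows).
    """
    feature = rng.randrange(len(rows[0]))  # random attribute choice
    values = [row[feature] for row in rows]
    # Random cut-point between the smallest and largest value at this node.
    threshold = rng.uniform(min(values), max(values))
    left = [row for row in rows if row[feature] < threshold]
    right = [row for row in rows if row[feature] >= threshold]
    return feature, threshold, left, right


# Hypothetical toy node with two attributes per row.
rows = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]
feature, threshold, left, right = random_split(rows, random.Random(0))
```

Because no search over cut-points is performed, each split is cheap, and averaging many such trees is what reduces the variance.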

A Comparison of Ensemble Methods Combining Resampling Techniques for Class Imbalanced Data (데이터 전처리와 앙상블 기법을 통한 불균형 데이터의 분류모형 비교 연구)

  • Lee, Hee-Jae;Lee, Sungim
    • The Korean Journal of Applied Statistics / Vol. 27, No. 3 / pp.357-371 / 2014
  • There are many studies related to imbalanced data in which the class distribution is highly skewed. To address the problem of imbalanced data, previous studies deal with resampling techniques which correct the skewness of the class distribution in each sampled subset by using under-sampling, over-sampling or hybrid-sampling such as SMOTE. Ensemble methods have also alleviated the problem of class imbalanced data. In this paper, we compare around a dozen algorithms that combine ensemble methods and resampling techniques, based on simulated data sets generated by the Backbone model, which can handle the imbalance rate. Results on various real imbalanced data sets are also presented to compare the effectiveness of the algorithms. As a result, we highly recommend combining resampling techniques with ensemble methods for imbalanced data in which the proportion of the minority class is less than 10%. We also find that each ensemble method has a well-matched sampling technique. The algorithms which combine bagging or random forest ensembles with random undersampling tend to perform well; however, the boosting ensemble appears to perform better with over-sampling. All ensemble methods combined with SMOTE perform best in most situations.
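One of the combinations named in this abstract, bagging with random undersampling, can be sketched as follows. This is a minimal Python illustration under stated assumptions: each bag keeps every minority sample and draws an equally sized random subset of the majority class. The function name, the 5% imbalance rate, and the bag count are hypothetical, and the paper's exact procedure may differ.

```python
import random


def balanced_bags(samples, labels, minority_label, n_bags, seed=0):
    """Build class-balanced training subsets for an ensemble.

    Each bag contains all minority samples plus an equally sized random
    draw (without replacement) from the majority class.
    """
    rng = random.Random(seed)
    minority = [(s, y) for s, y in zip(samples, labels) if y == minority_label]
    majority = [(s, y) for s, y in zip(samples, labels) if y != minority_label]
    bags = []
    for _ in range(n_bags):
        kept = rng.sample(majority, len(minority))  # random undersampling
        bags.append(minority + kept)
    return bags


# Hypothetical data: 100 samples, 5% minority class (label 1).
samples = list(range(100))
labels = [1 if i < 5 else 0 for i in samples]
bags = balanced_bags(samples, labels, minority_label=1, n_bags=3)
```

One base classifier would then be trained per bag and the predictions combined, so that every majority sample still has a chance to be seen by some member of the ensemble.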

Estimation of optimal runoff hydrograph using radar rainfall ensemble and blending technique of rainfall-runoff models (레이더 강우 앙상블과 유출 블랜딩 기법을 이용한 최적 유출 수문곡선 산정)

  • Lee, Myungjin;Kang, Narae;Kim, Jongsung;Kim, Hung Soo
    • Journal of Korea Water Resources Association / Vol. 51, No. 3 / pp.221-233 / 2018
  • Recently, flood damage from localized heavy rainfall and typhoons has occurred frequently due to climate change. Accurate rainfall forecasts and flood runoff estimates are needed to reduce such damage. However, uncertainties are involved in gauge rainfall, radar rainfall, and the runoff hydrographs estimated from rainfall-runoff models. Therefore, the purpose of this study is to identify the uncertainty of rainfall by generating a probabilistic radar rainfall ensemble and to confirm the uncertainties of hydrological models by analyzing the runoffs simulated by the models. The blending technique is used to estimate a single integrated, or optimal, runoff hydrograph from the runoffs simulated by multiple rainfall-runoff models. The radar ensemble is underestimated due to the influence of rainfall intensity and topography, and the uncertainty of the rainfall ensemble is large. This study will be helpful in estimating and predicting runoff accurately in order to prepare for disasters caused by heavy rainfall.

Minimization of Motion Artifact During Exercise in Impedance Cardiography (임피던스 심장기록법에서 운동으로 인한 Motion Artifact의 최소화)

  • Kim, Jung-Chan;Kim, Jeong-Yeol;Kim, Deok-Won;Youn, Dae-Hee
    • Proceedings of the KOSOMBE Conference / Korean Society of Medical and Biological Engineering, 1989 Spring Conference / pp.71-73 / 1989
  • The origins of the motion artifact resulting from exercise in impedance cardiography were explained, and the ensemble average technique was applied to reduce the motion artifact, enabling the measurement of cardiac output during exercise. An algorithm for ensemble averaging was developed and applied to actual impedance signals. It was found that the minimum number of sampling was 20, and the sampling frequency was 500 Hz. Using the ensemble average technique, it was possible to measure cardiac output continuously during treadmill exercise. Therefore, it is hoped that this study may contribute to the areas of exercise physiology and sports medicine.
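Ensemble averaging, as used in this abstract, means averaging repeated, time-aligned recordings point by point so that uncorrelated noise cancels while the repeating waveform remains. The Python sketch below illustrates the idea on a synthetic sine template; the template, noise level, and segment count of 20 are hypothetical choices for illustration and do not reproduce the paper's algorithm or data.

```python
import math
import random


def ensemble_average(segments):
    """Point-by-point mean of equally long, time-aligned segments."""
    n = len(segments)
    return [sum(seg[i] for seg in segments) / n for i in range(len(segments[0]))]


# Hypothetical synthetic data: one clean cycle plus independent noise.
rng = random.Random(42)
template = [math.sin(2 * math.pi * i / 50) for i in range(50)]
segments = [[v + rng.gauss(0, 0.5) for v in template] for _ in range(20)]
averaged = ensemble_average(segments)


def rms_error(signal):
    """Root-mean-square deviation of a signal from the clean template."""
    return math.sqrt(
        sum((s - t) ** 2 for s, t in zip(signal, template)) / len(template)
    )


single_error = rms_error(segments[0])
averaged_error = rms_error(averaged)
```

For zero-mean noise that is independent across segments, averaging N segments reduces the noise standard deviation by roughly a factor of the square root of N, which is why `averaged_error` comes out well below `single_error` here.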

Algorithm detecting an evoked potential using the ensemble averaged bispectrum (The ensemble averaged bispectrum을 이용한 유발전위 검출 알고리즘)

  • Choi, J.M.;Bae, B.H.;Kim, S.Y.
    • Proceedings of the KOSOMBE Conference / Korean Society of Medical and Biological Engineering, 1994 Fall Conference / pp.124-127 / 1994
  • A technique based on bispectrum averaging is described for recovering the signal waveform from a set of noisy signals with variable signal delay. The technique requires neither explicit time alignment of the signals nor an initial estimate of the signal. The new method is proposed and compared with other methods. The methods are numerically investigated using computer-generated data and a physiological signal and noise. Some experimental results from evoked potential studies that demonstrate the technique are given. The results show the effectiveness of the technique, and various potential applications might be expected.

Noise Reduction Technique by Three-Points Ensemble Averaging in Uroflowmetry (삼점 신호 평균기법에 의한 요속신호의 잡음 축소 기법)

  • Choi, Seong-Su;Lee, In-Kwang;Lee, Sang-Bong;Park, Jun-Oh;Lee, Su-Ok;Cha, Eun-Jong;Kim, Kyung-Ah
    • The Transactions of The Korean Institute of Electrical Engineers / Vol. 58, No. 8 / pp.1638-1643 / 2009
  • Uroflowmetry is a convenient clinical test to screen for benign prostatic hyperplasia (BPH), which is common in aged men. A load cell is located beneath the urine container to measure the weight of urine. However, it is sensitive to the impact applied to the bottom of the container by the urine stream, which can be a noise source that lowers the reliability of the system. To address this, our study proposed a noise reduction technique that computes the ensemble average of the weighted signals acquired from three load cells forming a regular triangle beneath the urine container. A simulated urination experiment was performed with three different collection methods, all of which demonstrated significant noise reduction by ensemble averaging. Furthermore, the best results can be obtained without any special urine collection devices. Thus, our novel method can be usefully applied to uroflowmetry to enhance measurement accuracy and reliability.

Measurement of cardiac output during treadmill exercise by impedance cardiography with a new ensemble average (새로운 앙상블 평균법에 의한 임피던스 심장기록법의 트래드밀 운동 중의 심박출량 측정)

  • Kim, Deok-W.;Song, Chul-G.;Oh, In-S.;Hwang, Soo-K.;Kim, Won-K.
    • Proceedings of the KOSOMBE Conference / Korean Society of Medical and Biological Engineering, 1990 Spring Conference / pp.7-8 / 1990
  • In this study, a new ensemble average technique was developed to measure cardiac output during treadmill exercise. Each dZ/dt peak (C point) was used as the starting point for ensemble averaging, instead of the conventionally used R wave of the ECG, in order to prevent the peak of the dZ/dt waveform from blurring. When the R wave is used as the reference, the time interval from the R wave to the peak of dZ/dt varies for each heartbeat. Stroke volume, heart rate, and cardiac output of five male subjects were successfully measured with the Balke protocol using the new ensemble average technique.
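The blurring effect described in this abstract can be reproduced with a small Python sketch: when the peak position jitters relative to the reference point, a fixed-reference average flattens the peak, whereas aligning each beat on its own peak preserves it. The triangular pulses, the window width, and the function names are hypothetical; the sketch windows symmetrically around the peak, which is an assumption and not necessarily how the paper segments each beat.

```python
def peak_aligned_average(beats, half_width):
    """Average windows centered on each beat's own maximum."""
    windows = []
    for beat in beats:
        peak = max(range(len(beat)), key=beat.__getitem__)
        if peak - half_width < 0 or peak + half_width >= len(beat):
            continue  # skip beats whose window would run off the record
        windows.append(beat[peak - half_width : peak + half_width + 1])
    if not windows:
        raise ValueError("no beat had a complete window around its peak")
    n = len(windows)
    return [sum(w[i] for w in windows) / n for i in range(2 * half_width + 1)]


def pulse(length, center):
    """Hypothetical triangular pulse of height 3 centered at `center`."""
    return [max(0.0, 3.0 - abs(i - center)) for i in range(length)]


# Three beats whose peak arrives at a different delay after the reference.
beats = [pulse(20, c) for c in (8, 10, 12)]

# Fixed-reference averaging: every beat starts at index 0.
naive = [sum(b[i] for b in beats) / len(beats) for i in range(20)]

# Peak-referenced averaging: every beat is re-centered on its maximum.
aligned = peak_aligned_average(beats, half_width=3)
```

Here `max(naive)` falls below the true pulse height of 3 because the jittered peaks smear across neighboring samples, while `max(aligned)` keeps the full height.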

Sentiment analysis of online food product review using ensemble technique (앙상블 기법을 활용한 온라인 음식 상품 리뷰 감성 분석)

  • Kim, Han-Min;Park, Kyungbo
    • Journal of Digital Convergence / Vol. 17, No. 4 / pp.115-122 / 2019
  • In the online marketplace, consumers are exposed to various products and freely express their opinions. Because consumer product reviews have an important effect on the success of online markets and on other consumers, online markets need to accurately analyze consumers' emotions about their products. Text mining, one of the data analysis techniques, can analyze consumers' product reviews and help manage products efficiently. Previous studies have analyzed specific domains and fewer than 20,000 data points, even though the accuracy of analysis results differs depending on the data domain and size. Further, there are few studies on additional factors that can improve the accuracy of analysis. This study used an ensemble technique to analyze 72,530 reviews from the food product domain, which was not mainly covered in previous studies. We also examined the influence of the summary review on improving the accuracy of analysis. The study found that the boosting ensemble technique has the highest accuracy of analysis. In addition, the summary review contributed to improving the accuracy of the analysis.

Developing an Ensemble Classifier for Bankruptcy Prediction (부도 예측을 위한 앙상블 분류기 개발)

  • Min, Sung-Hwan
    • Journal of Korea Society of Industrial Information Systems / Vol. 17, No. 7 / pp.139-148 / 2012
  • An ensemble of classifiers employs a set of individually trained classifiers and combines their predictions. It has been found that in most cases ensembles produce more accurate predictions than the base classifiers. Combining outputs from multiple classifiers, known as ensemble learning, is one of the standard and most important techniques for improving classification accuracy in machine learning. An ensemble of classifiers is effective only if the individual classifiers make decisions that are as diverse as possible. Bagging is the most popular ensemble learning method for generating a diverse set of classifiers. Diversity in bagging is obtained by using different training sets: the training data subsets are randomly drawn with replacement from the entire training dataset. The random subspace method is an ensemble construction technique that uses different attribute subsets. In the random subspace method, the training dataset is also modified as in bagging; however, this modification is performed in the feature space. Bagging and random subspace are well-known and popular ensemble algorithms. However, few studies have dealt with the integration of bagging and random subspace using SVM classifiers, though there is great potential for useful applications in this area. The focus of this paper is to propose methods for improving SVM performance using a hybrid ensemble strategy for bankruptcy prediction. This paper applies the proposed ensemble model to the bankruptcy prediction problem using a real data set from Korean companies.
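The two sources of diversity described in this abstract, bootstrap sampling of rows (bagging) and random selection of attributes (random subspace), can be combined as in the Python sketch below. It only generates the per-member views of the data and a majority vote; no SVM is trained, and the function names, sizes, and seed are hypothetical rather than taken from the paper.

```python
import random
from collections import Counter


def hybrid_views(n_samples, n_features, n_kept_features, n_members, seed=0):
    """Return one (row_indices, feature_indices) pair per ensemble member.

    Rows are drawn with replacement (bagging); features are drawn
    without replacement (random subspace).
    """
    rng = random.Random(seed)
    views = []
    for _ in range(n_members):
        rows = [rng.randrange(n_samples) for _ in range(n_samples)]
        cols = sorted(rng.sample(range(n_features), n_kept_features))
        views.append((rows, cols))
    return views


def majority_vote(predictions):
    """Most frequent label among member predictions (ties: first seen)."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical sizes: 8 training rows, 6 attributes, 3 kept per member.
views = hybrid_views(n_samples=8, n_features=6, n_kept_features=3, n_members=5)
```

Each member would be trained on `data[rows][:, cols]` (in array notation) and queried on the same feature subset, after which `majority_vote` combines the member labels into the ensemble decision.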