• Title/Summary/Keyword: ensemble method


Predictive Analysis of Ethereum Uncle Block using Ensemble Machine Learning Technique and Blockchain Information (앙상블 머신러닝 기법과 블록체인 정보를 활용한 이더리움 엉클 블록 예측 분석)

  • Kim, Han-Min
    • Journal of Digital Convergence
    • /
    • v.18 no.11
    • /
    • pp.129-136
    • /
    • 2020
  • The advantages of Blockchain demonstrate its necessity in various fields, but Blockchain also has several drawbacks. Among them, the uncle block problem is one that can greatly hinder the value and utilization of Blockchain. Although the value of Blockchain may be degraded by the uncle block problem, previous studies paid little attention to uncle blocks. The purpose of this study is therefore to predict the occurrence of uncle blocks in order to anticipate and prepare for the uncle block problem of Blockchain. This study verifies the validity of introducing new attributes and ensemble analysis techniques for accurate prediction of uncle block occurrence. As a research method, voting, bagging, and stacking ensemble techniques were applied to Ethereum, where the uncle block problem actually occurs, using Blockchain information from both Ethereum and Bitcoin as analysis data. The results show that the best prediction was obtained when the voting and stacking ensemble techniques were applied using only Ethereum Blockchain information. This study contributes to more accurate prediction of uncle block occurrence and to preparing for the uncle block problem of Blockchain.
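
A minimal sketch of the voting and stacking setups described in this abstract, using scikit-learn; the input file, feature names, and base learners are illustrative assumptions, not the attributes used in the paper.

    # Sketch: voting and stacking ensembles for a binary target such as
    # uncle-block occurrence; data file and features are placeholders.
    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier, StackingClassifier, VotingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    # Hypothetical blockchain features, e.g. block time, gas used, difficulty.
    df = pd.read_csv("ethereum_blocks.csv")          # assumed file layout
    X, y = df.drop(columns=["uncle_occurred"]), df["uncle_occurred"]
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

    base = [("tree", DecisionTreeClassifier(max_depth=5)),
            ("forest", RandomForestClassifier(n_estimators=100)),
            ("logit", LogisticRegression(max_iter=1000))]

    voter = VotingClassifier(estimators=base, voting="soft").fit(X_tr, y_tr)
    stacker = StackingClassifier(estimators=base,
                                 final_estimator=LogisticRegression()).fit(X_tr, y_tr)
    print("voting accuracy:", voter.score(X_te, y_te))
    print("stacking accuracy:", stacker.score(X_te, y_te))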

A Study of The Determinants of Turnover Intention and Organizational Commitment by Data Mining (데이터마이닝을 활용한 이직의도와 조직몰입의 결정요인에 대한 연구)

  • Choi, Young Joon;Shim, Won Shul;Baek, Seung Hyun
    • Journal of the Korea Society for Simulation
    • /
    • v.23 no.1
    • /
    • pp.21-31
    • /
    • 2014
  • In this article, a data mining simulation is applied to find a proper approach to, and results of, analysis for the study of organization-related variables, with turnover intention and organizational commitment used as the target (dependent) variables. Classification and regression trees (CART) with ensemble methods are used for the simulation, applied to the Human Capital Corporate Panel data of the Korea Research Institute for Vocational Education & Training (KRIVET), collected in 2005, 2007, and 2009. Organizational commitment is analyzed with combined measure variables created after examining the reliability and unidimensionality of the multi-item measures. The results are as follows. First, the major determinants of turnover intention are trust, communication, and a talent-management-oriented trend. Second, the main determinants of organizational commitment are trust, years worked, innovation, and communication. Two ensemble CART variants are compared: CART with bagging and CART with arcing. Of the two, CART with arcing (Arc-x4) extracted scenarios with very high coefficients of determination, and a scenario with the maximum coefficient of determination and minimum error is obtained, from which practical implications are presented. Limitations and directions for future research are also discussed.
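
For orientation, bagged versus arced regression trees can be approximated with scikit-learn as below; AdaBoost stands in for Arc-x4 (which reweights by cumulative error counts rather than exponential loss), and the panel file and predictor names are placeholders.

    # Sketch: bagged vs. boosted CART for a continuous target such as
    # turnover intention; inputs are illustrative, AdaBoost approximates Arc-x4.
    import pandas as pd
    from sklearn.ensemble import AdaBoostRegressor, BaggingRegressor
    from sklearn.model_selection import cross_val_score
    from sklearn.tree import DecisionTreeRegressor

    panel = pd.read_csv("hcc_panel.csv")                    # assumed panel extract
    X = panel[["trust", "communication", "talent_mgmt"]]    # illustrative predictors
    y = panel["turnover_intention"]

    cart = DecisionTreeRegressor(max_depth=4)
    bagged = BaggingRegressor(estimator=cart, n_estimators=200, random_state=0)
    arced = AdaBoostRegressor(estimator=cart, n_estimators=200, random_state=0)

    for name, model in [("bagging", bagged), ("arcing (approx.)", arced)]:
        r2 = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
        print(name, "mean R^2:", round(r2, 3))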

A study on the measurement and characterization of turbulent flow inside an engine cylinder (엔진 실린더내 난류유동 측정과 정량화방법에 관한 연구)

  • 강건용;엄종호;김용선
    • Journal of the Korean Society of Automotive Engineers
    • /
    • v.14 no.6
    • /
    • pp.39-47
    • /
    • 1992
  • Engine combustion is one of the most important processes affecting performance and emissions. One effective way to improve combustion is to control the motion of the charge inside the cylinder through optimum induction system design, because the flame speed in a gasoline engine is mainly determined by turbulence. This paper describes the measurement and characterization of mean velocity and turbulence intensity inside the cylinder of a 4-valve gasoline engine using laser Doppler velocimetry (LDV) under motoring (non-firing) conditions. Since the measured LDV data in each cycle show small cycle-to-cycle variation during the compression stroke in the tested engine, the mean velocity and turbulence intensity are calculated by the ensemble averaging method, neglecting cycle variation effects. Within the ensemble averaging method, the effects on mean velocity and turbulence intensity of the calculation window, in which velocities are treated as occurring at the same crank angle, are fully investigated, along with the effects of the measuring point on the flow characteristics. With a large calculation window, the mean velocity becomes less sensitive to crank angle and the turbulence intensity decreases in absolute amplitude. As the piston approaches the top dead center of compression, the turbulence is found to be homogeneous in the cylinder.
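
The ensemble-averaging procedure itself is compact: samples from all cycles falling in the same crank-angle window are pooled, the pooled mean gives the ensemble mean velocity, and the r.m.s. of the fluctuations gives the turbulence intensity. A sketch under an assumed flat-array data layout:

    # Sketch of crank-angle-resolved ensemble averaging of LDV samples.
    # `crank_deg` and `velocity` are flat arrays pooled over all measured cycles.
    import numpy as np

    def ensemble_average(crank_deg, velocity, window_deg=2.0):
        """Mean velocity and turbulence intensity per crank-angle window."""
        edges = np.arange(0.0, 720.0 + window_deg, window_deg)
        centers, u_mean, u_rms = [], [], []
        for lo, hi in zip(edges[:-1], edges[1:]):
            in_win = (crank_deg >= lo) & (crank_deg < hi)
            if in_win.sum() < 2:
                continue
            u = velocity[in_win]
            centers.append(0.5 * (lo + hi))
            u_mean.append(u.mean())                 # ensemble mean velocity
            u_rms.append(u.std(ddof=1))             # turbulence intensity (r.m.s.)
        return np.array(centers), np.array(u_mean), np.array(u_rms)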


Parameter Tuning in Support Vector Regression for Large Scale Problems (대용량 자료에 대한 서포트 벡터 회귀에서 모수조절)

  • Ryu, Jee-Youl;Kwak, Minjung;Yoon, Min
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.25 no.1
    • /
    • pp.15-21
    • /
    • 2015
  • In support vector machines, the values of the parameters included in the kernels strongly affect generalization ability, and it is often difficult to determine appropriate values for those parameters in advance. Our previous studies have shown that the burden of choosing these parameter values in support vector regression can be reduced by utilizing ensemble learning. However, the straightforward application of this method to large scale problems is too time consuming. In this paper, we propose a method in which the original data set is decomposed into a certain number of sub data sets, in order to reduce the burden of parameter tuning in support vector regression, particularly for large scale and imbalanced data sets.
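
A rough sketch of the decompose-and-ensemble idea under stated assumptions (disjoint random chunks, an RBF kernel, and a small illustrative parameter grid; not the paper's exact procedure):

    # Sketch: fit SVR sub-models on disjoint chunks of a large data set
    # and average their predictions; grids and chunk count are placeholders.
    import numpy as np
    from sklearn.model_selection import GridSearchCV
    from sklearn.svm import SVR

    def chunked_svr_ensemble(X, y, n_chunks=5):
        models = []
        for idx in np.array_split(np.random.permutation(len(X)), n_chunks):
            grid = GridSearchCV(SVR(kernel="rbf"),
                                {"C": [1, 10, 100], "gamma": [0.01, 0.1, 1.0]},
                                cv=3)
            grid.fit(X[idx], y[idx])                # tune parameters per sub-set
            models.append(grid.best_estimator_)
        return models

    def predict_ensemble(models, X_new):
        # Average the sub-model predictions (simple ensemble combination).
        return np.mean([m.predict(X_new) for m in models], axis=0)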

A Suggestion for Data Assimilation Method of Hydrometeor Types Estimated from the Polarimetric Radar Observation

  • Yamaguchi, Kosei;Nakakita, Eiichi;Sumida, Yasuhiko
    • Proceedings of the Korea Water Resources Association Conference
    • /
    • 2009.05a
    • /
    • pp.2161-2166
    • /
    • 2009
  • For 0-6 hour nowcasting, it is important to provide a high-quality initial condition to a meso-scale atmospheric model through data assimilation of several kinds of observation data. Polarimetric radar data are expected to be assimilated into the forecast model because such radar can potentially measure the types, shapes, and size distributions of hydrometeors. In this paper, the impact on rainfall prediction of assimilating hydrometeor types (i.e. raindrop, graupel, snowflake, etc.) is evaluated. The observed hydrometeor types are estimated using a fuzzy logic algorithm. The cloud-resolving nonhydrostatic atmospheric model CReSS, which has detailed microphysical processes, is employed as the forecast model. The local ensemble transform Kalman filter (LETKF) is used as the data assimilation method; it uses an ensemble of short-term forecasts to estimate the flow-dependent background error covariance required in data assimilation. A heavy rainfall event that occurred in Okinawa in 2008 is chosen as an application. As a result, the rainfall prediction accuracy when hydrometeor types are assimilated together with the Doppler velocity and radar echo is improved compared with the no-assimilation case, and the effect of assimilating hydrometeor types appears at longer prediction lead times than the effect of assimilating radar echo only.
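
As a point of reference, the ensemble Kalman update at the core of such methods can be written in a few lines; the sketch below is a plain stochastic EnKF analysis step, not the localized, transformed LETKF used in the paper, and the observation operator is assumed linear.

    # Minimal stochastic EnKF analysis step (a simplification of LETKF):
    # each ensemble member is nudged toward perturbed observations using
    # the flow-dependent covariance estimated from the ensemble itself.
    import numpy as np

    def enkf_update(X, y_obs, H, R, rng=np.random.default_rng(0)):
        """X: (n_state, n_ens) forecast ensemble; y_obs: (n_obs,) observations;
        H: (n_obs, n_state) observation operator; R: (n_obs, n_obs) obs error cov."""
        n_ens = X.shape[1]
        A = X - X.mean(axis=1, keepdims=True)        # ensemble anomalies
        HA = H @ A
        P_HT = A @ HA.T / (n_ens - 1)                # cross covariance P H^T
        S = HA @ HA.T / (n_ens - 1) + R              # innovation covariance
        K = P_HT @ np.linalg.inv(S)                  # Kalman gain
        Y = y_obs[:, None] + rng.multivariate_normal(np.zeros(len(y_obs)), R, n_ens).T
        return X + K @ (Y - H @ X)                   # analysis ensemble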


Forecasting Day-ahead Electricity Price Using a Hybrid Improved Approach

  • Hu, Jian-Ming;Wang, Jian-Zhou
    • Journal of Electrical Engineering and Technology
    • /
    • v.12 no.6
    • /
    • pp.2166-2176
    • /
    • 2017
  • Electricity price prediction plays a crucial part in scheduling and risk management for competitive electricity market participants. However, it is a difficult and challenging task owing to the nonlinearity, non-stationarity, and uncertainty of the price series. This study proposes a hybrid improved strategy which incorporates data preprocessing components and a forecasting engine to enhance the forecasting accuracy of the electricity price. In the developed forecasting procedure, the Seasonal Adjustment (SA) method and the Ensemble Empirical Mode Decomposition (EEMD) technique are combined as the data preprocessing component, while the Coupled Simulated Annealing (CSA) optimization method and the Least Square Support Vector Regression (LSSVR) algorithm constitute the prediction engine. The proposed hybrid approach is verified with electricity price data sampled from the power market of New South Wales in Australia. The simulation outcomes show that the proposed hybrid approach achieves an observable improvement in forecasting accuracy over other approaches, suggesting that the combined approach has preferable prediction ability and sufficient precision.
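
The decompose-forecast-recombine structure can be sketched as follows, assuming the PyEMD (EMD-signal) package for EEMD and using a plain SVR with fixed hyperparameters in place of the CSA-tuned LSSVR; the lag length is arbitrary.

    # Sketch of the decompose-forecast-recombine idea: EEMD splits the price
    # series into IMFs, each IMF is forecast separately, and the forecasts are
    # summed. PyEMD is an assumed dependency; SVR stands in for LSSVR.
    import numpy as np
    from PyEMD import EEMD
    from sklearn.svm import SVR

    def lagged(series, n_lags=24):
        # Build a lag matrix: row j holds series[j : j + n_lags], target is the next value.
        X = np.column_stack([series[i:len(series) - n_lags + i] for i in range(n_lags)])
        return X, series[n_lags:]

    def hybrid_forecast(price, n_lags=24):
        imfs = EEMD().eemd(price)                    # ensemble empirical mode decomposition
        next_step = 0.0
        for imf in imfs:
            X, y = lagged(imf, n_lags)
            model = SVR(kernel="rbf", C=10.0).fit(X, y)
            next_step += model.predict(imf[-n_lags:].reshape(1, -1))[0]
        return next_step                             # one-step-ahead price forecast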

Multiscale approach to predict the effective elastic behavior of nanoparticle-reinforced polymer composites

  • Kim, B.R.;Pyo, S.H.;Lemaire, G.;Lee, H.K.
    • Interaction and multiscale mechanics
    • /
    • v.4 no.3
    • /
    • pp.173-185
    • /
    • 2011
  • A multiscale modeling scheme that addresses the influence of the nanoparticle size in nanocomposites consisting of nano-sized spherical particles embedded in a polymer matrix is presented. A micromechanics-based constitutive model for nanoparticle-reinforced polymer composites is derived by incorporating the Eshelby tensor considering the interface effects (Duan et al. 2005a) into the ensemble-volume average method (Ju and Chen 1994). A numerical investigation is carried out to validate the proposed micromechanics-based constitutive model, and a parametric study on the interface moduli is conducted to investigate the effect of interface moduli on the overall behavior of the composites. In addition, molecular dynamics (MD) simulations are performed to determine the mechanical properties of the nanoparticles and polymer. Finally, the overall elastic moduli of the nanoparticle-reinforced polymer composites are estimated using the proposed multiscale approach combining the ensemble-volume average method and the MD simulation. The predictive capability of the proposed multiscale approach has been demonstrated through the multiscale numerical simulations.
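
For comparison, the classical Mori-Tanaka estimate for a matrix with spherical particles, which the ensemble-volume average approach generalizes (and without the interface terms central to this paper), can be coded directly; the moduli below are illustrative values only.

    # Classical Mori-Tanaka estimate of effective bulk/shear moduli for a
    # polymer matrix with spherical particles; interface effects are omitted.
    def mori_tanaka(K_m, G_m, K_p, G_p, phi):
        """K_*: bulk moduli, G_*: shear moduli (GPa); phi: particle volume fraction."""
        # Auxiliary matrix factors for spherical inclusions.
        alpha = 3.0 * K_m / (3.0 * K_m + 4.0 * G_m)
        beta = 6.0 * (K_m + 2.0 * G_m) / (5.0 * (3.0 * K_m + 4.0 * G_m))
        K_eff = K_m + phi * (K_p - K_m) / (1.0 + (1.0 - phi) * alpha * (K_p - K_m) / K_m)
        G_eff = G_m + phi * (G_p - G_m) / (1.0 + (1.0 - phi) * beta * (G_p - G_m) / G_m)
        return K_eff, G_eff

    # Illustrative values only (matrix roughly epoxy-like, particles silica-like).
    print(mori_tanaka(K_m=4.0, G_m=1.5, K_p=36.0, G_p=31.0, phi=0.1))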

Developing efficient model updating approaches for different structural complexity - an ensemble learning and uncertainty quantifications

  • Lin, Guangwei;Zhang, Yi;Liao, Qinzhuo
    • Smart Structures and Systems
    • /
    • v.29 no.2
    • /
    • pp.321-336
    • /
    • 2022
  • Model uncertainty is a key factor that can influence the accuracy and reliability of numerical model-based analysis, so an appropriate updating approach is needed that can identify realistic model parameter values from measurements. In this paper, Bayesian model updating theory combined with the transitional Markov chain Monte Carlo (TMCMC) method and K-means cluster analysis is utilized to update the structural model parameters. Kriging and polynomial chaos expansion (PCE) are employed to generate surrogate models that reduce the computational burden in TMCMC. The selected updating approaches are applied to three structural examples of different complexity: a two-storey frame, a ten-storey frame, and the national stadium model, representing a low-dimensional linear model, a high-dimensional linear model, and a nonlinear model, respectively. The performance of updating in these three models is assessed in terms of prediction uncertainty, numerical effort, and prior information. The study also investigates updating scenarios using the analytical approach and the surrogate models, and the uncertainty quantification in the Bayesian approach is further discussed to verify the validity and accuracy of the surrogate models. Finally, the advantages and limitations of the surrogate-model-based updating approaches are discussed for different structural complexity, and the possibility of using a boosting algorithm as an ensemble learning method to improve the surrogate models is presented.
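
A stripped-down sketch of surrogate-assisted Bayesian updating: a Gaussian-process (Kriging) surrogate trained on a few runs of an expensive model replaces it inside a plain Metropolis sampler. TMCMC, PCE, and the K-means step are omitted, and the one-parameter "structural model" is purely illustrative.

    # Sketch: Kriging surrogate of an expensive model inside a Metropolis
    # sampler for Bayesian parameter updating (TMCMC/K-means omitted).
    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF

    def expensive_model(theta):                      # placeholder structural model
        return np.sin(theta) + 0.1 * theta**2

    rng = np.random.default_rng(0)
    theta_train = np.linspace(-3, 3, 15)[:, None]    # design points for the surrogate
    surrogate = GaussianProcessRegressor(kernel=RBF()).fit(
        theta_train, expensive_model(theta_train.ravel()))

    y_meas, sigma = 0.8, 0.05                        # assumed measurement and noise level

    def log_post(theta):                             # Gaussian likelihood, flat prior
        pred = surrogate.predict(np.atleast_2d(theta))[0]
        return -0.5 * ((y_meas - pred) / sigma) ** 2

    samples, theta = [], 0.0
    for _ in range(5000):                            # plain Metropolis random walk
        prop = theta + 0.2 * rng.standard_normal()
        if np.log(rng.random()) < log_post(prop) - log_post(theta):
            theta = prop
        samples.append(theta)
    print("posterior mean of theta:", np.mean(samples[1000:]))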

Bankruptcy prediction using an improved bagging ensemble (개선된 배깅 앙상블을 활용한 기업부도예측)

  • Min, Sung-Hwan
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.4
    • /
    • pp.121-139
    • /
    • 2014
  • Predicting corporate failure has been an important topic in accounting and finance. The costs associated with bankruptcy are high, so the accuracy of bankruptcy prediction is greatly important for financial institutions, and many researchers have addressed the topic over the past three decades. The current research attempts to use ensemble models to improve the performance of bankruptcy prediction. Ensemble classification combines individually trained classifiers to obtain more accurate predictions than individual models, and ensemble techniques have been shown to be very useful for improving the generalization ability of a classifier. Bagging is the most commonly used method for constructing ensemble classifiers: different training data subsets are randomly drawn with replacement from the original training data set, and base classifiers are trained on the different bootstrap samples. Instance selection selects critical instances while removing irrelevant and harmful instances from the original set. Instance selection and bagging are both well known in data mining, but few studies have dealt with their integration. This study proposes an improved bagging ensemble based on instance selection using a genetic algorithm (GA) to improve the performance of SVM. GA is an efficient optimization procedure based on the theory of natural selection and evolution; it applies the idea of survival of the fittest by progressively accepting better solutions and searches by maintaining a population of solutions from which better solutions are created, rather than by making incremental changes to a single solution. The initial solution population is generated randomly and evolves into the next generation through genetic operators such as selection, crossover, and mutation, and the solutions, coded as strings, are evaluated by a fitness function. The proposed model consists of two phases: GA-based instance selection and instance-based bagging. In the first phase, GA is used to select the optimal instance subset that serves as input data for the bagging model; the chromosome is encoded as a binary string representing the instance subset. In this phase, the population size was set to 100, the maximum number of generations to 150, and the crossover and mutation rates to 0.7 and 0.1, respectively. The prediction accuracy of the model was used as the fitness function of the GA: an SVM model is trained on the training data using the selected instance subset, and its prediction accuracy over the test data set is used as the fitness value in order to avoid overfitting. In the second phase, the optimal instance subset selected in the first phase is used as input data for the bagging model, with SVM as the base classifier and majority voting as the combining method. This study applies the proposed model to the bankruptcy prediction problem using a real data set of Korean companies. The research data contain 1832 externally non-audited firms, of which 916 filed for bankruptcy and 916 did not. Financial ratios categorized as stability, profitability, growth, activity, and cash flow were investigated through a literature review and basic statistical methods, and 8 financial ratios were selected as the final input variables. The whole data set was separated into training, test, and validation subsets. We compared the proposed model with several comparative models, including a simple individual SVM model, a simple bagging model, and an instance-selection-based SVM model, and used McNemar tests to examine whether the proposed model significantly outperforms the other models. The experimental results show that the proposed model outperforms the other models.
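
A compact sketch of the two-phase design: a simple GA searches over binary instance-selection masks with the validation accuracy of an SVM as the fitness, and the selected instances then feed a bagging ensemble of SVMs. The GA operators used here (truncation selection, uniform crossover, bit-flip mutation) and the reduced population and generation counts are illustrative simplifications of the paper's setup.

    # Sketch of GA-based instance selection followed by SVM bagging.
    import numpy as np
    from sklearn.ensemble import BaggingClassifier
    from sklearn.svm import SVC

    def ga_instance_selection(X_tr, y_tr, X_val, y_val, pop=20, gens=30,
                              p_cross=0.7, p_mut=0.1, rng=np.random.default_rng(0)):
        n = len(X_tr)
        population = rng.random((pop, n)) < 0.5          # binary chromosomes

        def fitness(mask):
            if mask.sum() < 10 or len(np.unique(y_tr[mask])) < 2:
                return 0.0
            clf = SVC().fit(X_tr[mask], y_tr[mask])
            return clf.score(X_val, y_val)               # validation accuracy

        for _ in range(gens):
            scores = np.array([fitness(m) for m in population])
            parents = population[np.argsort(scores)[-pop // 2:]]   # truncation selection
            children = []
            while len(children) < pop - len(parents):
                a, b = parents[rng.integers(len(parents), size=2)]
                child = np.where(rng.random(n) < 0.5, a, b) if rng.random() < p_cross else a.copy()
                flip = rng.random(n) < p_mut              # bit-flip mutation
                children.append(np.where(flip, ~child, child))
            population = np.vstack([parents, children])
        return population[np.argmax([fitness(m) for m in population])]

    # Hypothetical usage mirroring the paper's train/validation/test design:
    # mask = ga_instance_selection(X_tr, y_tr, X_val, y_val)
    # bag = BaggingClassifier(estimator=SVC(), n_estimators=30).fit(X_tr[mask], y_tr[mask])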

Korean Flood Vulnerability Assessment on Climate Change (기후변화에 따른 국내 홍수 취약성 평가)

  • Lee, Moon-Hwan;Jung, Il-Won;Bae, Deg-Hyo
    • Journal of Korea Water Resources Association
    • /
    • v.44 no.8
    • /
    • pp.653-666
    • /
    • 2011
  • The purposes of this study are to propose a flood vulnerability assessment method under climate change, to evaluate the method over the five major river basins, and to present the uncertainty range of the assessment using multi-model ensemble scenarios. In this study, data related to past flood events were collected, a flood vulnerability index was calculated, and the vulnerability assessment was performed under the current climate system. For future climate change, 39 climate scenarios were obtained from 3 emission scenarios and 13 GCMs provided by the IPCC DDC, and 312 hydrology scenarios were generated from 3 hydrological models and 2~3 potential evapotranspiration computation methods applied to those climate scenarios. Finally, the spatial and temporal changes of flood vulnerability and the range of uncertainty were evaluated for the future S1 (2010~2039), S2 (2040~2069), and S3 (2070~2099) periods relative to the reference S0 (1971~2000) period. The results show that, under the current climate system, the vulnerable regions were the Han, Sumjin, and Youngsan river basins. Considering the climate scenarios, the variability in the Nakdong, Gum, and Han river basins is large, while the Sumjin river basin shows little variability owing to its low baseline capacity for adaptation.