• Title/Summary/Keyword: Ensemble Voting

Search Result 50, Processing Time 0.022 seconds

Dynamic Web Information Predictive System Using Ensemble Support Vector Machine (앙상블 SVM을 이용한 동적 웹 정보 예측 시스템)

  • Park, Chang-Hee;Yoon, Kyung-Bae
    • The KIPS Transactions:PartB
    • /
    • v.11B no.4
    • /
    • pp.465-470
    • /
    • 2004
  • Web Information Predictive Systems have the restriction such as they need users profiles and visible feedback information for obtaining the necessary information. For overcoming this restrict, this study designed and implemented Dynamic Web Information Predictive System using Ensemble Support Vector Machine to be able to predict the web information and provide the relevant information every user needs most by click stream data and user feedback information, which have some clues based on the data. The result of performance test using Dynamic Web Information Predictive System using Ensemble Support Vector Machine against the existing Web Information Predictive System has preyed that this study s method is an excellence solution.

Rockfall Source Identification Using a Hybrid Gaussian Mixture-Ensemble Machine Learning Model and LiDAR Data

  • Fanos, Ali Mutar;Pradhan, Biswajeet;Mansor, Shattri;Yusoff, Zainuddin Md;Abdullah, Ahmad Fikri bin;Jung, Hyung-Sup
    • Korean Journal of Remote Sensing
    • /
    • v.35 no.1
    • /
    • pp.93-115
    • /
    • 2019
  • The availability of high-resolution laser scanning data and advanced machine learning algorithms has enabled an accurate potential rockfall source identification. However, the presence of other mass movements, such as landslides within the same region of interest, poses additional challenges to this task. Thus, this research presents a method based on an integration of Gaussian mixture model (GMM) and ensemble artificial neural network (bagging ANN [BANN]) for automatic detection of potential rockfall sources at Kinta Valley area, Malaysia. The GMM was utilised to determine slope angle thresholds of various geomorphological units. Different algorithms(ANN, support vector machine [SVM] and k nearest neighbour [kNN]) were individually tested with various ensemble models (bagging, voting and boosting). Grid search method was adopted to optimise the hyperparameters of the investigated base models. The proposed model achieves excellent results with success and prediction accuracies at 95% and 94%, respectively. In addition, this technique has achieved excellent accuracies (ROC = 95%) over other methods used. Moreover, the proposed model has achieved the optimal prediction accuracies (92%) on the basis of testing data, thereby indicating that the model can be generalised and replicated in different regions, and the proposed method can be applied to various landslide studies.

Ensemble Learning of Region Based Classifiers (지역 기반 분류기의 앙상블 학습)

  • Choi, Sung-Ha;Lee, Byung-Woo;Yang, Ji-Hoon
    • The KIPS Transactions:PartB
    • /
    • v.14B no.4
    • /
    • pp.303-310
    • /
    • 2007
  • In machine learning, the ensemble classifier that is a set of classifiers have been introduced for higher accuracy than individual classifiers. We propose a new ensemble learning method that employs a set of region based classifiers. To show the performance of the proposed method. we compared its performance with that of bagging and boosting, which ard existing ensemble methods. Since the distribution of data can be different in different regions in the feature space, we split the data and generate classifiers based on each region and apply a weighted voting among the classifiers. We used 11 data sets from the UCI Machine Learning Repository to compare the performance of our new ensemble method with that of individual classifiers as well as existing ensemble methods such as bagging and boosting. As a result, we found that our method produced improved performance, particularly when the base learner is Naive Bayes or SVM.

Predictive Analysis of Ethereum Uncle Block using Ensemble Machine Learning Technique and Blockchain Information (앙상블 머신러닝 기법과 블록체인 정보를 활용한 이더리움 엉클 블록 예측 분석)

  • Kim, Han-Min
    • Journal of Digital Convergence
    • /
    • v.18 no.11
    • /
    • pp.129-136
    • /
    • 2020
  • The advantages of Blockchain present the necessity of Blockchain in various fields. However, there are several disadvantages to Blockchain. Among them, the uncle block problem is one of the problems that can greatly hinder the value and utilization of Blockchain. Although the value of Blockchain may be degraded by the uncle block problem, previous studies did not pay much attention to research on uncle block. Therefore, the purpose of this study attempts to predict the occurrence of uncle block in order to predict and prepare for the uncle block problem of Blockchain. This study verifies the validity of introducing new attributes and ensemble analysis techniques for accurate prediction of uncle block occurrence. As a research method, voting, bagging, and stacking ensemble analysis techniques were employed for Ethereum's uncle block where the uncle block problem actually occurs. We used Blockchain information of Ethereum and Bitcoin as analysis data. As a result of the study, we found that the best prediction result was presented when voting and stacking ensemble techniques were applied using only Ethereum Blockchain information. The result of this study contributes to more accurately predict the occurrence of uncle block and prepare for the uncle block problem of Blockchain.

Chest CT Image Patch-Based CNN Classification and Visualization for Predicting Recurrence of Non-Small Cell Lung Cancer Patients (비소세포폐암 환자의 재발 예측을 위한 흉부 CT 영상 패치 기반 CNN 분류 및 시각화)

  • Ma, Serie;Ahn, Gahee;Hong, Helen
    • Journal of the Korea Computer Graphics Society
    • /
    • v.28 no.1
    • /
    • pp.1-9
    • /
    • 2022
  • Non-small cell lung cancer (NSCLC) accounts for a high proportion of 85% among all lung cancer and has a significantly higher mortality rate (22.7%) compared to other cancers. Therefore, it is very important to predict the prognosis after surgery in patients with non-small cell lung cancer. In this study, the types of preoperative chest CT image patches for non-small cell lung cancer patients with tumor as a region of interest are diversified into five types according to tumor-related information, and performance of single classifier model, ensemble classifier model with soft-voting method, and ensemble classifier model using 3 input channels for combination of three different patches using pre-trained ResNet and EfficientNet CNN networks are analyzed through misclassification cases and Grad-CAM visualization. As a result of the experiment, the ResNet152 single model and the EfficientNet-b7 single model trained on the peritumoral patch showed accuracy of 87.93% and 81.03%, respectively. In addition, ResNet152 ensemble model using the image, peritumoral, and shape-focused intratumoral patches which were placed in each input channels showed stable performance with an accuracy of 87.93%. Also, EfficientNet-b7 ensemble classifier model with soft-voting method using the image and peritumoral patches showed accuracy of 84.48%.

Modeling and Selecting Optimal Features for Machine Learning Based Detections of Android Malwares (머신러닝 기반 안드로이드 모바일 악성 앱의 최적 특징점 선정 및 모델링 방안 제안)

  • Lee, Kye Woong;Oh, Seung Taek;Yoon, Young
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.8 no.11
    • /
    • pp.427-432
    • /
    • 2019
  • In this paper, we propose three approaches to modeling Android malware. The first method involves human security experts for meticulously selecting feature sets. With the second approach, we choose 300 features with the highest importance among the top 99% features in terms of occurrence rate. The third approach is to combine multiple models and identify malware through weighted voting. In addition, we applied a novel method of eliminating permission information which used to be regarded as a critical factor for distinguishing malware. With our carefully generated feature sets and the weighted voting by the ensemble algorithm, we were able to reach the highest malware detection accuracy of 97.8%. We also verified that discarding the permission information lead to the improvement in terms of false positive and false negative rates.

Cognitive Impairment Prediction Model Using AutoML and Lifelog

  • Hyunchul Choi;Chiho Yoon;Sae Bom Lee
    • Journal of the Korea Society of Computer and Information
    • /
    • v.28 no.11
    • /
    • pp.53-63
    • /
    • 2023
  • This study developed a cognitive impairment predictive model as one of the screening tests for preventing dementia in the elderly by using Automated Machine Learning(AutoML). We used 'Wearable lifelog data for high-risk dementia patients' of National Information Society Agency, then conducted using PyCaret 3.0.0 in the Google Colaboratory environment. This study analysis steps are as follows; first, selecting five models demonstrating excellent classification performance for the model development and lifelog data analysis. Next, using ensemble learning to integrate these models and assess their performance. It was found that Voting Classifier, Gradient Boosting Classifier, Extreme Gradient Boosting, Light Gradient Boosting Machine, Extra Trees Classifier, and Random Forest Classifier model showed high predictive performance in that order. This study findings, furthermore, emphasized on the the crucial importance of 'Average respiration per minute during sleep' and 'Average heart rate per minute during sleep' as the most critical feature variables for accurate predictions. Finally, these study results suggest that consideration of the possibility of using machine learning and lifelog as a means to more effectively manage and prevent cognitive impairment in the elderly.

Effect of Application of Ensemble Method on Machine Learning with Insufficient Training Set in Developing Automated English Essay Scoring System (영작문 자동채점 시스템 개발에서 학습데이터 부족 문제 해결을 위한 앙상블 기법 적용의 효과)

  • Lee, Gyoung Ho;Lee, Kong Joo
    • Journal of KIISE
    • /
    • v.42 no.9
    • /
    • pp.1124-1132
    • /
    • 2015
  • In order to train a supervised machine learning algorithm, it is necessary to have non-biased labels and a sufficient amount of training data. However, it is difficult to collect the required non-biased labels and a sufficient amount of training data to develop an automatic English Composition scoring system. In addition, an English writing assessment is carried out using a multi-faceted evaluation of the overall level of the answer. Therefore, it is difficult to choose an appropriate machine learning algorithm for such work. In this paper, we show that it is possible to alleviate these problems through ensemble learning. The results of the experiment indicate that the ensemble technique exhibited an overall performance that was better than that of other algorithms.

AutoML and CNN-based Soft-voting Ensemble Classification Model For Road Traffic Emerging Risk Detection (도로교통 이머징 리스크 탐지를 위한 AutoML과 CNN 기반 소프트 보팅 앙상블 분류 모델)

  • Jeon, Byeong-Uk;Kang, Ji-Soo;Chung, Kyungyong
    • Journal of Convergence for Information Technology
    • /
    • v.11 no.7
    • /
    • pp.14-20
    • /
    • 2021
  • Most accidents caused by road icing in winter lead to major accidents. Because it is difficult for the driver to detect the road icing in advance. In this work, we study how to accurately detect road traffic emerging risk using AutoML and CNN's ensemble model that use both structured and unstructured data. We train CNN-based road traffic emerging risk classification model using images that are unstructured data and AutoML-based road traffic emerging risk classification model using weather data that is structured data, respectively. After that the ensemble model is designed to complement the CNN-based classification model by inputting probability values derived from of each models. Through this, improves road traffic emerging risk classification performance and alerts drivers more accurately and quickly to enable safe driving.

Prediction on Busan's Gross Product and Employment of Major Industry with Logistic Regression and Machine Learning Model (로지스틱 회귀모형과 머신러닝 모형을 활용한 주요산업의 부산 지역총생산 및 고용 효과 예측)

  • Chae-Deug Yi
    • Korea Trade Review
    • /
    • v.47 no.2
    • /
    • pp.69-88
    • /
    • 2022
  • This paper aims to predict Busan's regional product and employment using the logistic regression models and machine learning models. The following are the main findings of the empirical analysis. First, the OLS regression model shows that the main industries such as electricity and electronics, machine and transport, and finance and insurance affect the Busan's income positively. Second, the binomial logistic regression models show that the Busan's strategic industries such as the future transport machinery, life-care, and smart marine industries contribute on the Busan's income in large order. Third, the multinomial logistic regression models show that the Korea's main industries such as the precise machinery, transport equipment, and machinery influence the Busan's economy positively. And Korea's exports and the depreciation can affect Busan's economy more positively at the higher employment level. Fourth, the voting ensemble model show the higher predictive power than artificial neural network model and support vector machine models. Furthermore, the gradient boosting model and the random forest show the higher predictive power than the voting model in large order.