• 제목/요약/키워드: Ensemble Learning

검색결과 374건 처리시간 0.032초

Performance Improvement of Classifier by Combining Disjunctive Normal Form features

  • Min, Hyeon-Gyu;Kang, Dong-Joong
    • International Journal of Internet, Broadcasting and Communication
    • /
    • 제10권4호
    • /
    • pp.50-64
    • /
    • 2018
  • This paper describes a visual object detection approach utilizing ensemble based machine learning. Object detection methods employing 1D features have the benefit of fast calculation speed. However, for real image with complex background, detection accuracy and performance are degraded. In this paper, we propose an ensemble learning algorithm that combines a 1D feature classifier and 2D DNF (Disjunctive Normal Form) classifier to improve the object detection performance in a single input image. Also, to improve the computing efficiency and accuracy, we propose a feature selecting method to reduce the computing time and ensemble algorithm by combining the 1D features and 2D DNF features. In the verification experiments, we selected the Haar-like feature as the 1D image descriptor, and demonstrated the performance of the algorithm on a few datasets such as face and vehicle.

흉부 CT 영상에서 비소세포폐암 환자의 재발 예측을 위한 종양 내외부 영상 패치 기반 앙상블 학습 (Ensemble Learning Based on Tumor Internal and External Imaging Patch to Predict the Recurrence of Non-small Cell Lung Cancer Patients in Chest CT Image)

  • 이예슬;조아현;홍헬렌
    • 한국멀티미디어학회논문지
    • /
    • 제24권3호
    • /
    • pp.373-381
    • /
    • 2021
  • In this paper, we propose a classification model based on convolutional neural network(CNN) for predicting 2-year recurrence in non-small cell lung cancer(NSCLC) patients using preoperative chest CT images. Based on the region of interest(ROI) defined as the tumor internal and external area, the input images consist of an intratumoral patch, a peritumoral patch and a peritumoral texture patch focusing on the texture information of the peritumoral patch. Each patch is trained through AlexNet pretrained on ImageNet to explore the usefulness and performance of various patches. Additionally, ensemble learning of network trained with each patch analyzes the performance of different patch combination. Compared with all results, the ensemble model with intratumoral and peritumoral patches achieved the best performance (ACC=98.28%, Sensitivity=100%, NPV=100%).

Path Loss Prediction Using an Ensemble Learning Approach

  • Beom Kwon;Eonsu Noh
    • 한국컴퓨터정보학회논문지
    • /
    • 제29권2호
    • /
    • pp.1-12
    • /
    • 2024
  • 경로 손실(Path Loss)을 예측하는 것은 셀룰러 네트워크(Cellular Network)에서 기지국(Base Station) 의 설치 위치 선정 등 무선망 설계에 중요한 요인 중 하나다. 기존에는 기지국의 최적 설치 위치를 결정하기 위해 수많은 현장 테스트(Field Tests)를 통해 경로 손실 값을 측정했다. 따라서 측정에 많은 시간이 소요된다는 단점이 있었다. 이러한 문제를 해결하기 위해 본 연구에서는 머신러닝(Machine Learning, ML) 기반의 경로 손실 예측 방법을 제안한다. 특히, 경로 손실 예측 성능을 향상시키기 위해서 앙상블 학습(Ensemble Learning) 접근법을 적용하였다. 부트스트랩 데이터 세트(Bootstrap Dataset)을 활용하여 서로 다른 하이퍼파라미터(Hyperparameter) 구성을 갖는 모델들을 얻고, 이 모델들을 앙상블하여 최종 모델을 구축했다. 인터넷상에 공개된 경로 손실 데이터 세트를 활용하여 제안하는 앙상블 기반 경로 손실 예측 방법과 다양한 ML 기반 방법들의 성능을 평가 및 비교했다. 실험 결과, 제안하는 방법이 기존 방법들보다 우수한 성능을 달성하였으며, 경로 손실 값을 가장 정확하게 예측할 수 있다는 것을 입증하였다.

머신러닝을 활용한 모돈의 생산성 예측모델 (Forecasting Sow's Productivity using the Machine Learning Models)

  • 이민수;최영찬
    • 농촌지도와개발
    • /
    • 제16권4호
    • /
    • pp.939-965
    • /
    • 2009
  • The Machine Learning has been identified as a promising approach to knowledge-based system development. This study aims to examine the ability of machine learning techniques for farmer's decision making and to develop the reference model for using pig farm data. We compared five machine learning techniques: logistic regression, decision tree, artificial neural network, k-nearest neighbor, and ensemble. All models are well performed to predict the sow's productivity in all parity, showing over 87.6% predictability. The model predictability of total litter size are highest at 91.3% in third parity and decreasing as parity increases. The ensemble is well performed to predict the sow's productivity. The neural network and logistic regression is excellent classifier for all parity. The decision tree and the k-nearest neighbor was not good classifier for all parity. Performance of models varies over models used, showing up to 104% difference in lift values. Artificial Neural network and ensemble models have resulted in highest lift values implying best performance among models.

  • PDF

RapidEye 위성영상과 Semantic Segmentation 기반 딥러닝 모델을 이용한 토지피복분류의 정확도 평가 (Accuracy Assessment of Land-Use Land-Cover Classification Using Semantic Segmentation-Based Deep Learning Model and RapidEye Imagery)

  • 심우담;임종수;이정수
    • 대한원격탐사학회지
    • /
    • 제39권3호
    • /
    • pp.269-282
    • /
    • 2023
  • 본 연구는 딥러닝 모델(deep learning model)을 활용하여 토지피복분류를 수행하였으며 입력 이미지의 크기, Stride 적용 등 데이터세트(dataset)의 조절을 통해 토지피복분류를 위한 최적의 딥러닝 모델 선정을 목적으로 하였다. 적용한 딥러닝 모델은 3종류로 Encoder-Decoder 구조를 가진 U-net과 DeeplabV3+, 두 가지 모델을 결합한 앙상블(Ensemble) 모델을 활용하였다. 데이터세트는 RapidEye 위성영상을 입력영상으로, 라벨(label) 이미지는 Intergovernmental Panel on Climate Change 토지이용의 6가지 범주에 따라 구축한 Raster 이미지를 참값으로 활용하였다. 딥러닝 모델의 정확도 향상을 위해 데이터세트의 질적 향상 문제에 대해 주목하였으며 딥러닝 모델(U-net, DeeplabV3+, Ensemble), 입력 이미지 크기(64 × 64 pixel, 256 × 256 pixel), Stride 적용(50%, 100%) 조합을 통해 12가지 토지피복도를 구축하였다. 라벨 이미지와 딥러닝 모델 기반의 토지피복도의 정합성 평가결과, U-net과 DeeplabV3+ 모델의 전체 정확도는 각각 최대 약 87.9%와 89.8%, kappa 계수는 모두 약 72% 이상으로 높은 정확도를 보였으며, 64 × 64 pixel 크기의 데이터세트를 활용한 U-net 모델의 정확도가 가장 높았다. 또한 딥러닝 모델에 앙상블 및 Stride를 적용한 결과, 최대 약 3% 정확도가 상승하였으며 Semantic Segmentation 기반 딥러닝 모델의 단점인 경계간의 불일치가 개선됨을 확인하였다.

Ensemble Methods Applied to Classification Problem

  • Kim, ByungJoo
    • International Journal of Internet, Broadcasting and Communication
    • /
    • 제11권1호
    • /
    • pp.47-53
    • /
    • 2019
  • The idea of ensemble learning is to train multiple models, each with the objective to predict or classify a set of results. Most of the errors from a model's learning are from three main factors: variance, noise, and bias. By using ensemble methods, we're able to increase the stability of the final model and reduce the errors mentioned previously. By combining many models, we're able to reduce the variance, even when they are individually not great. In this paper we propose an ensemble model and applied it to classification problem. In iris, Pima indian diabeit and semiconductor fault detection problem, proposed model classifies well compared to traditional single classifier that is logistic regression, SVM and random forest.

An Ensemble Approach to Detect Fake News Spreaders on Twitter

  • Sarwar, Muhammad Nabeel;UlAmin, Riaz;Jabeen, Sidra
    • International Journal of Computer Science & Network Security
    • /
    • 제22권5호
    • /
    • pp.294-302
    • /
    • 2022
  • Detection of fake news is a complex and a challenging task. Generation of fake news is very hard to stop, only steps to control its circulation may help in minimizing its impacts. Humans tend to believe in misleading false information. Researcher started with social media sites to categorize in terms of real or fake news. False information misleads any individual or an organization that may cause of big failure and any financial loss. Automatic system for detection of false information circulating on social media is an emerging area of research. It is gaining attention of both industry and academia since US presidential elections 2016. Fake news has negative and severe effects on individuals and organizations elongating its hostile effects on the society. Prediction of fake news in timely manner is important. This research focuses on detection of fake news spreaders. In this context, overall, 6 models are developed during this research, trained and tested with dataset of PAN 2020. Four approaches N-gram based; user statistics-based models are trained with different values of hyper parameters. Extensive grid search with cross validation is applied in each machine learning model. In N-gram based models, out of numerous machine learning models this research focused on better results yielding algorithms, assessed by deep reading of state-of-the-art related work in the field. For better accuracy, author aimed at developing models using Random Forest, Logistic Regression, SVM, and XGBoost. All four machine learning algorithms were trained with cross validated grid search hyper parameters. Advantages of this research over previous work is user statistics-based model and then ensemble learning model. Which were designed in a way to help classifying Twitter users as fake news spreader or not with highest reliability. User statistical model used 17 features, on the basis of which it categorized a Twitter user as malicious. New dataset based on predictions of machine learning models was constructed. And then Three techniques of simple mean, logistic regression and random forest in combination with ensemble model is applied. Logistic regression combined in ensemble model gave best training and testing results, achieving an accuracy of 72%.

Optimizing SVM Ensembles Using Genetic Algorithms in Bankruptcy Prediction

  • Kim, Myoung-Jong;Kim, Hong-Bae;Kang, Dae-Ki
    • Journal of information and communication convergence engineering
    • /
    • 제8권4호
    • /
    • pp.370-376
    • /
    • 2010
  • Ensemble learning is a method for improving the performance of classification and prediction algorithms. However, its performance can be degraded due to multicollinearity problem where multiple classifiers of an ensemble are highly correlated with. This paper proposes genetic algorithm-based optimization techniques of SVM ensemble to solve multicollinearity problem. Empirical results with bankruptcy prediction on Korea firms indicate that the proposed optimization techniques can improve the performance of SVM ensemble.

A New Incremental Learning Algorithm with Probabilistic Weights Using Extended Data Expression

  • Yang, Kwangmo;Kolesnikova, Anastasiya;Lee, Won Don
    • Journal of information and communication convergence engineering
    • /
    • 제11권4호
    • /
    • pp.258-267
    • /
    • 2013
  • New incremental learning algorithm using extended data expression, based on probabilistic compounding, is presented in this paper. Incremental learning algorithm generates an ensemble of weak classifiers and compounds these classifiers to a strong classifier, using a weighted majority voting, to improve classification performance. We introduce new probabilistic weighted majority voting founded on extended data expression. In this case class distribution of the output is used to compound classifiers. UChoo, a decision tree classifier for extended data expression, is used as a base classifier, as it allows obtaining extended output expression that defines class distribution of the output. Extended data expression and UChoo classifier are powerful techniques in classification and rule refinement problem. In this paper extended data expression is applied to obtain probabilistic results with probabilistic majority voting. To show performance advantages, new algorithm is compared with Learn++, an incremental ensemble-based algorithm.

An Efficient Deep Learning Ensemble Using a Distribution of Label Embedding

  • Park, Saerom
    • 한국컴퓨터정보학회논문지
    • /
    • 제26권1호
    • /
    • pp.27-35
    • /
    • 2021
  • 본 연구에서는 레이블 임베딩의 분포를 반영하는 딥러닝 모형을 위한 새로운 스태킹 앙상블 방법론을 제안하였다. 제안된 앙상블 방법론은 기본 딥러닝 분류기를 학습하는 과정과 학습된 모형으로 부터 얻어진 레이블 임베딩을 이용한 군집화 결과로부터 소분류기들을 학습하는 과정으로 이루어져 있다. 본 방법론은 주어진 다중 분류 문제를 군집화 결과를 활용하여 소 문제들로 나누는 것을 기본으로 한다. 군집화에 사용되는 레이블 임베딩은 처음 학습한 기본 딥러닝 분류기의 마지막 층의 가중치로부터 얻어질 수 있다. 군집화 결과를 기반으로 군집화 내의 클래스들을 분류하는 소분류기들을 군집의 수만큼 구축하여 학습한다. 실험 결과 기본 분류기로부터의 레이블 임베딩이 클래스 간의 관계를 잘 반영한다는 것을 확인하였고, 이를 기반으로 한 앙상블 방법론이 CIFAR 100 데이터에 대해서 분류 성능을 향상시킬 수 있다는 것을 확인할 수 있었다.