• Title/Summary/Keyword: ensemble learning technique (앙상블 학습 기법)

Model Ensemble for Accurate Pig Detection under Strong Illumination Condition (강한 조명하에서 정확한 돼지 탐지를 위한 모델 앙상블)

  • Son, Seungwook;Ahn, Hanse;Lee, Nayeon;An, Yunho;Chung, Yongwha;Park, Daihee
    • Proceedings of the Korea Information Processing Society Conference / 2021.05a / pp.385-388 / 2021
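  • Advances in CNN-based object detectors have made it possible to monitor pigs in pig pens, but one problem remains for deployment on real farms: pigs directly exposed to the pen's lighting go undetected in the video because of overexposure. Because a single model has limited room to improve accuracy on this problem, we propose a model ensemble technique that uses multiple models. In particular, we confirmed that combining two TinyYOLOv4 models, trained on complementary data generated with the image processing technique proposed in this study, improves pig detection accuracy dramatically over a single TinyYOLOv4 model.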
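
The abstract does not say how the outputs of the two detectors are combined. One common option, assumed here purely for illustration, is to pool the boxes from both models and apply non-maximum suppression. The box layout `(x1, y1, x2, y2, score)`, the function names, and the `iou_thresh` value are all hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2, ...)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def merge_detections(boxes_a, boxes_b, iou_thresh=0.5):
    """Pool boxes from two detectors and keep the best-scoring box of each overlapping group."""
    pooled = sorted(boxes_a + boxes_b, key=lambda box: box[4], reverse=True)
    kept = []
    for box in pooled:
        # Keep a box only if it does not overlap strongly with a better one.
        if all(iou(box, k) < iou_thresh for k in kept):
            kept.append(box)
    return kept


# Hypothetical outputs of the two models for one frame.
model_a = [(10, 10, 50, 50, 0.90)]
model_b = [(12, 11, 51, 49, 0.80), (100, 100, 140, 150, 0.70)]
merged = merge_detections(model_a, model_b)
```

Here the second model contributes a pig that the first one missed, while its near-duplicate of the first model's box is suppressed.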

Indoor positioning method using WiFi signal based on XGboost (XGboost 기반의 WiFi 신호를 이용한 실내 측위 기법)

  • Hwang, Chi-Gon;Yoon, Chang-Pyo;Kim, Dae-Jin
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.1 / pp.70-75 / 2022
  • Accurate location measurement is necessary to provide a variety of services. Data for indoor positioning are collected by measuring RSSI values from WiFi devices through a smartphone application, and these measurements serve as the raw data for machine learning. The features are the measured RSSI values, and the label is the name of the space where the measurement was taken. We study a machine learning technique that predicts the location from the WiFi signal alone by applying an efficient classification technique. Ensemble learning obtains more accurate predictions by combining several models rather than relying on a single one, and includes bagging and boosting. Boosting adjusts the weights of a model based on modeling results from sampled data, and various boosting algorithms exist. This study uses XGBoost and compares its performance with that of other ensemble techniques.

Logistic Regression Ensemble Method for Extracting Significant Information from Social Texts (소셜 텍스트의 주요 정보 추출을 위한 로지스틱 회귀 앙상블 기법)

  • Kim, So Hyeon;Kim, Han Joon
    • KIPS Transactions on Software and Data Engineering / v.6 no.5 / pp.279-284 / 2017
  • Currently, in the era of big data, text mining and opinion mining are used in many domains, and one of their most important research issues is extracting significant information from social media. In this paper, we propose a logistic regression ensemble method for finding the main body text in blog HTML. First, we extract structural features and text features from blog HTML tags. Then we construct a classification model, using logistic regression and ensemble learning, that decides whether a given tag contains main body text. One of our important findings is that the main body text can be found through 'depth' features extracted from HTML tags. In our experiment using blog data on diverse topics collected from the web, our tag classification model achieved 99% accuracy, and it recalled 80.5% of documents that have tags containing the main body text.

An Ensemble Method for Latent Interest Reasoning of Mobile Users (모바일 사용자의 잠재 관심 추론을 위한 앙상블 기법)

  • Choi, Yerim;Park, Jonghun;Shin, Dong Wan
    • KIISE Transactions on Computing Practices / v.21 no.11 / pp.706-712 / 2015
  • These days, much information is provided as a list of summaries through mobile services. In this regard, users consume information in which they are interested by observing the list and not by expressing their interest explicitly or implicitly through rating content or clicking links. Therefore, to appropriately model a user's interest, it is necessary to detect latent interest content. In this study, we propose a method for reasoning latent interest of a user by analyzing mobile content consumption logs of the user. Specifically, since erroneous reasoning will drastically degrade service quality, a unanimity ensemble method is adopted to maximize precision. In this method, an item is determined as the subject of latent interest only when multiple classifiers considering various aspects of the log unanimously agree. Accurate reasoning of latent interest will contribute to enhancing the quality of personalized services such as interest-based recommendation systems.
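
The unanimity rule described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the three classifiers, the log fields (`dwell_seconds`, `scroll_speed`, `views`), and the thresholds are hypothetical stand-ins for "classifiers considering various aspects of the log".

```python
def unanimous_positive(classifiers, item):
    """Return True only if every classifier labels the item as latent interest."""
    return all(clf(item) for clf in classifiers)


# Hypothetical classifiers, each looking at one aspect of a consumption log entry.
def long_dwell(log):
    return log["dwell_seconds"] >= 5.0


def slow_scroll(log):
    return log["scroll_speed"] <= 200.0


def revisited(log):
    return log["views"] >= 2


classifiers = [long_dwell, slow_scroll, revisited]

item_a = {"dwell_seconds": 8.0, "scroll_speed": 120.0, "views": 3}
item_b = {"dwell_seconds": 8.0, "scroll_speed": 120.0, "views": 1}

result_a = unanimous_positive(classifiers, item_a)  # all three agree
result_b = unanimous_positive(classifiers, item_b)  # one classifier disagrees
```

Because a positive requires every member to agree, the ensemble's false positives are a subset of each member's false positives. The price is recall, since a single dissenting classifier rejects the item.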

Estimation of bubble size distribution using deep ensemble physics-informed neural network (딥앙상블 물리 정보 신경망을 이용한 기포 크기 분포 추정)

  • Sunyoung Ko;Geunhwan Kim;Jaehyuk Lee;Hongju Gu;Kwangho Moon;Youngmin Choo
    • The Journal of the Acoustical Society of Korea / v.42 no.4 / pp.305-312 / 2023
  • A Physics-Informed Neural Network (PINN) is used to invert bubble size distributions from attenuation losses. Because the bubble population inversion can be treated as a linear system, the Adaptive Learned Iterative Shrinkage Thresholding Algorithm (Ada-LISTA), which has been used to solve linear systems in image processing, is adopted as the neural network architecture of the PINN. Furthermore, a regularization term based on the linear system is added to the loss function of the PINN; it improves generalization by favoring solutions that satisfy the bubble physics. To evaluate the uncertainty of the bubble estimation, a deep ensemble is adopted: 20 Ada-LISTAs with different initial values are trained on the same training dataset. When tested with attenuation losses different from those in the training dataset, the bubble size distribution and the corresponding uncertainty are given by the average and variance of the 20 estimates, respectively. The deep ensemble Ada-LISTA inverts bubble size distributions better than CVX, a conventional convex optimization solver.
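
The final step of a deep ensemble, summarizing the members' outputs as a mean and a variance, can be sketched as follows. The function name and the toy numbers are hypothetical, and the use of the population variance is an assumption, since the abstract does not say which variance estimator is used.

```python
from statistics import fmean, pvariance


def ensemble_summary(estimates):
    """Per-bin mean and variance over ensemble members.

    estimates: list of M lists, each holding one member's estimate for every size bin.
    """
    per_bin = list(zip(*estimates))  # transpose: one tuple of M values per bin
    mean = [fmean(values) for values in per_bin]
    # Assumption: population variance; a sample variance would use statistics.variance.
    variance = [pvariance(values) for values in per_bin]
    return mean, variance


# Three hypothetical members (the paper trains 20) and two size bins.
estimates = [[1.0, 4.0], [2.0, 4.0], [3.0, 4.0]]
mean, variance = ensemble_summary(estimates)
```

Bins where the members disagree get a large variance and can be read as uncertain, while bins where all members agree get a variance of zero.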

A Study on Leakage Detection Technique Using Transfer Learning-Based Feature Fusion (전이학습 기반 특징융합을 이용한 누출판별 기법 연구)

  • YuJin Han;Tae-Jin Park;Jonghyuk Lee;Ji-Hoon Bae
    • The Transactions of the Korea Information Processing Society / v.13 no.2 / pp.41-47 / 2024
  • When models trained in the time and frequency domains differed in performance, we observed that ensembling them still yielded compromised performance because of the imbalance between the individual models. Therefore, this paper proposes a technique that improves the accuracy of pipeline leakage detection through a step-wise learning approach that extracts features from both the time and frequency domains and integrates them. The method involves a two-stage learning process. In Stage 1, models are trained independently in the time and frequency domains to extract crucial features from the data in each domain. In Stage 2, the classifiers of the pre-trained models are removed, the features from both domains are fused, and a new classifier is added and retrained. The proposed transfer learning-based feature fusion technique thus trains on features integrated from both domains, exploiting their complementary nature and allowing the model to leverage diverse information. As a result, it achieved a high accuracy of 99.88%, demonstrating outstanding performance in pipeline leakage detection.

Optimal Selection of Classifier Ensemble Using Genetic Algorithms (유전자 알고리즘을 이용한 분류자 앙상블의 최적 선택)

  • Kim, Myung-Jong
    • Journal of Intelligence and Information Systems / v.16 no.4 / pp.99-112 / 2010
  • Ensemble learning is a method for improving the performance of classification and prediction algorithms. It finds a highly accurate classifier on the training set by constructing and combining an ensemble of weak classifiers, each of which needs only to be moderately accurate on the training set. Ensemble learning has received considerable attention in machine learning and artificial intelligence because of its remarkable performance improvement and its flexible integration with traditional learning algorithms such as decision trees (DT), neural networks (NN), and SVM. In this research, DT ensemble studies have consistently demonstrated impressive improvements in the generalization behavior of DT, while NN and SVM ensemble studies have not shown performance gains as remarkable as those of DT ensembles. Recently, several works have reported that the performance of an ensemble can be degraded when its classifiers are highly correlated with one another, resulting in a multicollinearity problem. They have also proposed differentiated learning strategies to cope with this performance degradation. Hansen and Salamon (1990) argued that it is necessary and sufficient for the performance enhancement of an ensemble that the ensemble contain diverse classifiers. Breiman (1996) showed that ensemble learning can increase the performance of unstable learning algorithms but does not yield remarkable improvement for stable learning algorithms. Unstable learning algorithms such as decision tree learners are sensitive to changes in the training data, so small changes in the training data can yield large changes in the generated classifiers. Therefore, ensembles of unstable learning algorithms can guarantee some diversity among the classifiers.
In contrast, stable learning algorithms such as NN and SVM generate similar classifiers despite small changes in the training data, so the correlation among the resulting classifiers is very high. This high correlation results in a multicollinearity problem, which degrades the performance of the ensemble. Kim's work (2009) compared the performance of traditional prediction algorithms such as NN, DT, and SVM in bankruptcy prediction for Korean firms. It reports that stable learning algorithms such as NN and SVM have higher predictability than the unstable DT, whereas in ensemble learning the DT ensemble shows a greater performance improvement than the NN and SVM ensembles. Further analysis with the variance inflation factor (VIF) empirically shows that the performance degradation of the ensemble is due to the multicollinearity problem, and the work proposes that optimization of the ensemble is needed to cope with it. This paper proposes a hybrid system for coverage optimization of NN ensembles (CO-NN) in order to improve the performance of NN ensembles. Coverage optimization is a technique for choosing a sub-ensemble from an original ensemble so as to guarantee the diversity of its classifiers. CO-NN uses a genetic algorithm (GA), which has been widely used for various optimization problems, to deal with the coverage optimization problem. The GA chromosomes for the coverage optimization are encoded as binary strings, each bit of which indicates an individual classifier. The fitness function is defined as maximization of error reduction, and a constraint on the variance inflation factor (VIF), one of the commonly used measures of multicollinearity, is added to ensure the diversity of classifiers by removing high correlation among them. We use Microsoft Excel and the GA software package Evolver.
Experiments on company failure prediction have shown that CO-NN stably enhances the performance of NN ensembles by choosing classifiers with the correlations within the ensemble taken into account. Classifiers with a potential multicollinearity problem are removed by the coverage optimization process of CO-NN, and CO-NN thereby shows higher performance than a single NN classifier and the NN ensemble at the 1% significance level, and than the DT ensemble at the 5% significance level. However, further research issues remain. First, a decision optimization process for finding the optimal combination function should be considered. Second, various learning strategies for dealing with data noise should be introduced.
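
The binary-chromosome selection described above can be sketched as a small genetic algorithm. This is an illustration under stated assumptions, not CO-NN itself: the paper constrains the variance inflation factor and runs in Excel with Evolver, whereas this sketch swaps in a simpler pairwise agreement rate as the diversity constraint and uses majority-vote accuracy as the fitness. All function names, parameters, and data are hypothetical.

```python
import random


def selected(mask, preds):
    """Prediction lists of the classifiers whose chromosome bit is 1."""
    return [p for bit, p in zip(mask, preds) if bit]


def vote_accuracy(mask, preds, labels):
    """Accuracy of a majority vote over the selected classifiers (binary labels)."""
    chosen = selected(mask, preds)
    if not chosen:
        return 0.0
    hits = 0
    for i, y in enumerate(labels):
        votes = sum(p[i] for p in chosen)
        hits += (1 if 2 * votes > len(chosen) else 0) == y
    return hits / len(labels)


def max_agreement(mask, preds):
    """Highest pairwise agreement rate among selected classifiers (stand-in for VIF)."""
    chosen = selected(mask, preds)
    worst = 0.0
    for a in range(len(chosen)):
        for b in range(a + 1, len(chosen)):
            same = sum(x == y for x, y in zip(chosen[a], chosen[b]))
            worst = max(worst, same / len(chosen[a]))
    return worst


def fitness(mask, preds, labels, limit):
    """Vote accuracy, or 0.0 when the diversity constraint is violated."""
    if max_agreement(mask, preds) > limit:
        return 0.0
    return vote_accuracy(mask, preds, labels)


def select_sub_ensemble(preds, labels, limit=0.9, pop_size=20, generations=30, seed=0):
    """Search for a binary mask over classifiers (needs at least two classifiers)."""
    rng = random.Random(seed)
    n = len(preds)

    def score(mask):
        return fitness(mask, preds, labels, limit)

    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        next_gen = population[:2]  # elitism: carry over the two best masks
        while len(next_gen) < pop_size:
            p1 = max(rng.sample(population, 3), key=score)  # tournament selection
            p2 = max(rng.sample(population, 3), key=score)
            cut = rng.randint(1, n - 1)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [bit ^ 1 if rng.random() < 0.05 else bit for bit in child]  # mutation
            next_gen.append(child)
        population = next_gen
    return max(population, key=score)


# Hypothetical validation labels and per-classifier predictions.
labels = [1, 0, 1, 1, 0, 0]
preds = [
    [1, 0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],  # identical to the first classifier: no added diversity
    [1, 1, 1, 1, 0, 0],
    [0, 0, 1, 1, 0, 1],
]
best = select_sub_ensemble(preds, labels)
```

A mask that selects both of the identical classifiers has an agreement rate of 1.0, exceeds `limit`, and is scored 0.0, which is how the constraint steers the search away from redundant members.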

Place Recognition Using Ensemble Learning of Mobile Multimodal Sensory Information (모바일 멀티모달 센서 정보의 앙상블 학습을 이용한 장소 인식)

  • Lee, Chung-Yeon;Lee, Beom-Jin;On, Kyoung-Woon;Ha, Jung-Woo;Kim, Hong-Il;Zhang, Byoung-Tak
    • KIISE Transactions on Computing Practices / v.21 no.1 / pp.64-69 / 2015
  • Place awareness is essential for location-based services that are widely provided to smartphone users. However, traditional GPS-based methods are valid only outdoors, where the GPS signal is strong, and also require symbolic place information for the physical location. In this paper, environmental sounds and images are used to recognize important aspects of each place. The proposed method extracts feature vectors from visual, auditory, and location data recorded by a smartphone with built-in camera, microphone, and GPS sensor modules. The heterogeneous feature vectors are then learned by an ensemble learning method in which each classifier learns its own group of feature vectors and the classifiers vote to produce the highest-weighted result. The proposed method is evaluated for place recognition on a dataset of 3000 samples from six places, and the experimental results show remarkably improved recognition accuracy when all kinds of sensory data are used, compared with results using data from a single sensor or audio-visual integrated data only.
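
The voting step can be sketched as a weighted vote over one classifier per modality. The abstract does not say how the weights are obtained, so the weights, classifier names, and place labels below are hypothetical.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Return the label with the largest total vote weight.

    predictions: dict mapping classifier name -> predicted label
    weights: dict mapping classifier name -> vote weight
    """
    totals = defaultdict(float)
    for name, label in predictions.items():
        totals[label] += weights[name]
    return max(totals, key=totals.get)


# One hypothetical classifier per modality.
predictions = {"visual": "cafe", "audio": "library", "gps": "library"}
weights = {"visual": 0.5, "audio": 0.3, "gps": 0.1}
place = weighted_vote(predictions, weights)
```

With these weights the visual classifier outweighs the other two combined, so the result is "cafe". With equal weights the two classifiers that agree on "library" would win.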

Virtual Samples Generation Based on the Distribution of Input Data (입력 데이터의 분포를 고려한 가상 샘플 생성)

  • 이봉기;임용업;조성준
    • Proceedings of the Korean Information Science Society Conference / 2000.10b / pp.302-304 / 2000
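  • This paper proposes a method that improves Virtual Sample Generation (VSG), a recently proposed technique that uses noise addition and a network ensemble, and compares it with the representative ensemble learning algorithms Bagging and Boosting. Building on the existing VSG method, we propose generating virtual samples in a way that takes the distribution of the input data into account. The aim is to prevent underfitting caused by virtual samples where the input density is high and overfitting caused by virtual samples where the density is low. We set out a new procedure for estimating the density of the input data and devise a method for generating virtual samples suited to the input distribution. To demonstrate the improvement in generalization performance, we ran experiments on several synthetic datasets and compared the results with those of Bagging, Boosting, and VSG.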

Development of Deep Learning Based Ensemble Land Cover Segmentation Algorithm Using Drone Aerial Images (드론 항공영상을 이용한 딥러닝 기반 앙상블 토지 피복 분할 알고리즘 개발)

  • Hae-Gwang Park;Seung-Ki Baek;Seung Hyun Jeong
    • Korean Journal of Remote Sensing / v.40 no.1 / pp.71-80 / 2024
  • This study proposes an ensemble learning technique to enhance the semantic segmentation performance of images captured by Unmanned Aerial Vehicles (UAVs). With the increasing use of UAVs in fields such as urban planning, techniques that apply deep learning segmentation methods to land cover segmentation have been actively developed. The proposed method uses three prominent segmentation models, U-Net, DeepLabV3, and Fully Convolutional Network (FCN), and integrates their training loss, validation accuracy, and class scores to improve overall prediction performance. The method was applied and evaluated on a land cover segmentation problem with seven classes: buildings, roads, parking lots, fields, trees, empty spaces, and areas with unspecified labels, using images captured by UAVs. Performance was evaluated by mean Intersection over Union (mIoU), and a comparison of the proposed ensemble model with the three existing segmentation methods showed improved mIoU. The study thus confirms that the proposed technique can enhance the performance of semantic segmentation models.