• Title/Abstract/Keywords: Heterogeneous ensemble

Search results: 23 items; processing time: 0.026 s

Heterogeneous Ensemble of Classifiers from Under-Sampled and Over-Sampled Data for Imbalanced Data

  • Kang, Dae-Ki;Han, Min-gyu
    • International journal of advanced smart convergence
    • /
    • Vol. 8, No. 1
    • /
    • pp.75-81
    • /
    • 2019
  • Data imbalance is a common problem that can seriously degrade the machine learning process. Sampling is one of the effective methods for addressing it. Over-sampling increases the number of instances, so on imbalanced data it is applied to the minority class. Under-sampling reduces the number of instances and is usually performed on the majority class. We apply under-sampling and over-sampling to imbalanced data to generate sampled data sets. From these sampled data sets and the original data set, we construct a heterogeneous ensemble of classifiers, applying five different algorithms. Experimental results on an intrusion detection dataset, used as an imbalanced dataset, show that our approach is effective.
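
The sampling step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes plain random under-sampling and random over-sampling to a 1:1 class ratio (the abstract does not say which sampling variants were used), and the helper names `under_sample`, `over_sample`, and `majority_vote` are hypothetical. Training the five classifiers is left out; each would be fitted on one of the three training sets and their predictions combined, here by a simple majority vote.

```python
import random
from collections import Counter


def split_by_class(rows, minority_label):
    """Split (feature, label) rows into minority and majority lists."""
    minority = [r for r in rows if r[1] == minority_label]
    majority = [r for r in rows if r[1] != minority_label]
    return minority, majority


def under_sample(rows, minority_label, rng):
    """Randomly drop majority rows until both classes have equal size."""
    minority, majority = split_by_class(rows, minority_label)
    return minority + rng.sample(majority, len(minority))


def over_sample(rows, minority_label, rng):
    """Randomly duplicate minority rows until both classes have equal size."""
    minority, majority = split_by_class(rows, minority_label)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return majority + minority + extra


def majority_vote(predictions):
    """Return the most common label among the ensemble members' predictions."""
    return Counter(predictions).most_common(1)[0][0]


rng = random.Random(0)
# Toy data (assumption): 90 majority rows with label 0, 10 minority rows with label 1.
data = [(x, 0) for x in range(90)] + [(x, 1) for x in range(90, 100)]

# The three training sets the ensemble members would be fitted on.
training_sets = {
    "original": data,
    "under": under_sample(data, 1, rng),
    "over": over_sample(data, 1, rng),
}
class_counts = {
    name: Counter(label for _, label in rows)
    for name, rows in training_sets.items()
}
```

Inspecting `class_counts` shows how each sampling strategy changes the class balance while the original set is left untouched.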

The ensemble approach in comparison with the diverse feature selection techniques for estimating NPPs parameters using the different learning algorithms of the feed-forward neural network

  • Moshkbar-Bakhshayesh, Khalil
    • Nuclear Engineering and Technology
    • /
    • Vol. 53, No. 12
    • /
    • pp.3944-3951
    • /
    • 2021
  • Several considerations, such as the no free lunch theorem, indicate that no universal feature selection (FS) technique outperforms all others. Moreover, given the large number of FS techniques, approaches such as using a synthetic dataset are tedious and time-consuming. In this study, to tackle the dependency of estimation accuracy on the selected FS technique, a methodology based on the heterogeneous ensemble is proposed. The performance of the major learning algorithms of the neural network (i.e., FFNN-BR and FFNN-LM) is considered in combination with diverse FS techniques (i.e., NCA, the F-test, Kendall's tau, Pearson, Spearman, and Relief) and different combination techniques of the heterogeneous ensemble (i.e., Min, Median, Arithmetic mean, and Geometric mean). The target parameters/transients of the Bushehr nuclear power plant (BNPP) are examined as the case study. The results show that the Min combination technique gives the most accurate estimation. Therefore, if the number of FS techniques is m and the number of learning algorithms is n, the heterogeneous ensemble may reduce the search space for acceptable estimation of the target parameters from n × m to n × 1. The proposed methodology gives a simple and practical approach for more reliable and more accurate estimation of the target parameters than methods such as the use of a synthetic dataset or trial and error.
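
The four combination techniques named in this abstract can be illustrated with a short sketch. It assumes that each rule is applied to the m estimates that one learning algorithm produces under the m FS techniques, collapsing them into a single value (which is how n × m candidates would become n × 1); the abstract does not spell out this detail, and the numbers below are made up for illustration.

```python
import statistics

# Combination rules named in the abstract, mapped to standard-library functions.
COMBINERS = {
    "min": min,
    "median": statistics.median,
    "arithmetic_mean": statistics.fmean,
    "geometric_mean": statistics.geometric_mean,  # requires positive inputs
}


def combine(estimates, rule):
    """Collapse the estimates from several FS techniques into one value."""
    return COMBINERS[rule](estimates)


# Hypothetical estimates of one target parameter from one learning algorithm,
# one value per FS technique (illustrative numbers, not from the paper).
estimates = {"NCA": 2.0, "F-test": 4.0, "Kendall's tau": 8.0}

combined = {
    rule: combine(list(estimates.values()), rule) for rule in COMBINERS
}
```

Repeating this for each of the n learning algorithms leaves n combined estimates to compare instead of n × m individual ones.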

Transfer Learning-Based Feature Fusion Model for Classification of Maneuver Weapon Systems

  • Jinyong Hwang;You-Rak Choi;Tae-Jin Park;Ji-Hoon Bae
    • Journal of Information Processing Systems
    • /
    • Vol. 19, No. 5
    • /
    • pp.673-687
    • /
    • 2023
  • Convolutional neural network-based deep learning technology is the most commonly used in image identification, but it requires large-scale data for training. Therefore, application in specific fields in which data acquisition is limited, such as in the military, may be challenging. In particular, the identification of ground weapon systems is a very important mission, and high identification accuracy is required. Accordingly, various studies have been conducted to achieve high performance using small-scale data. Among them, the ensemble method, which achieves excellent performance through the prediction average of the pre-trained models, is the most representative method; however, it requires considerable time and effort to find the optimal combination of ensemble models. In addition, there is a performance limitation in the prediction results obtained by using an ensemble method. Furthermore, it is difficult to obtain the ensemble effect using models with imbalanced classification accuracies. In this paper, we propose a transfer learning-based feature fusion technique for heterogeneous models that extracts and fuses features of pre-trained heterogeneous models and finally, fine-tunes hyperparameters of the fully connected layer to improve the classification accuracy. The experimental results of this study indicate that it is possible to overcome the limitations of the existing ensemble methods by improving the classification accuracy through feature fusion between heterogeneous models based on transfer learning.

대용량 이미지넷 인식을 위한 CNN 기반 Weighted 앙상블 기법 (CNN-based Weighted Ensemble Technique for ImageNet Classification)

  • 정희철;최민국;김준광;권순;정우영
    • 대한임베디드공학회논문지
    • /
    • Vol. 15, No. 4
    • /
    • pp.197-204
    • /
    • 2020
  • The ImageNet dataset is a large-scale dataset that contains various natural scene images. In this paper, we propose a convolutional neural network (CNN)-based weighted ensemble technique for the ImageNet classification task. First, to fuse several models, our technique uses a weight for each model, unlike the existing average-based ensemble technique. Then we propose an algorithm that automatically finds the coefficients used in the later ensemble process. Our algorithm sequentially selects the model with the best performance on the validation set, and then obtains a weight that improves performance when combined with the previously selected models. We applied the proposed algorithm to a total of 13 heterogeneous models, of which 5 were selected. These selected models were combined with weights, achieving a 3.297% Top-5 error rate on the ImageNet test dataset.
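
A greedy procedure in the spirit of this abstract can be sketched as below. It is an assumption-laden illustration rather than the paper's algorithm: the weight is picked from a small fixed grid, top-1 accuracy on toy validation scores stands in for the Top-5 error, and the function names (`accuracy`, `blend`, `greedy_weighted_ensemble`) are hypothetical.

```python
def accuracy(scores, labels):
    """Fraction of samples whose highest-scoring class matches the label."""
    hits = sum(1 for row, y in zip(scores, labels) if row.index(max(row)) == y)
    return hits / len(labels)


def blend(a, b, w):
    """Element-wise a + w * b for two lists of per-class score rows."""
    return [[x + w * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def greedy_weighted_ensemble(model_scores, labels, weight_grid=(0.25, 0.5, 1.0)):
    """Start from the best single model, then keep adding the (model, weight)
    pair that most improves validation accuracy; stop when nothing improves."""
    remaining = dict(model_scores)
    first = max(remaining, key=lambda name: accuracy(remaining[name], labels))
    combined = remaining.pop(first)
    chosen = {first: 1.0}
    best = accuracy(combined, labels)
    while remaining:
        candidates = [
            (accuracy(blend(combined, scores, w), labels), name, w)
            for name, scores in remaining.items()
            for w in weight_grid
        ]
        acc, name, w = max(candidates)
        if acc <= best:
            break  # no remaining model improves the ensemble
        combined = blend(combined, remaining.pop(name), w)
        chosen[name] = w
        best = acc
    return chosen, best


# Toy validation scores for three models on four two-class samples (made up).
labels = [0, 1, 0, 1]
model_scores = {
    "a": [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.55, 0.45]],
    "b": [[0.4, 0.6], [0.3, 0.7], [0.55, 0.45], [0.1, 0.9]],
    "c": [[0.4, 0.6], [0.6, 0.4], [0.7, 0.3], [0.55, 0.45]],
}
chosen, best = greedy_weighted_ensemble(model_scores, labels)
```

On this toy data, models `a` and `b` each misclassify a different sample, so blending them fixes both errors, while the weaker model `c` is left out because it does not improve the validation accuracy further.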

딥러닝 기반 BIM 부재 자동분류 학습모델의 성능 향상을 위한 Ensemble 모델 구축에 관한 연구 (Advanced Approach for Performance Improvement of Deep Learning-based BIM Elements Classification Model Using Ensemble Model)

  • 김시현;이원복;유영수;구본상
    • 한국BIM학회 논문집
    • /
    • Vol. 12, No. 2
    • /
    • pp.12-25
    • /
    • 2022
  • To increase the usability of Building Information Modeling (BIM) in construction projects, it is critical to ensure the interoperability of data between heterogeneous BIM software. The Industry Foundation Classes (IFC), an international ISO format, has been established for this purpose, but due to its structural complexity, geometric information and properties are not always transmitted correctly. Recently, deep learning approaches have been used to learn the shapes of BIM elements and thereby verify the mapping between BIM elements and IFC entities. These models performed well for elements with distinct shapes but were limited when the shapes were highly similar. This study proposes a method to improve the performance of element type classification by using an Ensemble model that leverages not only shape characteristics but also the relational information between individual BIM elements. The accuracy of the Ensemble model, which merges MVCNN and MLP, improved by 0.03 compared to the existing deep learning model that learned only shape information.

Adaptive boosting in ensembles for outlier detection: Base learner selection and fusion via local domain competence

  • Bii, Joash Kiprotich;Rimiru, Richard;Mwangi, Ronald Waweru
    • ETRI Journal
    • /
    • Vol. 42, No. 6
    • /
    • pp.886-898
    • /
    • 2020
  • Unusual data patterns or outliers can be generated because of human errors, incorrect measurements, or malicious activities. Detecting outliers is a difficult task that requires complex ensembles. An ideal outlier detection ensemble should consider the strengths of individual base detectors while carefully combining their outputs to create a strong overall ensemble and achieve unbiased accuracy with minimal variance. Selecting and combining the outputs of dissimilar base learners is a challenging task. This paper proposes a model that utilizes heterogeneous base learners. In the first phase, it adaptively boosts the outcomes of preceding learners by assigning weights and identifying high-performing learners based on their local domains; in the second phase, it carefully fuses their outcomes to improve overall accuracy. Ten benchmark datasets are used to train and test the proposed model. To investigate how accurately it separates outliers from inliers, the proposed model is tested and evaluated using accuracy metrics. The analyzed data are presented as crosstabs and percentages, followed by a descriptive method for synthesis and interpretation.

신용카드 불법현금융통 적발을 위한 축소된 앙상블 모형 (Illegal Cash Accommodation Detection Modeling Using Ensemble Size Reduction)

  • 이화경;한상범;지원철
    • 지능정보연구
    • /
    • Vol. 16, No. 1
    • /
    • pp.93-116
    • /
    • 2010
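  • An ensemble approach was used to develop a model for detecting illegal cash accommodation. Although illegal cash accommodation affects the profit and loss of domestic credit card companies and has recently become international, it has not been studied academically. Fraud detection models (FDMs) struggle to achieve good performance because of data imbalance, and ensembles that combine multiple models have been proposed as an alternative. It is already accepted that an ensemble outperforms a single model when the diversity of its member models is ensured, and recent studies indicate that selecting only suitable base models for the ensemble is preferable to using all trained base models. This paper uses a reduced ensemble technique for effective detection of illegal cash accommodation, selecting the base models that join the ensemble by means of accuracy and diversity measures. Diversity refers to the disagreement (or ambiguity) among the base models that make up the ensemble; considering the data imbalance inherent in FDMs, we focused on two aspects. First, we describe an over-sampling method for the minority class and an appropriate training method for securing diversity when drawing the training data. Second, focusing on the minority class, we modified existing diversity measures into more effective ones and introduced dynamic diversity computation with forward addition and backward elimination to evaluate the base models that join the ensemble. The learning algorithms used in the experiments were neural networks, decision trees, and logit regression, and performance was evaluated on both homogeneous and heterogeneous ensembles. In the experiments, the reduced ensemble showed no performance difference from the ensemble containing all base models for illegal cash accommodation detection. Because the reduced ensemble lowers the complexity of ensemble construction and eases implementation, it was shown to be a promising modeling approach for FDMs as well.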
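
The notion of diversity as disagreement among base models can be made concrete with a small sketch. It uses the standard pairwise disagreement measure and, as an assumption about what "focusing on the minority class" might look like, a variant restricted to minority-class samples; the paper's actual modified measures and its forward/backward selection procedure are not given in the abstract, and the predictions below are toy values.

```python
from itertools import combinations


def disagreement(pred_a, pred_b, mask=None):
    """Fraction of (selected) samples on which two models predict differently."""
    if mask is None:
        mask = [True] * len(pred_a)
    picked = [(a, b) for a, b, keep in zip(pred_a, pred_b, mask) if keep]
    return sum(a != b for a, b in picked) / len(picked)


def mean_pairwise_disagreement(predictions, mask=None):
    """Average disagreement over all pairs of base models."""
    pairs = list(combinations(predictions, 2))
    return sum(disagreement(a, b, mask) for a, b in pairs) / len(pairs)


# Toy labels (1 = fraud, the minority class) and toy base-model predictions.
labels = [0, 0, 0, 0, 1, 1]
predictions = [
    [0, 0, 0, 0, 1, 0],  # e.g. a neural network
    [0, 0, 0, 1, 0, 1],  # e.g. a decision tree
    [0, 0, 0, 0, 1, 1],  # e.g. a logit regression
]

minority_mask = [y == 1 for y in labels]
overall_diversity = mean_pairwise_disagreement(predictions)
minority_diversity = mean_pairwise_disagreement(predictions, minority_mask)
```

In this toy case the models disagree far more on the minority samples than overall, which is the kind of difference a minority-focused measure would bring out when ranking candidate base models.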

Intrusion Detection using Attribute Subset Selector Bagging (ASUB) to Handle Imbalance and Noise

  • Priya, A.Sagaya;Kumar, S.Britto Ramesh
    • International Journal of Computer Science & Network Security
    • /
    • Vol. 22, No. 5
    • /
    • pp.97-102
    • /
    • 2022
  • Network intrusion detection is becoming an increasing necessity for organizations and individuals alike. Detecting intrusions is one of the major components of preventing information compromise. Automated systems have been put to use because of the voluminous nature of the domain. The major challenge for automated models is the noise and data imbalance contained in network transactions. This work proposes an ensemble model, Attribute Subset Selector Bagging (ASUB), that can be used to handle noise and data imbalance effectively. The proposed model performs attribute-subset-based bag creation, which reduces the influence of noise. The constructed bagging model is heterogeneous in nature, which leads to effective imbalance handling. Experiments were conducted on the standard intrusion detection datasets KDD CUP 99, Kyoto 2006, and NSL KDD. The results show that the model performs effectively.

이종 모델간 앙상블을 이용한 수소충전소 다이어프램 압축기 고장 진단에 관한 연구 (A study on diagnosis of failure of hydrogen refueling station diaphragm compressor using heterogeneous model ensemble)

  • 홍영우;김성은;신덕식;유동영
    • 한국정보처리학회:학술대회논문집
    • /
    • Korea Information Processing Society 2023 Fall Conference
    • /
    • pp.681-684
    • /
    • 2023
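  • Although the share of hydrogen fuel cell vehicles in Korea increases every year, frequent outages of hydrogen refueling station equipment prevent drivers from charging their vehicles when needed. In this paper, we analyzed data collected at a hydrogen refueling station from January 1, 2023 to June 28, 2023 using an Ensemble model that detects abnormal patterns in a diaphragm compressor, one of the station's facilities, and found that, for the failures that occurred during that period, more than 10,000 abnormal patterns were detected starting two days before each failure.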

비균질체의 역학적 및 열적 거동에 관한 기초해석 (Fundamental Analysis for the Thermomechanical Behavior of Heterogeneous Media)

  • 박진무
    • 대한기계학회논문집
    • /
    • Vol. 8, No. 6
    • /
    • pp.599-603
    • /
    • 1984
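  • This study uses a consistent method of convolution over the spatial variables in place of the spatial, temporal, and ensemble averaging methods of the direct averaging approach. The method is more elementary than the functional analysis of the asymptotic approach, and it starts from a natural model of the mixture rather than an abstract model such as the equivalent homogeneous body of the theory of mixture continua. In addition, instead of the quadruple-integral operation that is the averaging method of reference (11), convolution and a volume distribution are combined to derive and examine the effective field equations for general heterogeneous media.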