• Title/Summary/Keyword: ensemble approach

Search Result 17, Processing Time 0.024 seconds

Heterogeneous Clustering Ensemble Method using Evolutionary Approach with Different Cluster Results (다양한 클러스터 결과에 의해 진화적 접근법을 사용하는 이종 클러스터링 앙상블 기법)

  • Yoon Hye-Sung;Ahn Sun-Young;Lee Sang-Ho;Cho Sung-Bum;Kim Ju-Han
    • Proceedings of the Korean Information Science Society Conference / 2006.06a / pp.16-18 / 2006
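  • Clustering algorithms from data mining are applied in bioinformatics to find important genetic and biological interactions without relying on prior information about the data set. However, the many algorithms of various types can produce different clustering results because of the diverse characteristics of bio-data and the assumptions of each experiment. This paper proposes a new method for producing high-quality clustering results that also suit the characteristics of bio-data sets. The method combines the results of several clustering algorithms with the problem of selecting the fittest traits in an evolutionary environment, which is the basic concept of genetic algorithms. We validate the proposed method on real data sets and present the optimal cluster results obtained in the experiments.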


Future inflow projection based on Bayesian optimization for hyper-parameters (하이퍼매개변수 베이지안 최적화 기법을 적용한 미래 유입량 예측)

  • Tran, Trung Duc;Kim, Jongho
    • Proceedings of the Korea Water Resources Association Conference / 2022.05a / pp.347-347 / 2022
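  • With the recent rapid progress of data science, various deep learning algorithms have been developed and applied to the water resources field. This study proposes a deep learning model, BO-LSTM, that combines a Long Short-Term Memory (LSTM) network with Bayesian optimization (BO) to project daily ensemble future dam inflow. The hyperparameters and loss function of BO-LSTM are trained and optimized through Bayesian optimization, and the BO approach optimized them quickly and with high accuracy (R = 0.92 and NSE = 0.85). In addition, the LSTM structures for predicting future dam inflow were divided into a forecasting model and a projection model, and the strengths and weaknesses of the two were analyzed. The results show that the data processing step is effective in improving training efficiency and reducing noise, and they confirm that the LSTM structure affects future predictions. The model was applied to the Soyang River basin for the period 2020-2100. Overall, the CMIP6 data indicate a 10% to 50% increase in future inflow, similar in magnitude to the increase in future precipitation. Reliable inflow estimates will help policy makers and operators in reservoir operation, planning, and management.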
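The two skill scores quoted in this abstract, the correlation coefficient R and the Nash-Sutcliffe efficiency (NSE), can be computed as in the minimal sketch below. The function names and the sample inflow values are illustrative assumptions, not taken from the paper, and R is assumed here to be the Pearson correlation.

```python
import math


def pearson_r(observed, simulated):
    """Pearson correlation between observed and simulated series."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_s = sum(simulated) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(observed, simulated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_s = sum((s - mean_s) ** 2 for s in simulated)
    return cov / math.sqrt(var_o * var_s)


def nash_sutcliffe(observed, simulated):
    """NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    mean_o = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    sst = sum((o - mean_o) ** 2 for o in observed)
    return 1.0 - sse / sst


# Hypothetical daily inflow values, for illustration only.
observed_inflow = [1.0, 2.0, 3.0, 4.0]
simulated_inflow = [1.1, 1.9, 3.2, 3.8]
print(pearson_r(observed_inflow, simulated_inflow))
print(nash_sutcliffe(observed_inflow, simulated_inflow))
```

An NSE of 1 means a perfect match, while an NSE of 0 means the simulation is no better than predicting the observed mean.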


Harnessing Deep Learning for Abnormal Respiratory Sound Detection (이상 호흡음 탐지를 위한 딥러닝 활용)

  • Gyurin Byun;Huigyu Yang;Hyunseung Choo
    • Proceedings of the Korea Information Processing Society Conference / 2023.11a / pp.641-643 / 2023
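  • Automatic analysis of respiratory sounds using deep learning (DL) plays a pivotal role in the early diagnosis of lung diseases. However, current DL methods are limited because they often examine the spatial and temporal characteristics of respiratory sounds separately. This study proposes a new DL framework that captures spatial features through convolution operations and exploits the spatio-temporal correlations of those features using a temporal convolutional network. The proposed framework integrates convolutional networks within an ensemble learning approach, which greatly improves the accuracy of detecting respiratory abnormalities and diseases in lung sound recordings. Experiments on the well-known ICBHI 2017 challenge data set show that the proposed framework outperforms the comparison models on the 4-class task for detecting respiratory abnormalities and diseases. In particular, in terms of the score metric that represents sensitivity and specificity, it shows improvements of up to 45.91% and 14.1% on the binary and multi-class respiratory abnormality detection tasks, respectively. These results highlight the marked advantage of our method over existing techniques and show its potential to drive future innovation in respiratory healthcare technology.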
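For readers unfamiliar with the metrics named in this abstract, the sketch below computes sensitivity and specificity for a binary abnormal-versus-normal task. It also averages the two into a single score, which is an assumption about how the combined score is formed rather than something the abstract states. The function name and the labels are hypothetical.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels.

    Assumes both the positive and the negative class occur in y_true.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn)  # share of abnormal cases detected
    specificity = tn / (tn + fp)  # share of normal cases left unflagged
    return sensitivity, specificity


# Hypothetical labels: 1 = abnormal breathing cycle, 0 = normal.
true_labels = [1, 1, 1, 0, 0]
predicted_labels = [1, 0, 1, 0, 1]
se, sp = sensitivity_specificity(true_labels, predicted_labels)
combined_score = (se + sp) / 2  # assumed averaging of the two metrics
print(se, sp, combined_score)
```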

Comparison between Uncertainties of Cultivar Parameter Estimates Obtained Using Error Calculation Methods for Forage Rice Cultivars (오차 계산 방식에 따른 사료용 벼 품종의 품종모수 추정치 불확도 비교)

  • Young Sang Joh;Shinwoo Hyun;Kwang Soo Kim
    • Korean Journal of Agricultural and Forest Meteorology / v.25 no.3 / pp.129-141 / 2023
  • Crop models have been used to predict yield under diverse environmental and cultivation conditions, which can support decisions on the management of forage crops. Cultivar parameters are one of the required inputs to crop models, representing the genetic properties of a given forage cultivar. The objective of this study was to compare calibration and ensemble approaches in order to minimize the uncertainty of crop yield estimates using the SIMPLE crop model. Cultivar parameters were calibrated using Log-likelihood (LL) and the Generic Composite Similarity Measure (GCSM) as objective functions for the Metropolis-Hastings (MH) algorithm. In total, 20 sets of cultivar parameters were generated for each method. Two types of ensemble approach were examined. The first (Eem) was the average of the model outputs obtained with the individual parameter sets. The second (Epm) was the model output obtained with the cultivar parameters averaged over the 20 sets. Comparisons were made for each cultivar and each error calculation method. 'Jowoo' and 'Yeongwoo', forage rice cultivars used in Korea, were subject to the parameter calibration. Yield data were obtained from experiment fields at Suwon, Jeonju, Naju and Iksan. Data for 2013, 2014 and 2016 were used for parameter calibration. For validation, yield data reported from 2016 to 2018 at Suwon were used. Initial calibration indicated that genetic coefficients obtained by LL were distributed in a narrower range than those obtained by GCSM. A two-sample t-test comparing the ensemble approaches found no significant difference between them. The uncertainty of GCSM can be neutralized by adjusting the acceptance probability. The second ensemble method (Epm) indicates that the uncertainty can be reduced with less computation using an ensemble approach.
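The difference between the two ensemble types can be illustrated with a minimal sketch. The toy model below is a hypothetical stand-in and is not the SIMPLE crop model, and the parameter names and values are assumptions chosen only to make the arithmetic easy to follow.

```python
def toy_yield_model(params, radiation):
    """Hypothetical stand-in for a crop model (NOT the SIMPLE model)."""
    radiation_use_efficiency, harvest_index = params
    return radiation_use_efficiency * harvest_index * radiation


def ensemble_of_outputs(parameter_sets, radiation):
    """Eem-style: run the model once per parameter set, then average the outputs."""
    outputs = [toy_yield_model(p, radiation) for p in parameter_sets]
    return sum(outputs) / len(outputs)


def output_of_mean_parameters(parameter_sets, radiation):
    """Epm-style: average the parameter sets first, then run the model once."""
    n = len(parameter_sets)
    n_params = len(parameter_sets[0])
    mean_params = tuple(sum(p[i] for p in parameter_sets) / n for i in range(n_params))
    return toy_yield_model(mean_params, radiation)


# Two hypothetical calibrated parameter sets (the study used 20 per method).
parameter_sets = [(1.0, 0.4), (2.0, 0.6)]
print(ensemble_of_outputs(parameter_sets, radiation=100.0))
print(output_of_mean_parameters(parameter_sets, radiation=100.0))
```

The Epm-style estimate needs a single model run regardless of how many parameter sets were calibrated, which matches the abstract's remark about reduced computation. For a model that is nonlinear in its parameters, the two estimates generally differ.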

Predicting the Baltic Dry Bulk Freight Index Using an Ensemble Neural Network Model (통합적인 인공 신경망 모델을 이용한 발틱운임지수 예측)

  • SU MIAO
    • Korea Trade Review / v.48 no.2 / pp.27-43 / 2023
  • The maritime industry plays an increasingly vital part in global economic expansion. In particular, the Baltic Dry Index (BDI) is highly correlated with global commodity prices, so research on BDI prediction is growing in importance. However, because the global situation has become more volatile, predicting the BDI accurately has become methodologically more difficult. This paper proposes an integrated machine-learning strategy for accurately forecasting BDI trends. The study combines the benefits of a convolutional neural network (CNN) and a long short-term memory (LSTM) neural network for prediction. We collected daily BDI data covering more than 27 years for model fitting. The findings indicate that the CNN successfully extracts features from the BDI data and that, on this basis, the LSTM predicts the BDI accurately. The model's R2 reaches 94.7 percent. Our research offers a novel, machine-learning-integrated approach to research on shipping economic indicators. In addition, this study provides a foundation for risk management decision-making in shipping institutions and financial investment.

Crop Yield Estimation Utilizing Feature Selection Based on Graph Classification (그래프 분류 기반 특징 선택을 활용한 작물 수확량 예측)

  • Ohnmar Khin;Sung-Keun Lee
    • The Journal of the Korea institute of electronic communication sciences / v.18 no.6 / pp.1269-1276 / 2023
  • Crop yield estimation is essential for meeting strong global food demand, and it depends on numerous factors such as soil, rain, climate, atmosphere, and their interactions. Climate change affects farm yields. We use a dataset containing temperature, rainfall, humidity, and other variables. The current research focuses on feature selection with a variety of classifiers to assist farmers and agriculturalists. Crop yield estimation with the feature selection approach reaches 96% accuracy, which shows that feature selection affects a machine learning model's performance. Additionally, the current graph classifier achieves 81.5%. Finally, without feature selection the random forest regressor reaches 78% accuracy and the decision tree regressor 67%. The contribution of this research is to report experimental results, with and without feature selection, for the ten algorithms considered. These findings support learners and students in choosing appropriate models for crop classification studies.

Corporate Bankruptcy Prediction Model using Explainable AI-based Feature Selection (설명가능 AI 기반의 변수선정을 이용한 기업부실예측모형)

  • Gundoo Moon;Kyoung-jae Kim
    • Journal of Intelligence and Information Systems / v.29 no.2 / pp.241-265 / 2023
  • A corporate insolvency prediction model serves as a vital tool for objectively monitoring the financial condition of companies. It enables timely warnings, facilitates responsive actions, and supports the formulation of effective management strategies to mitigate bankruptcy risks and enhance performance. Investors and financial institutions utilize default prediction models to minimize financial losses. As the interest in utilizing artificial intelligence (AI) technology for corporate insolvency prediction grows, extensive research has been conducted in this domain. However, there is an increasing demand for explainable AI models in corporate insolvency prediction, emphasizing interpretability and reliability. The SHAP (SHapley Additive exPlanations) technique has gained significant popularity and has demonstrated strong performance in various applications. Nonetheless, it has limitations such as computational cost, processing time, and scalability concerns based on the number of variables. This study introduces a novel approach to variable selection that reduces the number of variables by averaging SHAP values from bootstrapped data subsets instead of using the entire dataset. This technique aims to improve computational efficiency while maintaining excellent predictive performance. To obtain classification results, we aim to train random forest, XGBoost, and C5.0 models using carefully selected variables with high interpretability. The classification accuracy of the ensemble model, generated through soft voting as the goal of high-performance model design, is compared with the individual models. The study leverages data from 1,698 Korean light industrial companies and employs bootstrapping to create distinct data groups. Logistic Regression is employed to calculate SHAP values for each data group, and their averages are computed to derive the final SHAP values. The proposed model enhances interpretability and aims to achieve superior predictive performance.
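Soft voting, as mentioned in this abstract, averages the class probabilities predicted by each model and picks the class with the highest mean. The sketch below shows the idea in plain Python. The probability values are made up for illustration and do not come from the study, and the three lists merely stand in for the outputs of models such as random forest, XGBoost, and C5.0.

```python
def soft_vote(model_probabilities):
    """Average per-class probabilities across models and pick the top class.

    model_probabilities: one entry per model, each a list of per-sample
    probability rows such as [p_class0, p_class1].
    """
    n_models = len(model_probabilities)
    n_samples = len(model_probabilities[0])
    n_classes = len(model_probabilities[0][0])
    averaged = [
        [sum(m[i][c] for m in model_probabilities) / n_models for c in range(n_classes)]
        for i in range(n_samples)
    ]
    labels = [max(range(n_classes), key=lambda c: row[c]) for row in averaged]
    return averaged, labels


# Hypothetical predicted probabilities [healthy, insolvent] for two companies.
model_a = [[0.9, 0.1], [0.4, 0.6]]
model_b = [[0.6, 0.4], [0.3, 0.7]]
model_c = [[0.2, 0.8], [0.6, 0.4]]
averaged, labels = soft_vote([model_a, model_b, model_c])
print(averaged)
print(labels)
```

Unlike majority (hard) voting, soft voting lets a confident model outweigh two hesitant ones, which is why it needs calibrated probabilities rather than class labels from each model.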