• Title/Summary/Keyword: machine learning ensemble (머신러닝 앙상블)

Development of a Type 2 Diabetes Prediction Algorithm Based on Big Data (빅데이터 기반 2형 당뇨 예측 알고리즘 개발)

  • Hyun Sim;HyunWook Kim
    • The Journal of the Korea institute of electronic communication sciences / v.18 no.5 / pp.999-1008 / 2023
  • Early prediction of chronic diseases such as diabetes is an important issue, and improving the accuracy of diabetes prediction is especially important. Various machine learning and deep learning-based methodologies have been introduced for diabetes prediction, but these techniques need large amounts of data to outperform other methodologies, and their complex models make training costly. In this study, we aim to verify the claim that a DNN trained on the Pima dataset with k-fold cross-validation reduces the efficiency of diabetes diagnosis models. Machine learning classification methods such as decision trees, SVM, random forests, logistic regression, KNN, and various ensemble techniques were compared to determine which algorithm produces the best prediction results. After all classification models were trained and tested, the proposed system achieved its best results with the XGBoost classifier combined with the ADASYN method: an accuracy of 81%, an F1 score of 0.81, and an AUC of 0.84. Additionally, a domain adaptation method was implemented to demonstrate the versatility of the proposed system. An explainable AI approach using the LIME and SHAP frameworks was implemented to understand how the model arrives at its final prediction.
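The evaluation in the abstract above rests on k-fold cross-validation and on accuracy and F1. A minimal sketch of both in plain Python is given here. The function names (`k_fold_indices`, `accuracy_f1`) and the toy labels are illustrative assumptions, not the authors' code, and the classifier itself (XGBoost with ADASYN oversampling in the paper) is deliberately left out.

```python
from typing import List, Tuple


def k_fold_indices(n_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k contiguous folds; return (train, test) index pairs."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # spread the remainder over the first folds
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


def accuracy_f1(y_true: List[int], y_pred: List[int]) -> Tuple[float, float]:
    """Accuracy and F1 for binary labels, treating 1 as the positive class."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0  # F1 = 2TP / (2TP + FP + FN)
    return accuracy, f1
```

In a full pipeline, each (train, test) pair would be used to fit a model on the training indices and score it on the held-out fold, with the per-fold metrics averaged at the end. Any oversampling step would normally be applied to the training fold only, so that synthetic samples do not leak into the test fold.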

Modeling and Selecting Optimal Features for Machine Learning Based Detections of Android Malwares (머신러닝 기반 안드로이드 모바일 악성 앱의 최적 특징점 선정 및 모델링 방안 제안)

  • Lee, Kye Woong;Oh, Seung Taek;Yoon, Young
    • KIPS Transactions on Software and Data Engineering / v.8 no.11 / pp.427-432 / 2019
  • In this paper, we propose three approaches to modeling Android malware. The first relies on human security experts to meticulously select feature sets. In the second, we choose the 300 features with the highest importance among the top 99% of features in terms of occurrence rate. The third combines multiple models and identifies malware through weighted voting. In addition, we applied a novel method of eliminating permission information, which used to be regarded as a critical factor for distinguishing malware. With our carefully generated feature sets and weighted voting by the ensemble algorithm, we reached a highest malware detection accuracy of 97.8%. We also verified that discarding the permission information led to improvements in the false positive and false negative rates.
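The third approach above, weighted voting over several models, can be sketched in a few lines. This is an illustrative sketch only: the abstract does not say how the weights are derived, so the weights here are assumed to be something like each model's validation accuracy, and `weighted_vote` is a hypothetical helper name.

```python
from typing import Dict, Sequence


def weighted_vote(predictions: Sequence[int], weights: Sequence[float]) -> int:
    """Return the label with the largest total weight across models.

    predictions[i] is the label from model i (e.g. 1 = malware, 0 = benign);
    weights[i] is that model's voting weight (assumed, e.g. validation accuracy).
    """
    if len(predictions) != len(weights) or not predictions:
        raise ValueError("need one weight per prediction and at least one model")
    totals: Dict[int, float] = {}
    for label, weight in zip(predictions, weights):
        totals[label] = totals.get(label, 0.0) + weight
    # On an exact tie, the label that was seen first wins.
    return max(totals, key=totals.get)
```

For example, with predictions `[1, 0, 0]` and weights `[0.9, 0.3, 0.4]`, the single high-weight model outvotes the two weaker ones and the result is `1`. With equal weights the function reduces to ordinary majority voting.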

Technology of Lessons Learned Analysis using Artificial intelligence: Focused on the 'L2-OODA Ensemble Algorithm' (인공지능형 전훈분석기술: 'L2-OODA 앙상블 알고리즘'을 중심으로)

  • Yang, Seong-sil;Shin, Jin
    • Convergence Security Journal / v.21 no.2 / pp.67-79 / 2021
  • Lessons Learned (LL) is a military term covering all activities that promote future development by identifying problems and areas needing improvement in education and practice in the field of warfare development. In this paper, we focus on presenting actual examples and applying AI analysis and inference techniques to solve problems revealed in promoting LL activities, such as lengthy analysis, budget problems, and the need for expertise. AI legal advice services built on cognitive computing technologies, which are already in practical use, were judged to be the best examples for solving the problems of LL. This paper presents intelligent LL inference techniques that utilize AI. To this end, we explore theoretical backgrounds such as LL analysis definitions and examples, the evolution of AI into machine learning, and cognitive computing, and apply them to new technologies in the defense sector using the newly proposed L2-OODA ensemble algorithm, in order to contribute to the improvement and optimization of existing power.

Development of an Ensemble Prediction Model for Lateral Deformation of Retaining Wall Under Construction (시공 중 흙막이 벽체 수평변위 예측을 위한 앙상블 모델 개발)

  • Seo, Seunghwan;Chung, Moonkyung
    • Journal of the Korean Geotechnical Society / v.39 no.4 / pp.5-17 / 2023
  • The advancement in large-scale underground excavation in urban areas necessitates monitoring and prediction technologies that can pre-emptively mitigate risk factors at construction sites. Traditionally, the deformation of retaining walls induced by excavation has been predicted by two methods: empirical and numerical analysis. Recent progress in artificial intelligence technology has led to the development of predictive models using machine learning techniques. This study developed a model for predicting the deformation of a retaining wall under construction using a boosting-based algorithm and an ensemble model with outstanding predictive power and efficiency. A database was established from diverse data gathered across the design, construction, and maintenance process of an underground retaining wall project. Based on these data, a learning model was created, and its performance was evaluated. The boosting and ensemble models demonstrated that wall deformation could be accurately predicted. In addition, it was confirmed that prediction results reflecting the characteristics of the actual construction process can be presented using data collected from ground measurements. The predictive model developed in this study is expected to be used to evaluate and monitor the stability of retaining walls under construction.

A Study on the Hydrological Quantitative Precipitation Forecast(HQPF) based on Machine Learning for Rainfall Impact Forecasting (호우 영향예보를 위한 머신러닝 기반의 수문학적 정량강우예측(HQPF) 연구)

  • Choo, Kyung-Su;Shin, Yoon-Hu;Kim, Sung-Min;Jee, Yongkeun;Lee, Young-Mi;Kang, Dong-Ho;Kim, Byung-Sik
    • Proceedings of the Korea Water Resources Association Conference / 2022.05a / pp.63-63 / 2022
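  • Weather forecast data are a vital resource for preventing and preparing for potential disasters. Through its neighborhood forecast, the Korea Meteorological Administration (KMA) provides very-short-range forecasts at 5 km spatial resolution and 1-hour intervals, and short-range quantitative precipitation forecasts (QPF) at 6-hour intervals. However, such forecast data are of limited use for hydrological analysis of events such as localized heavy rainfall, in which rainfall varies sharply in space and time. Various studies are under way not only to improve the spatiotemporal resolution needed to use forecast data in hydrology, but also to improve the predictive performance of massive meteorological and climate datasets. In this study, we produced Hydrological Quantitative Precipitation Forecast (HQPF) information by applying machine learning to the KMA's Local ENsemble prediction System (LENS), observations from the Automated Synoptic Observing System (ASOS) and the Automatic Weather System (AWS), and the neighborhood forecast. Through preprocessing, all data were converted to the same temporal and spatial resolution, and the predictor variables for machine learning were derived through factor analysis of candidate predictors. XGBoost (eXtreme Gradient Boosting) was adopted as the machine learning method in consideration of processing speed and scalability, and the probability matching (PM) method was applied to raise prediction accuracy for heavy rainfall. To evaluate the performance of the resulting HQPF, verification was carried out on 14 heavy rainfall events that occurred in 2020, divided into typhoon and non-typhoon types.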
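The probability matching (PM) step mentioned above can be illustrated with a small sketch. The abstract does not give the authors' exact procedure, so this follows the commonly described form of PM as an assumption: keep the spatial pattern of the ensemble mean, but replace its values with amounts drawn from the pooled distribution of all members. The function name and the choice of taking the middle value of each block of pooled values are illustrative assumptions.

```python
from typing import List, Sequence


def probability_matching(members: Sequence[Sequence[float]]) -> List[float]:
    """Probability-matched mean of an ensemble of rainfall fields.

    members[i][j] is the rainfall forecast by member i at grid point j.
    """
    n_members = len(members)
    if n_members == 0 or any(len(m) != len(members[0]) for m in members):
        raise ValueError("need at least one member and equal-length fields")
    n_points = len(members[0])

    # Ensemble mean: supplies the spatial ranking (where the heaviest rain falls).
    ens_mean = [sum(m[j] for m in members) / n_members for j in range(n_points)]

    # Pool every value from every member, largest first, then thin the pool to
    # one value per grid point (assumption: the middle value of each block of
    # n_members consecutive values).
    pooled = sorted((v for m in members for v in m), reverse=True)
    matched = pooled[n_members // 2 :: n_members]

    # Give the largest pooled amount to the point with the largest mean, and so on.
    order = sorted(range(n_points), key=lambda j: ens_mean[j], reverse=True)
    result = [0.0] * n_points
    for rank, j in enumerate(order):
        result[j] = matched[rank]
    return result
```

For three members `[20, 5, 0, 0]`, `[10, 20, 0, 0]`, and `[15, 10, 5, 0]`, the ensemble mean peaks at 15, whereas the probability-matched field is `[20, 10, 0, 0]`. This shows why PM is attractive for heavy rainfall: plain averaging smooths out peak intensities, and PM restores amplitudes from the members' own distribution.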

Detecting Adversarial Example Using Ensemble Method on Deep Neural Network (딥뉴럴네트워크에서의 적대적 샘플에 관한 앙상블 방어 연구)

  • Kwon, Hyun;Yoon, Joonhyeok;Kim, Junseob;Park, Sangjun;Kim, Yongchul
    • Convergence Security Journal / v.21 no.2 / pp.57-66 / 2021
  • Deep neural networks (DNNs) provide excellent performance for image, speech, and pattern recognition. However, DNNs sometimes misrecognize certain adversarial examples. An adversarial example is a sample created by adding optimized noise to the original data, which causes the DNN to misclassify it although nothing looks wrong to the human eye. Therefore, studies on defense against adversarial example attacks are required. In this paper, we experimentally analyzed the detection success rate for adversarial examples by adjusting various parameters. The performance of the ensemble defense method was analyzed against three adversarial example attack methods: the fast gradient sign method, the DeepFool method, and the Carlini & Wagner method. We used MNIST as experimental data and TensorFlow as the machine learning library. The performance analysis covered the three attack methods, the threshold, the number of models, and random noise. As a result, with 7 models and a threshold of 1, the detection rate for adversarial examples was 98.3%, while an accuracy of 99.2% on the original samples was maintained.
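The role of the threshold and the number of models can be illustrated with a simple disagreement-based detector. This is only a sketch under a stated assumption: the abstract does not spell out the detection rule, so the code assumes an input is flagged when the number of models that disagree with the majority label reaches the threshold. The function name is hypothetical.

```python
from collections import Counter
from typing import Sequence


def is_adversarial(labels: Sequence[int], threshold: int) -> bool:
    """Flag an input as adversarial from the labels predicted by several models.

    labels[i] is the class predicted by model i for the same input.
    Assumed rule: flag when at least `threshold` models disagree with the
    majority label.
    """
    if not labels:
        raise ValueError("need predictions from at least one model")
    majority_count = Counter(labels).most_common(1)[0][1]
    disagreements = len(labels) - majority_count
    return disagreements >= threshold
```

Under this assumed rule, a threshold of 1 with 7 models means that a single dissenting model is enough to flag the input. Lower thresholds catch more adversarial examples at the cost of more false alarms on clean inputs, which is the trade-off the parameter study in the abstract explores.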

Modeling and Selecting Optimal Features for Machine Learning Based Detections of Android Malwares (머신러닝 기반 악성 안드로이드 모바일 앱의 최적특징점 선정 및 모델링 방안 제안)

  • Lee, Kye Woong;Oh, Seung Taek;Yoon, Young
    • Proceedings of the Korea Information Processing Society Conference / 2019.05a / pp.164-167 / 2019
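  • As Android's share among mobile operating systems has grown, most mobile malware threats now arise on Android. However, as both benign and malicious apps evolve, approaches that judge maliciousness from a single type of feature, such as permissions, have run into validity problems; this paper seeks to overcome them through diverse feature extraction and machine learning. We use the static analysis tool Androguard to extract, from APK files, five types of features required for execution, which serve as the characteristics of the training data. We also present three modeling methods based on the extracted key features. The first builds a model on a combination of 132 features hand-picked by security experts. The second takes the 8,004 features in the top 99% by occurrence frequency across 7,000 training apps, uses a random forest classifier to select the 300 with the highest feature importance, and builds a model on them. The last combines multiple models trained on the 300 features into a single weighted voting model. Finally, using the weighted voting ensemble model improved accuracy to 97 percent and the false positive rate to 1.6%.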
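The frequency-based first stage of the second method can be sketched as follows. This is an illustrative sketch, not the authors' pipeline: it only ranks features by how many apps they occur in and keeps the most frequent ones, the subsequent random forest importance ranking is omitted, and the function name and feature strings are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Set


def most_frequent_features(apps: Iterable[Set[str]], keep: int) -> List[str]:
    """Return the `keep` features that occur in the largest number of apps.

    Each element of `apps` is the set of features extracted from one app.
    Ties are broken alphabetically so the result is deterministic.
    """
    counts: Counter = Counter()
    for features in apps:
        counts.update(set(features))  # count each feature at most once per app
    ranked = sorted(counts, key=lambda name: (-counts[name], name))
    return ranked[:keep]
```

The surviving features would then be passed to a classifier that reports feature importances, and only the highest-ranked ones (300 in the paper) would be kept for the final models.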

The Effect of Input Variables Clustering on the Characteristics of Ensemble Machine Learning Model for Water Quality Prediction (입력자료 군집화에 따른 앙상블 머신러닝 모형의 수질예측 특성 연구)

  • Park, Jungsu
    • Journal of Korean Society on Water Environment / v.37 no.5 / pp.335-343 / 2021
  • Water quality prediction is essential for the proper management of water supply systems. Increased suspended sediment concentration (SSC) has various effects on water supply systems, such as increased treatment cost, and consequently there have been various efforts to develop a model for predicting SSC. However, SSC is affected by both the natural and anthropogenic environment, which makes it challenging to predict. Recently, advanced machine learning models have increasingly been used for water quality prediction. This study developed an ensemble machine learning model to predict SSC using the XGBoost (XGB) algorithm. The observed discharge (Q) and SSC at two field monitoring stations were used to develop the model. The input variables were clustered into two groups with low and high ranges of Q using the k-means clustering algorithm. Each group of data was then used separately to optimize XGB (Model 1). The model performance was compared with that of the XGB model using the entire data (Model 2). The models were evaluated by the root mean squared error-observation standard deviation ratio (RSR) and the root mean squared error. The RSR values were 0.51 and 0.57 at the two monitoring stations for Model 2, while the model performance improved to RSR 0.46 and 0.55, respectively, for Model 1.
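The two evaluation metrics named above are straightforward to compute. A minimal sketch follows, using the usual definition of RSR as the RMSE divided by the standard deviation of the observations; the function names and the toy numbers in the usage note are illustrative, not data from the study.

```python
import math
from typing import Sequence


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error between observations and predictions."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_error = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared_error / len(observed))


def rsr(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """RMSE-observation standard deviation ratio; 0 is a perfect fit."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    squared_dev = sum((o - mean_obs) ** 2 for o in observed)
    if squared_dev == 0:
        raise ValueError("observations have zero variance; RSR is undefined")
    # The 1/n factors of RMSE and the standard deviation cancel in the ratio.
    return math.sqrt(squared_error) / math.sqrt(squared_dev)
```

With observations `[1, 2, 3, 4, 5]` and predictions `[1, 2, 3, 4, 7]`, the squared error is 4 and the squared deviation of the observations is 10, so the RSR is the square root of 0.4, about 0.63. Because RSR is normalized by the spread of the observations, values from stations with different SSC ranges can be compared directly, which plain RMSE does not allow.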

Discerning the intensity of precipitation through acoustic and vibrational analysis of rainfall via XGBoost algorithm (XGBoost 알고리즘을 활용한 강우의 음향 및 진동 분석 기반의 강우강도 산정)

  • Seunghyun Hwang;Jinwook Lee;Hyeon-Joon Kim;Jongyun Byun;Changhyun Jun
    • Proceedings of the Korea Water Resources Association Conference / 2023.05a / pp.209-209 / 2023
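  • This study proposes a methodology for estimating rainfall intensity from the acoustic and vibration signals generated during rainfall. First, acoustic and vibration signals were collected in actual rainy conditions from an observation device consisting of a Raspberry Pi, a condenser microphone, and an accelerometer. The vibration signal measured by the accelerometer was used for binary classification of rain versus no rain, and the Short-Time Fourier Transform was applied to the acoustic signal from periods judged to be rainy; rainfall intensity was then estimated from features such as the mean and standard deviation of the magnitude in the frequency domain and the peak frequency. The XGBoost algorithm, an ensemble-based machine learning model, was used for this purpose, and the estimates were compared with and evaluated against rainfall intensity observed by an optical disdrometer. Acoustic signal lengths of 1 second, 10 seconds, and 1 minute were used in the estimation, and acoustic information from rain-free periods was used to remove background noise. Finally, the effectiveness of the proposed methodology was assessed by comparing estimation performance according to whether the rain/no-rain binary classification was performed first, the acoustic signal length, and the noise removal method.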
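The feature extraction step described above can be sketched with a small short-time Fourier transform. This is an illustrative sketch, not the authors' code: it uses a naive DFT with a Hann window so that it runs without third-party libraries (a real pipeline would more likely rely on an FFT routine), it interprets the peak frequency as the frequency bin with the largest magnitude, and all names and parameter values are assumptions.

```python
import cmath
import math
from typing import Dict, List, Sequence


def magnitude_spectrum(frame: Sequence[float]) -> List[float]:
    """Magnitudes of the DFT bins 0..n/2 of a Hann-windowed frame."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
                for i, x in enumerate(frame)]
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(windowed)))
        for k in range(n // 2 + 1)
    ]


def stft_features(signal: Sequence[float], sample_rate: float,
                  frame_len: int, hop: int) -> List[Dict[str, float]]:
    """Per-frame mean magnitude, magnitude standard deviation, and peak frequency."""
    features = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        mags = magnitude_spectrum(signal[start:start + frame_len])
        mean = sum(mags) / len(mags)
        std = math.sqrt(sum((m - mean) ** 2 for m in mags) / len(mags))
        peak_bin = max(range(len(mags)), key=mags.__getitem__)
        features.append({
            "mean": mean,
            "std": std,
            "peak_hz": peak_bin * sample_rate / frame_len,  # bin index to Hz
        })
    return features
```

Each frame's feature dictionary could then be aggregated over the chosen signal length (1 second, 10 seconds, or 1 minute in the study) and passed to a regressor. The frame length sets the frequency resolution: with a sample rate of 400 Hz and 80-sample frames, bins are 5 Hz apart.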

Simulation for Power Efficiency Optimization of Air Compressor Using Machine Learning Ensemble (머신러닝 앙상블을 활용한 공압기의 전력 효율 최적화 시뮬레이션)

  • Juhyeon Kim;Moonsoo Jang;Jieun Choi;Yoseob Heo;Hyunsang Chung;Soyoung Park
    • Journal of the Korean Society of Industry Convergence / v.26 no.6_3 / pp.1205-1213 / 2023
  • This study delves into methods for enhancing the power efficiency of air compressor systems, with the primary objective of significantly impacting industrial energy consumption and environmental preservation. The paper scrutinizes Shinhan Airro Co., Ltd.'s power efficiency optimization technology and employs machine learning ensemble models to simulate power efficiency optimization. The results indicate that Shinhan Airro's optimization system led to a notable 23.5% increase in power efficiency. Nonetheless, the study's simulations, utilizing machine learning ensemble techniques, reveal the potential for a further 51.3% increase in power efficiency. By continually exploring and advancing these methodologies, this research introduces a practical approach for identifying optimization points through data-driven simulations using machine learning ensembles.