• Title/Summary/Keyword: machine learning ensemble (머신러닝 앙상블)

Search Result 72

Simulation for Power Efficiency Optimization of Air Compressor Using Machine Learning Ensemble (머신러닝 앙상블을 활용한 공압기의 전력 효율 최적화 시뮬레이션)

  • Juhyeon Kim;Moonsoo Jang;Jieun Choi;Yoseob Heo;Hyunsang Chung;Soyoung Park
    • Journal of the Korean Society of Industry Convergence / v.26 no.6_3 / pp.1205-1213 / 2023
  • This study delves into methods for enhancing the power efficiency of air compressor systems, with the primary objective of significantly impacting industrial energy consumption and environmental preservation. The paper scrutinizes Shinhan Airro Co., Ltd.'s power efficiency optimization technology and employs machine learning ensemble models to simulate power efficiency optimization. The results indicate that Shinhan Airro's optimization system led to a notable 23.5% increase in power efficiency. Nonetheless, the study's simulations, utilizing machine learning ensemble techniques, reveal the potential for a further 51.3% increase in power efficiency. By continually exploring and advancing these methodologies, this research introduces a practical approach for identifying optimization points through data-driven simulations using machine learning ensembles.

Parallel Network Model of Abnormal Respiratory Sound Classification with Stacking Ensemble

  • Nam, Myung-woo;Choi, Young-Jin;Choi, Hoe-Ryeon;Lee, Hong-Chul
    • Journal of the Korea Society of Computer and Information / v.26 no.11 / pp.21-31 / 2021
  • As the COVID-19 pandemic rapidly changes healthcare around the globe, the need for smart healthcare that allows for remote diagnosis is increasing. The current classification of respiratory diseases is costly and requires a face-to-face visit with a skilled medical professional, so the pandemic significantly hinders monitoring and early diagnosis. Therefore, the ability to accurately classify and diagnose respiratory sounds using deep learning-based AI models is essential to modern medicine as a remote alternative to the current stethoscope. In this study, we propose a deep learning-based respiratory sound classification model using data collected from medical experts. The sound data were preprocessed with BandPassFilter, and the relevant respiratory audio features were extracted as Log-Mel spectrograms and Mel-frequency cepstral coefficients (MFCC). Subsequently, a parallel CNN model was trained on these two inputs using stacking ensemble techniques combined with various machine learning classifiers to efficiently classify and detect abnormal respiratory sounds with high accuracy. The proposed model classified abnormal respiratory sounds with an accuracy of 96.9%, which is approximately 6.1% higher than the classification accuracy of the baseline model.

Development of an Ensemble-Based Multi-Region Integrated Odor Concentration Prediction Model (앙상블 기반의 악취 농도 다지역 통합 예측 모델 개발)

  • Seong-Ju Cho;Woo-seok Choi;Sang-hyun Choi
    • Journal of Intelligence and Information Systems / v.29 no.3 / pp.383-400 / 2023
  • Air pollution-related diseases are escalating worldwide, with the World Health Organization (WHO) estimating approximately 7 million annual deaths in 2022. The rapid expansion of industrial facilities, increased emissions from various sources, and uncontrolled release of odorous substances have brought air pollution to the forefront of societal concerns. In South Korea, odor is categorized as an independent environmental pollutant, alongside air and water pollution, directly impacting the health of local residents by causing discomfort and aversion. However, the current odor management system in Korea remains inadequate, necessitating improvements. This study aims to enhance the odor management system by analyzing 1,010,749 data points collected from odor sensors located in Osong, Chungcheongbuk-do, using an Ensemble-Based Multi-Region Integrated Odor Concentration Prediction Model. The research results demonstrate that the model based on the XGBoost algorithm exhibited superior performance, with an RMSE of 0.0096, significantly outperforming the single-region model (0.0146) with a 51.9% reduction in mean error size. This underscores the potential for increasing data volume, improving accuracy, and enabling odor prediction in diverse regions using a unified model through the standardization of odor concentration data collected from various regions.

An Improvement Study on the Hydrological Quantitative Precipitation Forecast (HQPF) for Rainfall Impact Forecasting (호우 영향예보를 위한 수문학적 정량강우예측(HQPF) 개선 연구)

  • Yoon Hu Shin;Sung Min Kim;Yong Keun Jee;Young-Mi Lee;Byung-Sik Kim
    • Journal of Korean Society of Disaster and Security / v.15 no.4 / pp.87-98 / 2022
  • In recent years, localized heavy rainfall, in which a large amount of rain falls in a short period of time, has increasingly caused flood damage. To prevent such damage, the Hydrological Quantitative Precipitation Forecast (HQPF) was developed using the Local ENsemble prediction System (LENS) provided by the Korea Meteorological Administration (KMA), together with machine learning and Probability Matching (PM) techniques using digital forecast data. HQPF is produced as heavy-rainfall impact information to prepare for flood damage caused by localized heavy rainfall, but it tends to overestimate low rainfall intensities. In this study, we improved HQPF by extending the period of the machine learning data, analyzing ensemble techniques, and changing the Probability Matching (PM) process, in order to improve predictive accuracy and reduce the over-prediction tendency of HQPF. To evaluate the improved HQPF, we verified its predictive performance on heavy rainfall cases caused by the Changma front from August 27, 2021 to September 3, 2021. The improved HQPF showed significantly better prediction accuracy for rainfall below 10 mm and a reduced over-prediction tendency, predicting a likelihood of occurrence and a rainfall area similar to the observations.
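
As a rough illustration of the Probability Matching (PM) idea mentioned in this abstract, the sketch below implements one commonly described generic variant: the ensemble mean supplies the spatial ranking of grid points, while the pooled member values supply the amplitudes. It is not the HQPF procedure from the paper; the function name `probability_matched_mean`, the flattened-field representation, and the toy values are all assumptions made for this example.

```python
def probability_matched_mean(members):
    """Generic probability-matched mean (illustrative sketch, not the paper's method).

    members: list of ensemble members, each a flattened rainfall field
             (list of floats) of the same length.
    """
    n_members = len(members)
    n_points = len(members[0])

    # Ensemble mean at every grid point: gives the spatial pattern.
    ens_mean = [sum(m[i] for m in members) / n_members for i in range(n_points)]

    # Pool all member values, sort from largest to smallest, and keep
    # every n_members-th value so that exactly n_points amplitudes remain.
    pooled = sorted((v for m in members for v in m), reverse=True)
    amplitudes = pooled[::n_members]

    # Rank grid points by ensemble mean (largest first) and assign the
    # amplitudes in the same rank order.
    order = sorted(range(n_points), key=lambda i: ens_mean[i], reverse=True)
    matched = [0.0] * n_points
    for rank, idx in enumerate(order):
        matched[idx] = amplitudes[rank]
    return matched


if __name__ == "__main__":
    # Two hypothetical members on a three-point grid (values in mm).
    toy_members = [[0.0, 10.0, 2.0], [0.0, 2.0, 8.0]]
    print(probability_matched_mean(toy_members))
```

In this toy case the plain ensemble mean would smooth the peak to 6 mm, whereas the matched field keeps the 10 mm maximum at the grid point with the highest mean, which is the motivation usually given for PM.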

A Study on the Prediction Models of Used Car Prices Using Ensemble Model And SHAP Value: Focus on Feature of the Vehicle Type (앙상블 모델과 SHAP Value를 활용한 국내 중고차 가격 예측 모델에 관한 연구: 차종 특성을 중심으로)

  • Seungjun Yim;Joungho Lee;Choonho Ryu
    • Journal of Service Research and Studies / v.14 no.1 / pp.27-43 / 2024
  • The market share of online platform services in the used car market continues to expand. These services provide users with vehicle specifications, accident history, inspection details, detailed options, and prices of used cars. The SUV share of the domestic automobile market is expected to exceed 50% in 2023, and sales of hybrid vehicles have doubled compared to the previous year. These vehicle types are also gaining popularity in the used car market. Prior research has proposed used car price prediction models by applying machine learning models to all vehicles or to vehicles by brand. However, although the popularity of SUV and hybrid vehicles in the domestic market continues to rise, it was difficult to find a study that proposed a used car price prediction model for these vehicle types. This study selects a used car price prediction model for each vehicle type using vehicle specifications and options for sedans, SUVs, and hybrid vehicles produced by domestic brands. After features were selected with a Lasso regression model, the ensemble models were run sequentially on the same samples, and the best model for each vehicle type was selected. As a result, the CBR model was selected as the best model for all vehicle types, and the contribution and direction of the features were confirmed by visualizing Tree SHAP values for the best model of each vehicle type. This study is expected to offer sales staff who use online platform services a used car price prediction model by vehicle type, to confirm the attribution and direction of features, and to help resolve problems caused by information asymmetry between them.

Prediction of arrhythmia using multivariate time series data (다변량 시계열 자료를 이용한 부정맥 예측)

  • Lee, Minhai;Noh, Hohsuk
    • The Korean Journal of Applied Statistics / v.32 no.5 / pp.671-681 / 2019
  • Studies on predicting arrhythmia using machine learning have been actively conducted as the number of arrhythmia patients increases. Existing studies have predicted arrhythmia based on multivariate data of feature variables extracted from RR interval data at a specific time point. In this study, we consider that the pattern of changes in heart state over time can be important information for arrhythmia prediction. Therefore, we investigate the usefulness of predicting arrhythmia with multivariate time series data obtained by extracting and accumulating the multivariate vectors of the feature variables at various time points. Using the 1-nearest neighbor classification method and its ensemble for comparison, we confirm that the method based on multivariate time series data can achieve better classification performance than the method based on multivariate data, provided that an appropriate time series distance function is selected.
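
A minimal sketch of 1-nearest neighbor classification on multivariate time series follows. The abstract does not name the distance functions it compared, so dynamic time warping (DTW) is used here purely as an assumed example of a time series distance; the function names, the per-variable summation, and the toy data are hypothetical, and the ensemble variant is not shown.

```python
from math import inf


def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D sequences."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


def multivariate_distance(x, y, dist=dtw_distance):
    """Sum of per-variable distances; x and y are lists of 1-D sequences."""
    return sum(dist(xv, yv) for xv, yv in zip(x, y))


def predict_1nn(train, query, dist=dtw_distance):
    """Return the label of the training series closest to the query.

    train: list of (series, label) pairs, where each series is a list of
           per-variable sequences.
    """
    best_label, best_dist = None, inf
    for series, label in train:
        d = multivariate_distance(series, query, dist)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label


if __name__ == "__main__":
    # Hypothetical two-variable series; values are illustrative only.
    train = [
        ([[0, 0, 1, 0], [1, 1, 1, 1]], "normal"),
        ([[0, 3, 3, 0], [2, 2, 2, 2]], "arrhythmia"),
    ]
    query = [[0, 0, 0, 1, 0], [1, 1, 1, 1, 1]]
    print(predict_1nn(train, query))
```

Passing a different function as `dist` (for example a pointwise Euclidean distance on equal-length series) is how the choice of distance function, which the abstract identifies as decisive, could be compared.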

Comparative Analysis of Traffic Accident Severity of Two-Wheeled Vehicles Using XGBoost (XGBoost를 활용한 이륜자동차 교통사고 심각도 비교분석)

  • Kwon, Cheol woo;Chang, Hyun ho
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.20 no.4 / pp.1-12 / 2021
  • The emergence of the COVID-19 pandemic has resulted in a sharp increase in the number of two-wheeled vehicle traffic accidents, prompting numerous efforts for their prevention. This study applied XGBoost to determine the factors that affect the severity of two-wheeled vehicle traffic accidents, by examining data collected over the past 10 years and analyzing the influence of each factor. Among the factors assessed, signal violations had by far the greatest effect on accident severity, followed by the age group of drivers (60s or older), factors pertaining only to the car, and centerline infringement. Based on the research results, a reasonable legal reform plan was proposed to prevent serious traffic accidents and strengthen safety management of two-wheeled vehicles.

Evaluation of Resilience in terms of Hydropower Reservoirs Operation with Climate Change (기후변화 시나리오에 따른 발전용댐의 운영측면 회복탄력성 평가)

  • Kim, Dong Hyun;Yu, Hyeong-Ju;Kim, Jong-Ho;Lee, Seung Oh
    • Proceedings of the Korea Water Resources Association Conference / 2022.05a / pp.337-337 / 2022
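  • According to the Korean Climate Change Assessment Report, the frequency and intensity of heavy rainfall have increased steadily since the late 1990s, and the 2020 flood can be seen as an anticipated event in which those concerns became reality. As the 2020 flood showed, the operation of domestic dam facilities, which directly hold precipitation and river flow, is expected to play an increasingly important role against the growing risks of climate change. Hydropower dams built for a single purpose are located in the same river basins as various water resource facilities such as multipurpose dams and flood control dams, so their operation should also change according to climate change scenarios. Changes in conditions, such as the 2020 agreement on the multipurpose use of hydropower dams, have raised expectations for the role of hydropower dams in water resource utilization. This study therefore proposes operation plans for hydropower dams under climate change scenarios from a resilience perspective. Climate change was considered using the results of 18 GCMs provided by the CMIP6 database, and 100 ensembles were generated for three future periods. These data were used to predict dam inflow with an LSTM-based model. The inflow predictions were evaluated to have NSE values of 0.77 to 0.89. Finally, dam operation was simulated according to the predicted inflows, which increase under the climate change scenarios, and resilience, power generation, flood risk, and related measures were evaluated. As a result, an operation plan that maintains resilience from the perspective of power production was proposed, and it was confirmed to increase power production while contributing to flood control and water supply. If a detailed downstream flood-risk assessment according to discharge is carried out in parallel in the future, it is expected that optimal operating criteria for hydropower dams can be presented for each climate change scenario.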
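
For readers unfamiliar with the NSE score reported in this abstract, the Nash-Sutcliffe efficiency is conventionally defined as below. The symbols are notation chosen for this note, not taken from the abstract: $Q_{\mathrm{obs},t}$ is the observed inflow at time step $t$, $Q_{\mathrm{sim},t}$ the predicted inflow, and $\overline{Q}_{\mathrm{obs}}$ the mean of the observations over $T$ time steps.

```latex
\mathrm{NSE} = 1 - \frac{\sum_{t=1}^{T} \left( Q_{\mathrm{obs},t} - Q_{\mathrm{sim},t} \right)^{2}}
                        {\sum_{t=1}^{T} \left( Q_{\mathrm{obs},t} - \overline{Q}_{\mathrm{obs}} \right)^{2}}
```

A value of 1 indicates a perfect match, and a value of 0 means the prediction is no better than always predicting the observed mean, so the reported range of 0.77 to 0.89 lies well above that baseline.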

Infrastructure Anomaly Analysis for Data-center Failure Prevention: Based on RRCF and Prophet Ensemble Analysis (데이터센터 장애 예방을 위한 인프라 이상징후 분석: RRCF와 Prophet Ensemble 분석 기반)

  • Hyun-Jong Kim;Sung-Keun Kim;Byoung-Whan Chun;Kyong-Bog, Jin;Seung-Jeong Yang
    • The Journal of Bigdata / v.7 no.1 / pp.113-124 / 2022
  • Various methods using machine learning and big data have been applied to prevent failures in data centers. However, approaches that reference performance indicators of individual equipment, or that do not consider the infrastructure operating environment, have many limitations in practical use. In this study, the performance indicators of individual infrastructure equipment are monitored in an integrated manner, and the indicators of the various equipment are segmented and graded to produce a single numerical value. Data pre-processing was based on experience in infrastructure operation, and an ensemble of RRCF (Robust Random Cut Forest) analysis and a Prophet analysis model produced reliable results in detecting anomalies. A failure analysis system was implemented for ease of use by data center operators. It can support a preemptive response to data center failures and indicate an appropriate tuning time.

Estimation of Chlorophyll-a Concentration in Nakdong River Using Machine Learning-Based Satellite Data and Water Quality, Hydrological, and Meteorological Factors (머신러닝 기반 위성영상과 수질·수문·기상 인자를 활용한 낙동강의 Chlorophyll-a 농도 추정)

  • Soryeon Park;Sanghun Son;Jaegu Bae;Doi Lee;Dongju Seo;Jinsoo Kim
    • Korean Journal of Remote Sensing / v.39 no.5_1 / pp.655-667 / 2023
  • Algal bloom outbreaks are frequently reported around the world, and serious water pollution problems arise every year in Korea. It is necessary to protect the aquatic ecosystem through continuous management and rapid response. Many studies using satellite images are being conducted to estimate the concentration of chlorophyll-a (Chl-a), an indicator of algal bloom occurrence. However, machine learning models have recently been used because it is difficult to accurately calculate Chl-a due to spectral characteristics and atmospheric correction errors that change depending on the water system. It is necessary to consider the factors affecting algal bloom as well as the satellite spectral index. Therefore, this study constructed a dataset by combining water quality, hydrological, and meteorological factors with Sentinel-2 images. The representative ensemble models random forest and extreme gradient boosting (XGBoost) were used to predict the concentration of Chl-a in eight weirs located on the Nakdong River over the past five years. The coefficient of determination (R2), root mean square error (RMSE), and mean absolute error (MAE) were used as model evaluation indicators, and XGBoost achieved an R2 of 0.80, an RMSE of 6.612, and an MAE of 4.457. Shapley additive explanations (SHAP) analysis showed that water quality factors, suspended solids, biochemical oxygen demand, dissolved oxygen, and the band ratio using red edge bands were of high importance in both models. Using a variety of input data was confirmed to help improve model performance, and the approach appears applicable to domestic and international algal bloom detection.
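
The three evaluation indicators named in this abstract can be computed as in the following minimal sketch. The function name `regression_metrics` and the sample values are hypothetical and unrelated to the paper's data; the formulas are the standard definitions of R2, RMSE, and MAE.

```python
from math import sqrt


def regression_metrics(y_true, y_pred):
    """Return R2, RMSE, and MAE for paired observed and predicted values."""
    n = len(y_true)
    mean_true = sum(y_true) / n

    # Residual and total sums of squares.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)

    return {
        "r2": 1.0 - ss_res / ss_tot,  # coefficient of determination
        "rmse": sqrt(ss_res / n),     # root mean square error
        "mae": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
    }


if __name__ == "__main__":
    # Illustrative values only.
    observed = [1.0, 2.0, 3.0, 4.0]
    predicted = [1.0, 2.0, 3.0, 6.0]
    print(regression_metrics(observed, predicted))
```

RMSE and MAE are in the units of the target variable, while R2 is unitless; because RMSE squares the residuals, a single large miss (as in the toy data) raises it more than it raises MAE.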