• Title/Summary/Keyword: Ensemble models

Comparative analysis of model performance for predicting the customer of cafeteria using unstructured data

  • Seungsik Kim;Nami Gu;Jeongin Moon;Keunwook Kim;Yeongeun Hwang;Kyeongjun Lee
    • Communications for Statistical Applications and Methods / v.30 no.5 / pp.485-499 / 2023
  • This study aimed to predict the number of meals served in a group cafeteria using machine learning methodology. Features of the menu were created through the Word2Vec methodology and clustering, and a stacking ensemble model was constructed using Random Forest, Gradient Boosting, and CatBoost as sub-models. Results showed that CatBoost had the best performance with the ensemble model showing an 8% improvement in performance. The study also found that the date variable had the greatest influence on the number of diners in a cafeteria, followed by menu characteristics and other variables. The implications of the study include the potential for machine learning methodology to improve predictive performance and reduce food waste, as well as the removal of subjective elements in menu classification. Limitations of the research include limited data cases and a weak model structure when new menus or foreign words are not included in the learning data. Future studies should aim to address these limitations.

Enhancing Autonomous Vehicle RADAR Performance Prediction Model Using Stacking Ensemble (머신러닝 스태킹 앙상블을 이용한 자율주행 자동차 RADAR 성능 향상)

  • Si-yeon Jang;Hye-lim Choi;Yun-ju Oh
    • Journal of Internet Computing and Services / v.25 no.2 / pp.21-28 / 2024
  • Radar is an essential sensor component in autonomous vehicles, and the market for radar applications in this context is steadily expanding with a growing variety of products. In this study, we aimed to enhance the stability and performance of radar systems by developing and evaluating a radar performance prediction model that can predict radar defects. We selected seven machine learning and deep learning algorithms and trained the model with a total of 49 input data types. Ultimately, when we employed an ensemble of 17 models, it exhibited the highest performance. We anticipate that these research findings will assist in predicting product defects at the production stage, thereby maximizing production yield and minimizing the costs associated with defective products.

Prediction of Residual Resistance Coefficient of Low-Speed Full Ships Using Hull Form Variables and Machine Learning Approaches (선형변수 기계학습 기법을 활용한 저속비대선의 잉여저항계수 추정)

  • Kim, Yoo-Chul;Yang, Kyung-Kyu;Kim, Myung-Soo;Lee, Young-Yeon;Kim, Kwang-Soo
    • Journal of the Society of Naval Architects of Korea / v.57 no.6 / pp.312-321 / 2020
  • In this study, machine learning techniques were applied to predict the residual resistance coefficient (Cr) of low-speed full ships. The methods used were ridge regression, support vector regression, random forest, a neural network, and an ensemble of these models. Nineteen hull form variables served as inputs. The hull form variables and Cr data came from 139 hull forms in the KRISO database. 80% of the data were used for training and the rest for validation. Some nonlinear models overfitted, and the ensemble model gave better results than the individual models.
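
The 80/20 split described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the fixed seed, and the integer placeholder IDs are hypothetical, and combining models by a plain average is an assumption, since the abstract does not say how the ensemble was formed.

```python
import random


def split_train_validation(samples, train_fraction=0.8, seed=0):
    """Shuffle the samples and split them into training and validation lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


def average_ensemble(model_predictions):
    """Combine per-model prediction lists by averaging them element-wise.

    Plain averaging is an assumption made for illustration only.
    """
    n_models = len(model_predictions)
    return [sum(values) / n_models for values in zip(*model_predictions)]


# 139 hull forms, as in the abstract; the integer IDs are placeholders.
hull_ids = list(range(139))
train_ids, validation_ids = split_train_validation(hull_ids)
```

With 139 samples, this split yields 111 training cases and 28 validation cases.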

A Study on the Regional Economic Multiplier Impacts of Jeju International Wind Ensemble Festival (제주국제관악제의 지역경제파급효과 분석에 관한 연구)

  • Ko, Hye-young;Yang, Jeong-Cheol;Lim, Jung-Hyun;Hwang, Kyung-Soo
    • Journal of the Korea Academia-Industrial cooperation Society / v.20 no.9 / pp.323-332 / 2019
  • The purpose of this study is to measure the effects of the Jeju International Wind Ensemble Festival on the regional economy. To examine the festival's economic ripple effects, we analyze its impact on the local economy using two regional (Jeju-National) industry-related models based on the 2013 Jeju Region Input and Output Table. We also compare 2017 and 2018 to see how the festival is growing and affecting the regional economy. In the analysis of production-inducing and value-added effects on the industries induced by investment expenditures for the festival, the production-inducing effect increased 2.1-fold, from 9.05 billion won in 2017 to 18.7 billion won in 2018. The value-added effect increased 2.2-fold, from nearly 4.3 billion won in 2017 to nearly 9.2 billion won in 2018. The analysis shows that the festival contributes greatly to increasing local residents' income. To enhance these effects, policies that link culture and tourism in Jeju are needed.

Characteristics of Signal-to-Noise Paradox and Limits of Potential Predictive Skill in the KMA's Climate Prediction System (GloSea) through Ensemble Expansion (기상청 기후예측시스템(GloSea)의 앙상블 확대를 통해 살펴본 신호대잡음의 역설적 특징(Signal-to-Noise Paradox)과 예측 스킬의 한계)

  • Yu-Kyung Hyun;Yeon-Hee Park;Johan Lee;Hee-Sook Ji;Kyung-On Boo
    • Atmosphere / v.34 no.1 / pp.55-67 / 2024
  • This paper provides a detailed introduction to the concepts of the Ratio of Predictable Component (RPC) and the Signal-to-Noise Paradox. We then explore the paradoxical features through a seasonal and regional analysis based on ensemble expansion in KMA's climate prediction system (GloSea). We also explain the ensemble generation method, with a specific focus on stochastic physics. This study identifies the predictability limits of our forecasting system and points to ways of improving it. On a global scale, RPC reaches a value of 1 when the ensemble is expanded to a maximum of 56 members, underlining the significance of ensemble expansion in the climate prediction system. The paradoxical feature of RPC exceeding 1 is particularly evident in the winter North Atlantic and the summer North Pacific. Over the Siberian continent, predictability is notably low and remains low even as the ensemble size increases. This region, characterized by a low RPC, is considered challenging for reliable prediction, highlighting the need for further improvement in the model and in the initialization of land processes. In contrast, the tropical ocean demonstrates robust predictability while maintaining an RPC of 1. This study draws attention to the limits of potential predictability within the climate prediction system and emphasizes the need to leverage predictable signals with high RPC values. We also underscore the importance of continued efforts to improve models and initializations to overcome these limitations.
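
To make the RPC idea concrete, the sketch below uses one commonly cited estimator: the correlation between the ensemble mean and observations, divided by the square root of the ratio of ensemble-mean variance to average member variance. This is an assumption about the definition, not taken from the abstract. The function names and the toy numbers are hypothetical, and nothing here reproduces the GloSea analysis.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def _variance(values):
    m = _mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def _correlation(a, b):
    mean_a, mean_b = _mean(a), _mean(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm_a = sum((x - mean_a) ** 2 for x in a)
    norm_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(norm_a * norm_b)


def ratio_of_predictable_components(members, observations):
    """Estimate RPC from a list of member time series and one observed series.

    Assumed estimator: r(ensemble mean, obs) / sqrt(signal var / total var).
    """
    ensemble_mean = [_mean(step) for step in zip(*members)]
    signal_variance = _variance(ensemble_mean)
    total_variance = _mean([_variance(member) for member in members])
    pc_obs = _correlation(ensemble_mean, observations)
    pc_model = math.sqrt(signal_variance / total_variance)
    return pc_obs / pc_model


# Toy data: two noisy members whose mean tracks the observations exactly.
toy_members = [[2.0, 1.0, 4.0, 3.0], [0.0, 3.0, 2.0, 5.0]]
toy_observations = [1.0, 2.0, 3.0, 4.0]
toy_rpc = ratio_of_predictable_components(toy_members, toy_observations)
```

In the toy data the members are noisier than their mean, so the estimated RPC exceeds 1. That is the signature the abstract calls paradoxical: the observations look more predictable than the model's own members.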

Future Change Using the CMIP5 MME and Best Models: II. The Thermodynamic and Dynamic Analysis on Near and Long-Term Future Climate Change over East Asia (CMIP5 MME와 Best 모델의 비교를 통해 살펴본 미래전망: II. 동아시아 단·장기 미래기후전망에 대한 열역학적 및 역학적 분석)

  • Kim, Byeong-Hee;Moon, Hyejin;Ha, Kyung-Ja
    • Atmosphere / v.25 no.2 / pp.249-260 / 2015
  • As an extension of Moon et al. (2014), this study investigates the thermodynamic and dynamic aspects of near-term (2025~2049) and long-term (2075~2099) future climate change over East Asia (EA) and the Korean Peninsula, comparing the historical run (1979~2005) with the Representative Concentration Pathway (RCP) 4.5 run of 20 coupled models from phase five of the Coupled Model Intercomparison Project (CMIP5). Moon et al. (2014) noted that the 20-model multi-model ensemble (MME) and the best five models' multi-model ensemble (B5MME) show different increasing trends of precipitation during the boreal winter and summer, despite similar increasing trends of surface air temperature, especially over the Korean Peninsula. To compare the MME and B5MME, the dynamic factor (the convergence of mean moisture by anomalous wind) and the thermodynamic factor (the convergence of anomalous moisture by mean wind) of moisture flux convergence are analyzed. Over EA, the dynamic factor causes the lower increasing trend of precipitation in B5MME than in the MME during both the boreal winter and summer. Over the Korean Peninsula, however, the dynamic factor causes the lower increasing trend of precipitation in B5MME during the boreal winter, whereas the thermodynamic factor causes the higher increasing trend of precipitation in B5MME during the boreal summer. Over the Korean Peninsula, therefore, the difference between the MME and B5MME in precipitation change is driven by the dynamic factor in the boreal winter and by the thermodynamic factor in the boreal summer.

Climate Change Impact Assessment of Abies nephrolepis (Trautv.) Maxim. in Subalpine Ecosystem using Ensemble Habitat Suitability Modeling (서식처 적합모형을 적용한 고산지역 분비나무의 기후변화 영향평가)

  • Choi, Jae-Yong;Lee, Sang-Hyuk
    • Journal of the Korean Society of Environmental Restoration Technology / v.21 no.1 / pp.103-118 / 2018
  • Ecosystems in subalpine regions are recognized as vulnerable to climate change because rainfall and the possibility of flora migration are very low owing to the topography of these regions. In this context, a habitat niche was formulated for a representative subalpine tree species in order to understand the effects of climate change on alpine tree ecosystems. Current potential habitats were modeled, and their future changes were projected under climate change scenarios. Based on the growth conditions and environmental characteristics of the habitats, the study sought to identify direct and indirect causes of the habitat reduction of Abies nephrolepis. Diverse model algorithms that explain the relationship between species occurrence and habitat environment were reviewed to construct environmental data suitable for six models (GLM, GAM, RF, MaxEnt, ANN, and SVM). Weights determined through TSS were applied to the six models to form an ensemble, in an attempt to minimize model uncertainty. The current climate was defined as the 30-year average (1981~2010), and the HadGEM-RA model was used to generate bioclimatic variables for the RCP 4.5 and 8.5 scenarios in the near and far future. The combined model results for the alpine tree species indicated that a total of eight national parks, such as Mt. Seorak, Odaesan, and Hallasan, would be mainly affected by climate change. Changes in the Baekdudaegan reserves were also analyzed, and A. nephrolepis was predicted to be affected the most under RCP 8.5. Because the analysis identifies the degree of climate change impact on subalpine trees of very high conservation value in the Korean peninsula, the results are expected to be useful for surveys of biological species there and for restoration and conservation strategies that take climate change into account.
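
The TSS-weighted ensemble mentioned in this abstract can be sketched as below. TSS (true skill statistic) is sensitivity plus specificity minus 1. Making the weights proportional to TSS is an assumption, since the abstract does not give the weighting formula, and the function names and numbers are hypothetical.

```python
def true_skill_statistic(tp, fn, tn, fp):
    """TSS = sensitivity + specificity - 1, from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1.0


def tss_weighted_ensemble(predictions, tss_scores):
    """Average per-model suitability values with weights proportional to TSS.

    `predictions` maps model name -> list of suitability values per grid cell;
    `tss_scores` maps model name -> TSS. Proportional weighting is assumed.
    """
    total = sum(tss_scores[name] for name in predictions)
    weights = {name: tss_scores[name] / total for name in predictions}
    n_cells = len(next(iter(predictions.values())))
    return [
        sum(weights[name] * predictions[name][i] for name in predictions)
        for i in range(n_cells)
    ]


# Hypothetical suitability maps (two grid cells) and TSS scores for two models.
example_predictions = {"GLM": [1.0, 0.0], "RF": [0.0, 1.0]}
example_tss = {"GLM": 0.6, "RF": 0.2}
example_ensemble = tss_weighted_ensemble(example_predictions, example_tss)
```

Here the model with the higher TSS contributes three quarters of the ensemble value in each cell.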

Validations of Typhoon Intensity Guidance Models in the Western North Pacific (북서태평양 태풍 강도 가이던스 모델 성능평가)

  • Oh, You-Jung;Moon, Il-Ju;Kim, Sung-Hun;Lee, Woojeong;Kang, KiRyong
    • Atmosphere / v.26 no.1 / pp.1-18 / 2016
  • Eleven tropical cyclone (TC) intensity guidance models for the western North Pacific were validated over 2008~2014 using various analyses by forecast lead time, year, month, intensity, rapid intensity change, track, and geographical area, with an additional focus on TCs that influenced the Korean peninsula. From an evaluation using mean absolute error and correlation coefficients for maximum wind speed forecasts up to 72 h, we found that the Hurricane Weather Research and Forecasting model (HWRF) outperforms all others overall, although the Global Forecast System (GFS), the Typhoon Ensemble Prediction System of Japan Meteorological Agency (TEPS), and the Korean version of the Weather Research and Forecasting model (KWRF) also show good performance at some lead times. In particular, HWRF shows the highest performance in predicting the intensity of strong TCs above Category 3, which may be attributed to its highest spatial resolution (~3 km). The Navy Operational Global Prediction Model (NOGAPS) and GFS were the most improved models during 2008~2014. For initial intensity error, two Japanese models, the Japan Meteorological Agency Global Spectral Model (JGSM) and TEPS, had the smallest errors. In track forecasts, the European Centre for Medium-Range Weather Forecasts (ECMWF) model and the recent GFS model outperformed the others. The present results have significant implications, providing basic information for operational forecasters and for the development of ensemble or consensus prediction systems.
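
The two verification scores named in this abstract, mean absolute error and the correlation coefficient, can be computed as in the sketch below. The wind-speed values are invented for illustration and are not data from the study; the function names are likewise hypothetical.

```python
import math


def mean_absolute_error(forecast, observed):
    """Average absolute difference between forecast and observed values."""
    return sum(abs(f - o) for f, o in zip(forecast, observed)) / len(forecast)


def pearson_correlation(forecast, observed):
    """Pearson correlation coefficient between two equal-length series."""
    mean_f = sum(forecast) / len(forecast)
    mean_o = sum(observed) / len(observed)
    cov = sum((f - mean_f) * (o - mean_o) for f, o in zip(forecast, observed))
    spread_f = sum((f - mean_f) ** 2 for f in forecast)
    spread_o = sum((o - mean_o) ** 2 for o in observed)
    return cov / math.sqrt(spread_f * spread_o)


# Hypothetical maximum wind speeds (m/s) at three verification times.
forecast_wind = [30.0, 40.0, 50.0]
observed_wind = [32.0, 40.0, 46.0]
mae = mean_absolute_error(forecast_wind, observed_wind)
corr = pearson_correlation(forecast_wind, observed_wind)
```

A lower MAE and a correlation closer to 1 both indicate better intensity guidance, but they measure different things: MAE captures error magnitude, while correlation captures how well the forecast follows the observed variation.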

Incremental Ensemble Learning for The Combination of Multiple Models of Locally Weighted Regression Using Genetic Algorithm (유전 알고리즘을 이용한 국소가중회귀의 다중모델 결합을 위한 점진적 앙상블 학습)

  • Kim, Sang Hun;Chung, Byung Hee;Lee, Gun Ho
    • KIPS Transactions on Software and Data Engineering / v.7 no.9 / pp.351-360 / 2018
  • The LWR (Locally Weighted Regression) model, traditionally a lazy learning model, produces a prediction for a given input, the query point. It is a regression equation fitted over a short interval, obtained by learning that gives higher weights to data closer to the query point. We study an incremental ensemble learning approach for LWR, a form of lazy, memory-based learning. The proposed method sequentially generates LWR models over time and integrates them using a genetic algorithm to obtain a solution at a specific query point. A weakness of existing LWR approaches is that multiple LWR models can be generated depending on the indicator function and the data sample selected, and the quality of the predictions varies with the model chosen. However, no research has addressed the selection or combination of multiple LWR models. In this study, after generating an initial LWR model from the indicator function and the sample data set, we iterate an evolutionary learning process to obtain a proper indicator function and assess the LWR models on the other sample data sets to overcome data set bias. We adopt an eager learning method to generate and store LWR models gradually as data are generated for all sections. To obtain a prediction at a specific point in time, an LWR model is generated from the newly generated data within a predetermined interval and then combined with the existing LWR models in that section using a genetic algorithm. The proposed method shows better results than combining multiple LWR models by simple averaging. The results are also compared with predictions from multiple regression analysis on real data, such as hourly traffic volume in a specific area and hourly sales at a highway rest area.
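
A minimal sketch of locally weighted regression, together with the simple-average baseline the abstract compares against, is given below. The Gaussian kernel, the bandwidth values, and the function names are assumptions made for illustration; the paper's indicator function and its genetic-algorithm combination are not reproduced here.

```python
import math


def lwr_predict(xs, ys, query, bandwidth=1.0):
    """Predict y at `query` with a locally weighted straight-line fit.

    Points closer to the query get larger weights (Gaussian kernel, assumed).
    """
    weights = [math.exp(-((x - query) ** 2) / (2.0 * bandwidth ** 2)) for x in xs]
    total = sum(weights)
    mean_x = sum(w * x for w, x in zip(weights, xs)) / total
    mean_y = sum(w * y for w, y in zip(weights, ys)) / total
    s_xx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    s_xy = sum(w * (x - mean_x) * (y - mean_y) for w, x, y in zip(weights, xs, ys))
    slope = s_xy / s_xx if s_xx > 0 else 0.0
    return mean_y + slope * (query - mean_x)


def simple_average(predictions):
    """Baseline combination: the unweighted mean of several model outputs."""
    return sum(predictions) / len(predictions)


# Toy data on an exact line; three LWR models differ only in bandwidth.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0 * x + 1.0 for x in xs]
candidates = [lwr_predict(xs, ys, 2.5, bandwidth=b) for b in (0.5, 1.0, 2.0)]
combined = simple_average(candidates)
```

On real, noisy data the candidate models would disagree, which is the situation where the choice of combination method (simple averaging versus a learned weighting) starts to matter.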

Species Dependence of Neurofilament Structures: Monte Carlo Simulation studies of Residue-Based Neurofilament Models

  • Kim, Seon-Ok
    • Proceeding of EDISON Challenge / 2014.03a / pp.225-235 / 2014
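  • Neurofilaments (NFs), the type IV class among the six types of intermediate filaments, are cytoskeletal filaments found in neurons and consist of three proteins of different molecular mass: heavy NF (NF-H), medium NF (NF-M), and light NF (NF-L). The NF side arms have been thought to play an important role in regulating interfilament spacing and axonal caliber. To identify the role of each protein, many studies have examined the morphology and structure of isolated NFs; the structural properties of NFs can be understood through the charge distribution of the Lys-Ser-Pro (KSP) repeats, which depends on the degree of phosphorylation in the tail region of the NF side arm. Although NFs have been studied extensively, that work has been limited to humans. In this study, we therefore use the given amino acid sequences and the NF-H:NF-M:NF-L ratio of each species in constant-NVT ensemble MC simulations to investigate the structural properties of NFs in species other than humans as well.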