• Title/Abstract/Keywords: Ensemble Techniques

Study on Predicting the Designation of Administrative Issue in the KOSDAQ Market Based on Machine Learning Based on Financial Data (머신러닝 기반 KOSDAQ 시장의 관리종목 지정 예측 연구: 재무적 데이터를 중심으로)

  • Yoon, Yanghyun;Kim, Taekyung;Kim, Suyeong
    • Asia-Pacific Journal of Business Venturing and Entrepreneurship / Vol. 17, No. 1 / pp. 229-249 / 2022
  • This paper investigates machine learning models for predicting the designation of administrative issues in the KOSDAQ market using various techniques. When a company in the Korean stock market is designated as an administrative issue, the market treats the event itself as negative information, causing losses to the company and its investors. The purpose of this study is to evaluate alternative methods for developing an artificial intelligence service that uses companies' financial ratios to detect the possibility of designation early and to help investors manage portfolio risk. The independent variables are 21 financial ratios representing profitability, stability, activity, and growth. Financial data were sampled for administrative-issue and non-administrative-issue stocks from 2011 to 2020, the period in which K-IFRS was applied. Logistic regression, decision tree, support vector machine, random forest, and LightGBM are used to predict the designation. According to the results, LightGBM is the best prediction model with 82.73% classification accuracy, and the decision tree has the lowest accuracy at 71.94%. Among the top three variables by importance in the decision tree-based learning models, the financial variables common to every model are ROE (net profit) and capital stock turnover ratio, which are therefore relatively important in the designation of administrative issues. Overall, the ensemble learning models showed higher predictive performance than the single learning models.
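
The observation that ensemble learners outperformed single models rests on aggregating many individual predictions. As a minimal sketch (not the authors' pipeline), the following shows hard majority voting, the aggregation rule classically used by bagging-style ensembles such as random forest. The three rule-based "models", the field names `roe`, `debt_ratio`, and `capital_turnover`, and all thresholds are hypothetical and chosen only for illustration.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the most common label among the individual model predictions."""
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical weak classifiers: each flags a firm (1) or not (0) from one ratio.
def by_roe(firm):
    return 1 if firm["roe"] < 0.0 else 0

def by_debt(firm):
    return 1 if firm["debt_ratio"] > 2.0 else 0

def by_turnover(firm):
    return 1 if firm["capital_turnover"] < 0.5 else 0

MODELS = [by_roe, by_debt, by_turnover]

def ensemble_predict(firm):
    """Combine the individual votes into one ensemble prediction."""
    return majority_vote([model(firm) for model in MODELS])

# Hypothetical firm: two of the three rules flag it, so the ensemble flags it.
firm = {"roe": -0.12, "debt_ratio": 1.4, "capital_turnover": 0.3}
print(ensemble_predict(firm))
```

A single rule can be wrong for a given firm; the vote only fails when a majority of the rules fail together, which is the intuition behind the ensemble advantage reported above.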

A Study on the Prediction of Rock Classification Using Shield TBM Data and Machine Learning Classification Algorithms (쉴드 TBM 데이터와 머신러닝 분류 알고리즘을 이용한 암반 분류 예측에 관한 연구)

  • Kang, Tae-Ho;Choi, Soon-Wook;Lee, Chulho;Chang, Soo-Ho
    • Tunnel and Underground Space / Vol. 31, No. 6 / pp. 494-507 / 2021
  • With the increasing use of TBMs, research has recently been conducted in Korea that applies machine learning techniques to TBM data to predict the ground ahead of the TBM, the replacement cycle of disc cutters, and the TBM advance rate. In this study, the rock characteristics at slurry shield TBM sites were classified by combining traditional rock classification techniques with machine learning techniques widely used in various fields, applied to machine data recorded during TBM excavation. The rock characteristic classification criteria were RQD, uniaxial compressive strength, and elastic wave velocity; the rock condition for each criterion was divided into three classes, class 0 (good), 1 (normal), and 2 (poor), and six classification algorithms were trained. As a result, the ensemble models showed good performance, and the LightGBM model, which excelled in both learning speed and learning performance, was found to be optimal for the ground at the target site. The classification models for the three rock characteristics set in this study are expected to provide rock conditions for sections where ground information is unavailable, which will help during excavation work.

Development of an Automatic Tempo-Regulating Smartphone Application Using MIDI Playback Functions For Musical Instrument Practice (스마트폰 MIDI 재생 기능을 활용한 속도 증가 악기 연습 애플리케이션 개발)

  • Shim, In-Sup
    • Journal of Korea Entertainment Industry Association / Vol. 13, No. 8 / pp. 143-150 / 2019
  • Playing musical instruments has long been a hobby enjoyed by many, whether amateur or professional. However, long and arduous practice is required to acquire the skills of a musical artist and truly enjoy the pleasure of playing. This repetitive and tedious practice is often a hindrance to learning a musical instrument, and numerous educators have put a lot of research and effort into making the process easier and more fun for students. In addition, various media practice tools are being developed to keep students engaged. The core elements of this content primarily include controlling the speed of backing tracks in accordance with the student's skill level and providing a backing ensemble that lets students enjoy playing. This paper studies and compares various MIDI playback techniques capable of controlling speed and pitch in smartphone applications. Modern applications of these techniques are seen in music education content as well as entertainment content. It also discusses the development and launch of Upbeat, a drum-loop metronome that automatically increases speed, implemented with a different technique on each smartphone operating system, Android OS and iOS.

Evaluation of Agro-Climatic Index Using Multi-Model Ensemble Downscaled Climate Prediction of CMIP5 (상세화된 CMIP5 기후변화전망의 다중모델앙상블 접근에 의한 농업기후지수 평가)

  • Chung, Uran;Cho, Jaepil;Lee, Eun-Jeong
    • Korean Journal of Agricultural and Forest Meteorology / Vol. 17, No. 2 / pp. 108-125 / 2015
  • The agro-climatic index is one way to assess the climate resources of a particular agricultural area with respect to agricultural production; it can be a key indicator of agricultural productivity because it provides the basic information needed to implement various farming techniques and practices and to estimate crop growth and yield from climate resources such as air temperature, solar radiation, and precipitation. However, the agro-climatic index can change, since the index is not absolute. Recently, many studies that consider the uncertainty of future climate change have been conducted using the multi-model ensemble (MME) approach, developing and improving dynamical and statistical downscaling of Global Climate Model (GCM) output. In this study, agro-climatic indices of the Korean Peninsula, namely growing degree days based on $5^{\circ}C$, plant period based on $5^{\circ}C$, crop period based on $10^{\circ}C$, and frost-free days, were calculated to assess the spatio-temporal variations and uncertainties of the indices under climate change; the downscaled historical (1976-2005) and near-future (2011-2040) RCP climate scenarios of AR5 were applied to the calculation. The results showed that the four agro-climatic indices calculated from nine individual GCMs as well as from the MME agreed with the indices calculated from observed data. It was confirmed that the MME, as well as each individual GCM, reproduced the past climate well in the four major rivers of South Korea (Han, Nakdong, Geum, and Seumjin and Yeoungsan). However, spatial downscaling still needs further improvement, since the agro-climatic indices of some individual GCMs varied differently from the observed indices in the spatial distribution across the four rivers. The four agro-climatic indices of the Korean Peninsula were expected to increase in the nine individual GCMs and the MME under future climate scenarios. The differences and uncertainties of the agro-climatic indices were not reduced by combining an unlimited number of models in the multi-model ensemble. Further research is still required, although the differences started to improve when three or four individual GCMs were combined in this study. The agro-climatic indices derived and evaluated in the study will serve as the baseline for assessing agro-climatic abnormal indices and agro-productivity indices in subsequent research.
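
To make the index calculation concrete, here is a minimal sketch of growing degree days with a 5 °C base, followed by a simple multi-model ensemble mean. The abstract does not state the exact formula, so the common definition (sum of daily mean temperature above the base, with colder days contributing zero) is assumed; the model names and temperature values are hypothetical, and the study may combine models differently.

```python
def growing_degree_days(daily_mean_temps_c, base_c=5.0):
    """Sum the daily mean temperature excess over the base; colder days add 0."""
    return sum(max(0.0, t - base_c) for t in daily_mean_temps_c)

def ensemble_mean(values):
    """Unweighted mean across models (one simple way to form an MME)."""
    return sum(values) / len(values)

# Hypothetical daily mean temperatures (deg C) from three downscaled models.
model_temps = {
    "gcm_a": [3.0, 6.0, 9.0, 12.0],
    "gcm_b": [4.0, 7.0, 10.0, 11.0],
    "gcm_c": [5.0, 5.0, 8.0, 13.0],
}

# Compute the index per model first, then average the index across models.
gdd_by_model = {name: growing_degree_days(t) for name, t in model_temps.items()}
mme_gdd = ensemble_mean(list(gdd_by_model.values()))

print(gdd_by_model)
print(mme_gdd)
```

The spread of `gdd_by_model` around `mme_gdd` is one simple way to look at the inter-model uncertainty that the abstract discusses.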

A Method for Spam Message Filtering Based on Lifelong Machine Learning (Lifelong Machine Learning 기반 스팸 메시지 필터링 방법)

  • Ahn, Yeon-Sun;Jeong, Ok-Ran
    • Journal of IKEEE / Vol. 23, No. 4 / pp. 1393-1399 / 2019
  • With the rapid growth of the Internet and the convenience of sending and receiving data, millions of indiscriminate advertising SMS messages are sent every day. Although spam words are still blocked manually, research on filtering spam in various ways has been active since machine learning emerged. However, spam words and patterns change constantly to avoid being filtered, so existing machine learning mechanisms cannot detect or adapt to new words and patterns. Recently, the concept of lifelong learning emerged to overcome these limitations by using existing knowledge to keep learning new knowledge continuously. In this paper, we propose a spam filtering method that uses an ensemble of naive Bayes, which is most commonly used in document classification, and Lifelong Machine Learning (LLML). We validate the performance of lifelong learning by applying the ELLA model together with the Naive Bayes classifier most commonly used in existing spam filters.

A Study on Injury Severity Prediction for Car-to-Car Traffic Accidents (차대차 교통사고에 대한 상해 심각도 예측 연구)

  • Ko, Changwan;Kim, Hyeonmin;Jeong, Young-Seon;Kim, Jaehee
    • The Journal of The Korea Institute of Intelligent Transport Systems / Vol. 19, No. 4 / pp. 13-29 / 2020
  • Automobiles have long been an essential part of daily life, but the social costs of car traffic accidents exceed 9% of the national budget of Korea. Hence, it is necessary to establish a prevention and response system for car traffic accidents. To present a model that can classify and predict the degree of injury in car traffic accidents, we used the big data analysis techniques of K-nearest neighbor, logistic regression, naive Bayes classifier, decision tree, and ensemble algorithms. The performance of the models was analyzed using nationwide traffic accident data from the past three years. In particular, considering the difference in the number of records among the injury severity levels, we down-sampled the groups with a large number of samples to enhance the classification accuracy of the models, and then verified the statistical significance of the models using ANOVA.
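
The down-sampling step mentioned above can be sketched as follows. This is an illustrative version under stated assumptions, not the authors' procedure: it randomly reduces every class to the size of the smallest class, and the record layout, the `severity` field, and the class sizes are hypothetical.

```python
import random
from collections import defaultdict

def downsample(records, label_key, seed=0):
    """Randomly shrink every class to the size of the smallest class."""
    by_class = defaultdict(list)
    for rec in records:
        by_class[rec[label_key]].append(rec)
    target = min(len(group) for group in by_class.values())
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, target))
    rng.shuffle(balanced)
    return balanced

# Hypothetical imbalanced data: 90 minor-injury records, 10 severe ones.
records = (
    [{"id": i, "severity": "minor"} for i in range(90)]
    + [{"id": i, "severity": "severe"} for i in range(90, 100)]
)
balanced = downsample(records, "severity")
print(len(balanced))
```

Down-sampling discards majority-class records, so it trades data volume for balance; whether that helps depends on how much information the discarded records carried.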

A Development of a Tailored Follow up Management Model Using the Data Mining Technique on Hypertension (데이터마이닝 기법을 활용한 맞춤형 고혈압 사후관리 모형 개발)

  • Park, Il-Su;Yong, Wang-Sik;Kim, Yu-Mi;Kang, Sung-Hong;Han, Jun-Tae
    • The Korean Journal of Applied Statistics / Vol. 21, No. 4 / pp. 639-647 / 2008
  • This study used knowledge discovery and data mining algorithms to develop a tailored hypertension follow-up management model, consisting of a hypertension care predictive model and a hypertension care compliance segmentation model, using the Korea National Health Insurance Corporation database (the insureds’ screening and health care benefit data). The study validated the predictive power of the data mining algorithms by comparing the performance of logistic regression, decision tree, and ensemble techniques. On the basis of internal and external validation, logistic regression performed best among the three techniques for the hypertension care predictive model, and the hypertension care compliance segmentation model was developed by decision tree analysis. This study identified several factors affecting the onset of hypertension using screening data. It is expected to contribute to the nation’s building of a hypertension follow-up management system in the near future by providing representative results on the onset and care of hypertension.

A Study on the Near Wake of a Square Cylinder Using Particle Image Velocimetry ( I )- Mean Flow Field - (PIV기법을 이용한 정사각 실린더의 근접후류에 관한 연구 (I) - 평균유동장 -)

  • Lee, Man-Bok;Kim, Gyeong-Cheon
    • Transactions of the Korean Society of Mechanical Engineers B / Vol. 25, No. 10 / pp. 1408-1416 / 2001
  • Mean flow fields in the near wake of a square cylinder have been studied experimentally using Particle Image Velocimetry (PIV). Ensemble-averaged velocity fields are successfully measured for the square cylinder wake, including the reverse flow region, which poses many difficulties for accurate measurement with conventional techniques. Experiments are performed at two free-stream velocities, $U_{\infty}$ = 1.27 m/s and 3.03 m/s. The corresponding Reynolds numbers based on the free-stream velocity and cylinder diameter are 1600 and 3900, respectively. The free-stream turbulence intensity is less than 1%, the blockage ratio (D/H) is 6.6%, and the aspect ratio (W/D) is 40. The effect of Reynolds number on the near wake of a square cylinder has been investigated using the global mean velocity and instantaneous velocity fields. The most striking feature is that the length of the recirculating region increases with increasing Reynolds number, the opposite of the trend observed in the circular cylinder wake over the same range of Reynolds number. For the higher Reynolds number, the mean velocity data agree well with existing data obtained at much higher Reynolds numbers, which reflects the general character of sharp-edged bluff body wakes.
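
As a quick consistency check on the quoted values, the Reynolds number based on free-stream velocity and cylinder diameter is assumed here to follow the standard definition, with $D$ the cylinder side length and $\nu$ the kinematic viscosity (neither is given in the abstract). With $D$ and $\nu$ fixed, the ratio of Reynolds numbers should match the ratio of velocities, and the rounded figures agree to within about 2%:

```latex
\[
Re = \frac{U_{\infty} D}{\nu}, \qquad
\frac{Re_2}{Re_1} = \frac{3900}{1600} \approx 2.44, \qquad
\frac{U_{\infty,2}}{U_{\infty,1}} = \frac{3.03}{1.27} \approx 2.39
\]
```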

Development of Hypertension Predictive Model (고혈압 발생 예측 모형 개발)

  • Yong, Wang-Sik;Park, Il-Su;Kang, Sung-Hong;Kim, Won-Joong;Kim, Kong-Hyun;Kim, Kwang-Kee;Park, No-Yai
    • Korean Journal of Health Education and Promotion / Vol. 23, No. 4 / pp. 13-28 / 2006
  • Objectives: This study used knowledge discovery and data mining algorithms to develop a hypertension predictive model for hypertension management using the Korea National Health Insurance Corporation database (the insureds' screening and health care benefit data). Methods: This study validated the predictive power of the data mining algorithms by comparing the performance of logistic regression, decision tree, and ensemble techniques. On the basis of internal and external validation, logistic regression performed best among the three techniques. Results: The logistic regression analysis suggested that the probability of hypertension was lower for females compared with males (OR=0.834); higher for persons aged 60 or above compared with those below 40 (OR=4.628); higher for obese persons compared with normal persons (OR=2.103); higher for persons with a high glucose level compared with normal persons (OR=1.086); higher for persons with a family history of hypertension compared with those without (OR=1.512); and higher for persons who periodically drank alcohol compared with those who did not (OR=1.037~1.291). Conclusions: This study identified several factors affecting the onset of hypertension using screening data. It is expected to contribute to the nation's building of a Hypertension Management System in the near future by providing representative results on the onset and care of hypertension.
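
The reported odds ratios can be read through the standard logistic regression relationship, in which a coefficient is the natural log of its odds ratio. The sketch below uses the values quoted in the abstract; multiplying odds ratios assumes a main-effects model with no interaction terms, and the 10% baseline probability is a hypothetical number, since the abstract reports no baseline risk.

```python
import math

# Odds ratios as reported in the abstract (logistic regression results).
ODDS_RATIOS = {
    "female": 0.834,
    "age_60_plus": 4.628,
    "obese": 2.103,
    "high_glucose": 1.086,
    "family_history": 1.512,
}

def coefficient(odds_ratio):
    """Logistic regression coefficient implied by an odds ratio: beta = ln(OR)."""
    return math.log(odds_ratio)

def combined_odds_ratio(factors):
    """Multiply odds ratios; assumes a main-effects model with no interactions."""
    return math.prod(ODDS_RATIOS[f] for f in factors)

def probability(baseline_prob, odds_ratio):
    """Apply an odds ratio to a baseline probability and convert back."""
    odds = baseline_prob / (1.0 - baseline_prob) * odds_ratio
    return odds / (1.0 + odds)

combined = combined_odds_ratio(["age_60_plus", "obese"])
p = probability(0.10, combined)  # 0.10 is a hypothetical baseline, not from the study
print(round(combined, 3), round(p, 3))
```

Note that an odds ratio of 4.628 does not mean 4.628 times the probability; the conversion through odds in `probability` is what keeps the result between 0 and 1.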

Development of a software framework for sequential data assimilation and its applications in Japan

  • Noh, Seong-Jin;Tachikawa, Yasuto;Shiiba, Michiharu;Kim, Sun-Min;Yorozu, Kazuaki
    • Proceedings of the Korea Water Resources Association Conference / 2012 Conference of the Korea Water Resources Association / pp. 39-39 / 2012
  • Data assimilation techniques have received growing attention due to their capability to improve prediction in various areas. Despite their potential, software frameworks applicable to probabilistic approaches and data assimilation are still limited because most hydrologic modelling software is based on a deterministic approach. In this study, we developed a hydrological modelling framework for sequential data assimilation, namely MPI-OHyMoS. MPI-OHyMoS allows users to develop their own element models and to easily build a total simulation system model for hydrological simulations. Unlike process-based modelling frameworks, this software framework benefits from its object-oriented design to flexibly represent hydrological processes without any change to the main library. In this framework, sequential data assimilation based on particle filters is available for any hydrologic model, considering various sources of uncertainty originating from input forcing, parameters, and observations. Particle filtering is a Bayesian learning process in which the propagation of all uncertainties is carried out by a suitable selection of randomly generated particles without any assumptions about the nature of the distributions. In MPI-OHyMoS, ensemble simulations are parallelized, which can take advantage of high performance computing (HPC) systems. We applied this software framework to several catchments in Japan using a distributed hydrologic model. Uncertainty of model parameters and radar rainfall estimates is assessed simultaneously in sequential data assimilation.
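
The predict, weight, and resample cycle behind sequential data assimilation with particle filters can be sketched in a few lines. This is a generic bootstrap particle filter on a one-dimensional state, not the MPI-OHyMoS implementation: the random-walk state model, the Gaussian observation likelihood, the noise levels, and the observation values are all assumptions made for illustration.

```python
import math
import random

def gaussian_likelihood(observation, predicted, obs_std):
    """Unnormalized Gaussian likelihood of an observation given a particle state."""
    z = (observation - predicted) / obs_std
    return math.exp(-0.5 * z * z)

def particle_filter_step(particles, observation, rng, process_std=0.5, obs_std=1.0):
    """One predict-weight-resample cycle of a bootstrap particle filter."""
    # Predict: propagate each particle through a (hypothetical) random-walk model.
    predicted = [p + rng.gauss(0.0, process_std) for p in particles]
    # Weight: score each particle against the observation, then normalize.
    weights = [gaussian_likelihood(observation, p, obs_std) for p in predicted]
    total = sum(weights)
    if total == 0.0:
        weights = [1.0 / len(predicted)] * len(predicted)
    else:
        weights = [w / total for w in weights]
    # Estimate: weighted mean of the particle states.
    estimate = sum(w * p for w, p in zip(weights, predicted))
    # Resample: draw particles in proportion to their weights.
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))
    return resampled, estimate

rng = random.Random(42)
particles = [rng.uniform(0.0, 10.0) for _ in range(500)]  # broad initial ensemble
observations = [5.0, 5.2, 4.9, 5.1, 5.0]  # hypothetical measurements

estimates = []
for obs in observations:
    particles, estimate = particle_filter_step(particles, obs, rng)
    estimates.append(estimate)
print(estimates)
```

Each particle is propagated independently in the predict step, which is why ensemble simulations of this kind parallelize naturally across processes.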
