• Title/Summary/Keyword: CART algorithm


Selection of an Optimal Algorithm among Decision Tree Techniques for Feature Analysis of Industrial Accidents in Construction Industries (건설업의 산업재해 특성분석을 위한 의사결정나무 기법의 상용 최적 알고리즘 선정)

  • Leem Young-Moon;Choi Yo-Han
    • Journal of the Korea Safety Management & Science / v.7 no.5 / pp.1-8 / 2005
  • Rapid industrial advancement, diversification of business types, and unexpected industrial accidents have caused considerable human and material damage to many unspecified persons. Although various previous studies have sought to prevent industrial accidents, they provide only managerial and educational policies derived from frequency analysis and comparative analysis of past accident data. The main objective of this study is to find an optimal algorithm for analyzing industrial accident data, and this paper provides a comparative analysis of four algorithms: CHAID, CART, C4.5, and QUEST. Decision tree algorithms, a typical data mining technique, are used to predict results from objective, quantified data. Enterprise Miner of SAS and AnswerTree of SPSS are used to evaluate the validity of the results of the four algorithms. The sample for this work was chosen from 19,574 records related to the construction industry in Korea over three years (2002 to 2004).
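
As background for the CART entry in the comparison above, the following is a minimal sketch of how a CART-style classifier scores a single numeric split with Gini impurity. It is not the paper's procedure or the SAS/SPSS implementation; the function names and the toy data (worker ages and accident severity) are hypothetical and chosen only for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k^2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best binary split x <= t.

    Candidate thresholds are midpoints between consecutive distinct values.
    If no split improves on the unsplit node, the threshold is None.
    """
    n = len(values)
    best = (None, gini(labels))
    distinct = sorted(set(values))
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        # Impurity of the two children, weighted by their sizes.
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (t, score)
    return best


# Hypothetical toy data, not taken from the study.
ages = [22, 25, 31, 38, 45, 52, 58, 61]
severity = ["minor", "minor", "minor", "minor",
            "severe", "severe", "severe", "severe"]

threshold, impurity = best_split(ages, severity)
print(threshold, impurity)
```

A full CART tree would apply `best_split` to every feature, keep the feature with the lowest weighted impurity, and recurse on the two child nodes until a stopping rule is met.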

Enhanced Recommendation Algorithm using Semantic Collaborative Filtering: E-commerce Portal (전자상거래 포탈을 위한 시맨틱 협업 필터링을 이용한 확장된 추천 알고리즘)

  • Ahmed, Shohel;Kim, Jong-Woo;Kang, Sang-Gil
    • Journal of Intelligence and Information Systems / v.17 no.3 / pp.79-98 / 2011
  • This paper proposes a semantic recommendation technique for a personalized e-commerce portal. Semantic recommendation is achieved by utilizing the attributes of products. The semantic similarity of the products is merged with the rating information of the products to provide an accurate recommendation. The recommendation technique also analyzes various attitudes of the customer to evaluate the implicit rating of products. Attitudes are classified into three types: "purchasing product", "adding product to shopping cart", and "viewing the product information." We implicitly track customer attitudes to estimate product ratings for recommendation. We also implement a session validation process to identify the valid sessions that are most important for giving an accurate recommendation. Our recommendation technique shows a high degree of accuracy because we use age groupings of customers with similar preferences. The experimental section shows that the proposed recommendation method outperforms well-known collaborative filtering methods not only for existing customers but also for new users with no previous purchase record.

Context Aware Feature Selection Model for Salient Feature Detection from Mobile Video Devices (모바일 비디오기기 위에서의 중요한 객체탐색을 위한 문맥인식 특성벡터 선택 모델)

  • Lee, Jaeho;Shin, Hyunkyung
    • Journal of Internet Computing and Services / v.15 no.6 / pp.117-124 / 2014
  • Cluttered backgrounds are a major obstacle in developing salient object detection and tracking systems for natural-scene video frames captured by mobile devices. In this paper we propose a context-aware feature vector selection model that provides efficient noise filtering with machine learning based classifiers. Since context awareness for feature selection is achieved by searching nearest neighborhoods, which is known to be an NP-hard problem, we apply a fast approximation method and give a detailed complexity analysis. The separability enhancement in feature vector space obtained by adding the context-aware feature subsets is studied rigorously using principal component analysis (PCA). The overall performance enhancement is quantified by statistical measures for various machine learning models, including MLP, SVM, Naïve Bayesian, and CART. A summary of computational costs and performance enhancement is also presented.

The Factors of Participating in a Smoking Cessation Program using Integrated Method of Decision Tree and Neural Network Algorithm (인공신경망 분석과 결정트리 융합에 의한 금연 프로그램 참여 결정 요인)

  • Byeon, Haewon
    • Journal of the Korea Convergence Society / v.6 no.2 / pp.25-30 / 2015
  • The purpose of this study was to analyze the factors that affect participation in a smoking cessation program. Data were drawn from the Seoul Welfare Panel Study 2010. Subjects were 1,326 smokers aged 19 and older living in the community. The dependent variable was defined as experience of smoking cessation. Explanatory variables included age, gender, level of education, employment status, household income, marital status, drinking, self-reported health status, depression, disease, and physical activity. A prediction model was developed using a decision tree and a neural network algorithm. In the prediction model, self-reported health status, disease, income, and household income were significantly associated with participating in a smoking cessation program. Based on this study, systematic education and program development are required.

Inflow and outflow analysis of double majors using social network analysis (사회 연결망 분석을 이용한 복수전공 유입 및 유출 분석)

  • Cho, Jang-Sik
    • Journal of the Korean Data and Information Science Society / v.23 no.4 / pp.693-701 / 2012
  • Recently, the number of students pursuing double majors has tended to increase in many universities. As a result, many problems occur because an excessive inflow of double-major students is concentrated in specific popular departments. In this paper, we study the characteristics of the inflow and outflow of double majors using social network analysis and decision tree analysis. According to the results, SAT score affected the inflow of double majors the most, followed in order by department category, course evaluation score, and employment rate. On the other hand, department category affected the outflow of double majors the most, followed in order by SAT score, employment rate, and course evaluation score.

LMI Design of Multi-Objective $H_2/H_\infty$ Controllers for an Inverted Pendulum on the Cart Using Polytope Models (폴리토프 모델을 이용한 도립진자의 다목적 $H_2/H_\infty$ 제어기의 LMI 설계)

  • 이상철
    • Journal of the Korea Institute of Information and Communication Engineering / v.6 no.1 / pp.6-13 / 2002
  • This paper deals with linear matrix inequality (LMI) design procedures for multi-objective $H_2/H_\infty$ controllers with pole-placement constraints for an inverted pendulum system modeled as convex polytopes, to ensure stabilizing regulation and tracking performance. Polytopic models consisting of multiple linear time-invariant models linearized at several operating points are derived in order to design controllers that overcome the conservativeness a controller may have when it is designed for a model linearized at a single operating point. Multi-objective controllers are designed for the polytopic models by the LMI design technique with convex algorithms. It is observed that the inverted pendulum controlled by any controller designed for each polytopic model is stably restored to the vertical position for initial values with larger tilt angles.

Satellite-Based Vegetation Drought Response Index in Korea (VegDRI-Korea) for Drought Monitoring (한반도 가뭄 모니터링을 위한 위성영상기반 식생가뭄반응지수 (VegDRI)의 활용)

  • Nam, Won-Ho;Tadesse, Tsegaye;Wardlow, Brian D.;Hong, Eun-Mi;Pachepsky, Yakov A.
    • Proceedings of the Korea Water Resources Association Conference / 2017.05a / pp.382-382 / 2017
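  • As drought disasters have recently increased worldwide, various related agencies in Korea use drought information systems to provide drought indices in the form of spatial maps in order to monitor drought conditions. The Korea Meteorological Administration, the Korea Water Resources Corporation, the Korea Rural Community Corporation, and others provide risk maps of meteorological, hydrological, and agricultural drought indices in real time, and each index is used in consideration of hydrometeorological characteristics, water supply facilities, and the water-use conditions of supply and demand. However, the spatial distributions of the provided drought indices are maps re-estimated from point data through interpolation, so their spatial resolution is coarse. To overcome this limitation, drought monitoring research using satellite imagery is needed, because such data can periodically provide information over wide areas with spatially and temporally consistent characteristics. This study presents the Vegetation Drought Response Index in Korea (VegDRI-Korea), which uses satellite-derived vegetation information, climate information, and biophysical information, and analyzes spatiotemporal drought conditions for major drought events in Korea to verify its applicability in Korea. VegDRI is a drought index based on high-resolution satellite imagery that can analyze real-time drought conditions by watershed or administrative district. It can be used in the future for drought monitoring across the entire Korean Peninsula and, through periodic monitoring, for decision support in identifying areas where drought is expected.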


Machine Learning Algorithm for Estimating Ink Usage (머신러닝을 통한 잉크 필요량 예측 알고리즘)

  • Se Wook Kwon;Young Joo Hyun;Hyun Chul Tae
    • Journal of Korean Society of Industrial and Systems Engineering / v.46 no.1 / pp.23-31 / 2023
  • Research and interest in sustainable printing are increasing in the packaging printing industry. Currently, the amount of ink required for each job is predicted from the experience and intuition of field workers. If more ink is produced than necessary, the leftover ink cannot be reused and is discarded, adversely affecting the company's productivity and the environment. Machine learning models can now be used to address this problem. This study compares machine learning models for predicting ink usage. A simple linear model, multiple regression analysis, cannot reflect the nonlinear relationships between the variables involved in packaging printing, so it is limited in how accurately it can predict the amount of ink needed. This study builds various prediction models based on CART (Classification and Regression Tree), such as Decision Tree, Random Forest, Gradient Boosting Machine, and XGBoost. Model accuracy is assessed by K-fold cross-validation. Error metrics such as root mean squared error, mean absolute error, and R-squared are used to evaluate the estimation models. Among these models, XGBoost has the highest prediction accuracy and can reduce wasted ink by 2,134 g per job. Thus, this study demonstrates machine learning's potential to help improve productivity and protect the environment.
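
The evaluation scheme named in the abstract above (K-fold cross-validation scored with RMSE, MAE, and R-squared) can be sketched as follows. This is an illustrative sketch only: a one-variable least-squares line stands in for the tree-based models, folds are contiguous and unshuffled, metrics are computed on the pooled out-of-fold predictions, and the data (printed area versus ink used) are hypothetical rather than taken from the study.

```python
import math


def k_fold_indices(n, k):
    """Yield (train_idx, test_idx) for k contiguous folds over n samples."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size


def metrics(y_true, y_pred):
    """Return (RMSE, MAE, R-squared) for paired true and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n
    mean_y = sum(y_true) / n
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return rmse, mae, r2


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (stand-in model)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def cross_validate(xs, ys, k=5):
    """Fit on k-1 folds, predict the held-out fold, score all predictions."""
    preds = [0.0] * len(xs)
    for train, test in k_fold_indices(len(xs), k):
        slope, intercept = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        for i in test:
            preds[i] = slope * xs[i] + intercept
    return metrics(ys, preds)


# Hypothetical data: printed area (x) and ink used in grams (y = 3x + 2).
area = [float(x) for x in range(1, 11)]
ink = [3.0 * x + 2.0 for x in area]
rmse, mae, r2 = cross_validate(area, ink, k=5)
print(rmse, mae, r2)
```

With real data the samples would normally be shuffled before the folds are formed, and `fit_line` would be replaced by whichever model is being compared.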

An Study on the Correlation between Sound Characteristics and Sasang Constitution by CSL (CSL을 통한 음향특성과 사상체질간의 상관성 연구)

  • Shin, Mi-ran;Kim, Dal-lae
    • Journal of Sasang Constitutional Medicine / v.11 no.1 / pp.137-157 / 1999
  • The purpose of this study is to help classify Sasang Constitution through its correlation with sound characteristics. The study was conducted under the assumption that Sasang Constitution correlates with the sound spectrogram. The following results were obtained by comparing and analyzing the correlation between the sound spectrogram and Sasang Constitution. 1. In the survey, Soeumin described their voices as low-toned, smooth, and quiet. Soyangin described their voices as high, clear, fast, and random in speech. Taeumin described their voices as low, thick, and muddy. 2. Taeyangin were significantly slow compared with the others in passage reading time. Taeyangin were significantly slow compared with the others in Formant frequency 1. Taeyangin were significantly discriminated from Soeumin in Formant frequency 5. Taeyangin were significantly low compared with the others in Bandwidth 2. Soeumin were significantly low compared with Taeyangin in Pitch Maximum and Pitch Maximum-Pitch Minimum. Taeyangin were significantly high compared with the others in Energy mean. 3. With the list of specification, the discrimination rate was higher than with the lists of 13 in the multi-dimensional 4-class minimum-distance results. The discrimination rate for the three dispositions excluding Soyangin was higher than that for the four dispositions in the results of one-way ANOVA and discriminant analysis in SPSS/PC+. In CART, the estimated rate of Sasang Constitution discrimination was higher than with any other method. These results suggest that there is a correlation between the sound spectrogram and Sasang Constitution. Sasang Constitution classification through sound spectrogram analysis can serve as an auxiliary method for making Sasang Constitution classification more objective.


Building battery deterioration prediction model using real field data (머신러닝 기법을 이용한 납축전지 열화 예측 모델 개발)

  • Choi, Keunho;Kim, Gunwoo
    • Journal of Intelligence and Information Systems / v.24 no.2 / pp.243-264 / 2018
  • Although the worldwide battery market is currently spurring the development of lithium secondary batteries, lead-acid batteries (rechargeable batteries), which perform well and can be reused, are used in a wide range of industries. However, lead-acid batteries have a serious problem: once even one of the several cells packed in a battery begins to degrade, deterioration of the whole battery progresses quickly. To overcome this problem, previous studies have attempted in many ways to identify the mechanism of battery deterioration. However, most of them analyzed data obtained in a laboratory rather than data obtained in the real world. Using real data can increase the feasibility and applicability of research findings. Therefore, this study aims to develop a model that predicts battery deterioration using data obtained in the real world. To this end, we collected data on changes in battery state by attaching sensors that monitor battery condition in real time to dozens of golf carts operated on a real golf course. As a result, a total of 16,883 samples were obtained. We then developed a model that predicts a precursor phenomenon of battery deterioration by analyzing the sensor data with machine learning techniques.
As initial independent variables, we used 1) inbound time of a cart, 2) outbound time of a cart, 3) duration (from outbound time to charge time), 4) charge amount, 5) used amount, 6) charge efficiency, 7) lowest temperature of battery cell 1 to 6, 8) lowest voltage of battery cell 1 to 6, 9) highest voltage of battery cell 1 to 6, 10) voltage of battery cell 1 to 6 at the beginning of operation, 11) voltage of battery cell 1 to 6 at the end of charge, 12) used amount of battery cell 1 to 6 during operation, 13) used amount of battery during operation (Max-Min), 14) duration of battery use, and 15) highest current during operation. Since the values of the per-cell variables (lowest temperature, lowest voltage, highest voltage, voltage at the beginning of operation, voltage at the end of charge, and used amount during operation, each for battery cell 1 to 6) are similar across the battery cells, we conducted principal component analysis with varimax orthogonal rotation to mitigate the multicollinearity problem. Based on the results, we created new variables by averaging the values of the independent variables that clustered together and used them as the final independent variables in place of the original variables, thereby reducing the dimension. We used decision tree, logistic regression, and Bayesian network as algorithms for building prediction models. We also built prediction models using bagging and boosting of each of them, as well as RandomForest. Experimental results show that the prediction model using bagging of decision trees yields the best accuracy, 89.3923%. This study is limited in that additional variables that affect battery deterioration, such as weather (temperature, humidity) and driving habits, were not considered; we would like to consider them in future research.
However, the battery deterioration prediction model proposed in the present study is expected to enable effective and efficient management of batteries used in the real field and to dramatically reduce the cost caused by undetected battery deterioration.
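
The abstract above reports that bagging of decision trees gave the best accuracy. As a rough illustration of the bagging idea only, the sketch below trains one-split trees (decision stumps) on bootstrap resamples and combines them by majority vote. It is not the authors' model: the stump learner, the function names, the seed, and the toy data (a single numeric feature with a binary deterioration flag) are all assumptions made for illustration.

```python
import random
from collections import Counter


def majority(labels):
    """Most common label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def fit_stump(xs, ys):
    """Fit a one-split tree: (threshold, left_label, right_label).

    The split minimizes training errors. If no split beats predicting the
    majority class, the threshold is infinity and both labels are the same.
    """
    default = majority(ys)
    best = (float("inf"), default, default)
    best_err = sum(y != default for y in ys)
    distinct = sorted(set(xs))
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        left_label, right_label = majority(left), majority(right)
        err = (sum(y != left_label for y in left)
               + sum(y != right_label for y in right))
        if err < best_err:
            best, best_err = (t, left_label, right_label), err
    return best


def predict_stump(stump, x):
    threshold, left_label, right_label = stump
    return left_label if x <= threshold else right_label


def fit_bagging(xs, ys, n_estimators=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    stumps = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return stumps


def predict_bagging(stumps, x):
    """Majority vote over the individual stump predictions."""
    return majority([predict_stump(s, x) for s in stumps])


# Hypothetical toy data: one sensor reading, 1 = deterioration precursor.
readings = list(range(20))
flags = [0] * 10 + [1] * 10

model = fit_bagging(readings, flags)
print(predict_bagging(model, 2), predict_bagging(model, 17))
```

Boosting and random forests reuse the same base learner but differ in how the training sets and votes are formed, so only `fit_bagging` and `predict_bagging` would need to change to sketch those variants.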