• Title/Summary/Keyword: Decision Tree analysis

Search Result 725

An Analysis of Panel Attrition in GOMS(Graduates Occupational Survey) (대졸자 직업이동 경로조사에서 패널탈락분석)

  • Chun, Young-Min; Yoon, Jeong-Hye; Oh, Min-Hong
    • The Korean Journal of Applied Statistics / v.22 no.5 / pp.981-993 / 2009
  • Panel data suffer a serious problem when attrition is concentrated in certain socioeconomic groups. Using the GOMS, this study investigates whether the data contain non-random attrition bias and seeks feasible ways to minimize it. Logit analyses show that panel attrition in the GOMS results mainly from the survey system rather than from the respondents. The results therefore suggest developing well-organized management skills and systems as well as constructing weighting methods.

A Method of Predicting Service Time Based on Voice of Customer Data (고객의 소리(VOC) 데이터를 활용한 서비스 처리 시간 예측방법)

  • Kim, Jeonghun; Kwon, Ohbyung
    • Journal of Information Technology Services / v.15 no.1 / pp.197-210 / 2016
  • With the advent of text analytics, VOC (Voice of Customer) data have become an important resource that gives managers and marketing practitioners access to consumers' veiled opinions and requirements. In other words, making relevant use of VOC data can improve customer responsiveness and satisfaction, each of which eventually improves business performance. However, unstructured data such as the customer complaints in VOC data have seldom been used in marketing practice, for example to predict service time as an index of service quality. The reason is that VOC data containing unstructured content are complicated in form, and converting unstructured data into structured data is a difficult process. Hence, this study proposes a prediction model that improves the estimation accuracy of the level of customer satisfaction by combining unstructured features obtained through text mining with the structured data features in VOC. The relationship between the unstructured and structured data and service processing time is also examined through regression analysis. Text mining techniques, sentiment analysis, keyword extraction, classification algorithms, decision trees, and multiple regression are considered and compared. For the experiment, we used actual VOC data from a company.

A Review of the Methodology for Sophisticated Data Classification (정교한 데이터 분류를 위한 방법론의 고찰)

  • Kim, Seung Jae; Kim, Sung Hwan
    • Journal of Integrative Natural Science / v.14 no.1 / pp.27-34 / 2021
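  • Efforts to implement artificial intelligence (AI) are increasing worldwide. In implementing AI, the importance of data cannot be overlooked: large volumes of data are needed, and the data must be classified to suit the purpose. The technologies that generate and process such data include the Internet of Things (IoT) and big data analysis, which can be regarded as driving forces of the fourth industrial revolution. These technologies are widely used at both the national and the individual level. In particular, there is a growing trend of applying big data analysis to the data concentrated in a specific field in order to discover new models, and of using those models to infer and predict new values and thereby present a vision of the future. Conclusions drawn from data analysis can vary greatly with the accuracy of the information the data contain, and such variation can produce wrong results. Data analysis therefore depends heavily on classifying data in accordance with the information they carry or with the purpose of the analysis. In addition, to obtain reliable and precise statistics from big data analysis, the analysis must take into account the meaning of each variable, the correlations among variables, multicollinearity, and so on. In other words, the data should be classified well, in line with the analysis purpose, before big data analysis is carried out. Accordingly, this review classifies data using the decision tree (DT) technique, the random forest (RF) technique, linear discriminant analysis (LDA), and quadratic discriminant analysis (QDA), which are among the classification analysis (CA) methods belonging to the machine learning (ML) techniques that implement AI technology. It then evaluates the degree of classification in order to seek ways to improve the classification rate of data.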

Comparative Analysis of Machine Learning Algorithms for Healthy Management of Collaborative Robots (협동로봇의 건전성 관리를 위한 머신러닝 알고리즘의 비교 분석)

  • Kim, Jae-Eun; Jang, Gil-Sang; Lim, KuK-Hwa
    • Journal of the Korea Safety Management & Science / v.23 no.4 / pp.93-104 / 2021
  • In this paper, we propose a method for diagnosing the overload and working load of collaborative robots through a performance analysis of machine learning algorithms. To this end, an experiment was conducted in which a collaborative robot with a payload capacity of 10 kg performed a pick-and-place operation while its payload weight was varied. Motor torque, position, and speed data generated by the robot controller were collected, and t-tests and F-tests revealed different characteristics for each weight relative to the 10 kg payload. To predict overload and working load from the collected data, machine learning models such as Neural Network, Decision Tree, Random Forest, and Gradient Boosting were tested. The neural network, with more than 99.6% explanatory power, showed the best performance in prediction and classification. The practical contribution of this study is twofold: it suggests a way to collect the data required for analysis from the robot itself, without attaching additional sensors, and it demonstrates the usefulness of machine learning algorithms for diagnosing robot overload and working load.

Analysis System for Traffic Accident based on WEB (WEB 기반 교통사고 분석)

  • Hong, You-Sik; Han, Chang-Pyoung
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.22 no.6 / pp.13-20 / 2022
  • Road and weather conditions are very important factors in fatal traffic accidents that occur on foggy and icy road sections in winter. In this paper, a simulation was performed to estimate the traffic accident risk rate on the basis of assumed traffic accident prediction data. In addition, to reduce and prevent traffic accidents, factor analysis was carried out and traffic accident fatality rates were predicted using the WEKA data mining tool and the open-source TensorFlow library, applied to data on traffic accident fatalities provided by the Korea Transportation Corporation.

Development of Thinning Effect Analysis Model (TEAM) Using Individual-Tree Distance-Independent Growth Model of Pinus koraiensis Stands (잣나무 임분의 개체목 거리독립생장모델을 이용한 간벌효과 분석모델 개발)

  • Kwon, Soonduk; Kim, Seonyoung; Chung, Joosang; Kim, Hyung-Ho
    • Journal of Korean Society of Forest Science / v.96 no.6 / pp.742-749 / 2007
  • The objective of this study was to develop a thinning effect analysis model (TEAM) using an individual-tree distance-independent growth model of Pinus koraiensis stands. TEAM was designed to analyze thinning effects associated with thinning prescriptions such as the number, timing, intensity, and method of thinnings. To test the application of TEAM, stand growth effects were compared across seven scenarios defined by thinning prescription plans. The results made it possible to estimate the number of trees, height, and volume by diameter (DBH) class of individual trees, as well as average diameter growth, height growth, number of trees, and volume growth per ha of stands. A sensitivity analysis on one Pinus koraiensis stand showed that controlling stand density through thinning prescriptions could not be expected to yield much more volume at the rotation age. With thinning, total yield volume was $40{\sim}75\,\mathrm{m}^3$ per ha greater than without thinning as stand age increased, while the differences stayed within 5 cm in average diameter growth and within 1 m in average height growth. TEAM, as a decision-making support system, can be used to try out thinning prescriptions and to choose among thinning prescription plans in different site-specific stand environments.

The detection of cavitation in hydraulic machines by use of ultrasonic signal analysis

  • Gruber, P.; Farhat, M.; Odermatt, P.; Etterlin, M.; Lerch, T.; Frei, M.
    • International Journal of Fluid Machinery and Systems / v.8 no.4 / pp.264-273 / 2015
  • This presentation describes an experimental approach for the detection of cavitation in hydraulic machines by use of ultrasonic signal analysis. Instead of using the high-frequency pulses (typically 1 MHz) only for transit time measurement, various other signal characteristics are extracted from the individual signals and from their correlation functions with reference signals in order to gain knowledge of the water conditions. As the pulse repetition rate is high (typically 100 Hz), statistical parameters can be extracted from the signals. The idea is to use a classifier to find patterns in the parameters that distinguish between the different water states. This classification scheme has been applied to different cavitation sections: a sphere in a water flow in a circular tube at the HSLU in Lucerne, and a NACA profile in a cavitation tunnel and two Francis model test turbines, all at LMH in Lausanne. From the raw signal data, several statistical parameters in the time and frequency domains, as well as from the correlation function with reference signals, have been determined. Two methods were used as classifiers: feed-forward neural networks and decision trees. For both classification methods, realizations with the lowest possible complexity are of special interest. It is shown that two to three signal characteristics, two from the signal itself and one from the correlation function, are in many cases sufficient for detection. The final goal is to combine these results with operating point, vibration, acoustic emission, and dynamic pressure information so that dangerous and non-dangerous cavitation can be distinguished.
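
The idea of a lowest-complexity classifier on a few signal statistics can be illustrated with a depth-1 decision tree (a "stump"). This is a minimal sketch only: the two features (RMS and crest factor), the toy pulses, and the function names are assumptions made for illustration and are not taken from the paper.

```python
def features(signal):
    """Two simple time-domain statistics of one received pulse (hypothetical choice)."""
    rms = (sum(x * x for x in signal) / len(signal)) ** 0.5
    peak = max(abs(x) for x in signal)
    return {"rms": rms, "crest": peak / rms if rms else 0.0}


def fit_stump(rows, labels):
    """Depth-1 decision tree: pick the (feature, threshold, polarity)
    with the fewest training errors."""
    best = None
    for name in rows[0]:
        values = sorted({r[name] for r in rows})
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            for above in (0, 1):  # class predicted when the value exceeds thr
                errors = sum(
                    (above if r[name] > thr else 1 - above) != y
                    for r, y in zip(rows, labels)
                )
                if best is None or errors < best[0]:
                    best = (errors, name, thr, above)
    if best is None:
        raise ValueError("no feature takes more than one distinct value")
    return best[1:]


def predict(stump, row):
    name, thr, above = stump
    return above if row[name] > thr else 1 - above


# Made-up pulses: label 0 = no cavitation, label 1 = cavitation (assumed weaker signal).
pulses = [
    [1.0, -1.0, 1.0, -1.0],
    [0.9, -0.9, 0.9, -0.9],
    [0.2, -0.3, 0.2, -0.1],
    [0.1, -0.2, 0.3, -0.2],
]
labels = [0, 0, 1, 1]

rows = [features(p) for p in pulses]
stump = fit_stump(rows, labels)
predictions = [predict(stump, r) for r in rows]
```

On this toy data a single threshold on one feature already separates the two classes, which is the kind of low-complexity realization the abstract describes as being of special interest.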

An Application of Support Vector Machines to Personal Credit Scoring: Focusing on Financial Institutions in China (Support Vector Machines을 이용한 개인신용평가 : 중국 금융기관을 중심으로)

  • Ding, Xuan-Ze; Lee, Young-Chan
    • Journal of Industrial Convergence / v.16 no.4 / pp.33-46 / 2018
  • Personal credit scoring is an effective tool that helps banks make profitable decisions on granting loans. Recently, many classification algorithms and models have been used in personal credit scoring. Personal credit scoring techniques are usually divided into statistical and non-statistical methods. Statistical methods include linear regression, discriminant analysis, logistic regression, and decision trees. Non-statistical methods include linear programming, neural networks, genetic algorithms, and support vector machines. However, no consistent conclusion has been drawn as to which method is best for developing a credit scoring model. In this paper, we compare the performance of the most common scoring techniques, namely logistic regression, neural networks, and support vector machines, using personal credit data from a financial institution in China. Specifically, we build the three models, classify the customers, and compare the analysis results. According to the results, the support vector machine performs better than logistic regression and neural networks.

Analysis of Feature Importance of Ship's Berthing Velocity Using Classification Algorithms of Machine Learning (머신러닝 분류 알고리즘을 활용한 선박 접안속도 영향요소의 중요도 분석)

  • Lee, Hyeong-Tak; Lee, Sang-Won; Cho, Jang-Won; Cho, Ik-Soon
    • Journal of the Korean Society of Marine Environment & Safety / v.26 no.2 / pp.139-148 / 2020
  • The most important factor affecting the berthing energy generated when a ship berths is the berthing velocity. Thus, an accident may occur if the berthing velocity is extremely high. Several ship features influence the determination of the berthing velocity. However, previous studies have mostly focused on the size of the vessel. Therefore, the aim of this study is to analyze various features that influence berthing velocity and determine their respective importance. The data used in the analysis was based on the berthing velocity of a ship on a jetty in Korea. Using the collected data, machine learning classification algorithms were compared and analyzed, such as decision tree, random forest, logistic regression, and perceptron. As an algorithm evaluation method, indexes according to the confusion matrix were used. Consequently, perceptron demonstrated the best performance, and the feature importance was in the following order: DWT, jetty number, and state. Hence, when berthing a ship, the berthing velocity should be determined in consideration of various features, such as the size of the ship, position of the jetty, and loading condition of the cargo.
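
The abstract does not list which confusion-matrix indexes were used. As a minimal sketch, assuming the common choices of accuracy, precision, recall, and F1 for a binary label, they can be computed as follows. The labels and predictions are made-up illustrative values, not data from the study.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification task."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def scores_from_counts(tp, fp, fn, tn):
    """Derive the usual indexes from the four confusion-matrix cells."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels: 1 = high berthing velocity, 0 = normal (illustrative only).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

counts = confusion_counts(y_true, y_pred)
scores = scores_from_counts(*counts)
```

Computing the same indexes for each candidate classifier gives a common basis for comparison of the kind the study reports.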

A Design Solution for a Railway Switch Monitoring System (분기기 진단 시스템 설계에 관한 연구)

  • Choo, Eun-Sang; Kim, Min-Seong; Yoo, Heung-Yeol; Mo, Choong-Seon; Son, Eui-Sik; Park, Seongguen; Lee, Jong-Woo
    • Journal of the Korean Society for Railway / v.18 no.5 / pp.439-446 / 2015
  • The turnout system, which determines the direction of the train, is not only a key system but also a vulnerable one. Failure of this system may lead to train delays or even casualties. It is therefore necessary to monitor the condition of the turnout system precisely. Currently, ROADMASTER of Germany is used as a diagnostic system in Korea. However, a new diagnostic system should be developed so that the turnout system can be operated optimally, with maintenance suited to the Korean railway environment. In this paper, a Fault Tree Analysis of the representative faults of the turnout system is conducted, and the physical quantities that can cause the faults are classified by component and function. The measurement factors for monitoring are also derived, and a decision-making theory is suggested. On the basis of the results, we propose a new turnout diagnostic system that can provide more diverse and precise information than the conventional system.