• Title/Summary/Keyword: Decision Tree analysis

A Study for the Development of a Bid Price Rate Prediction Model (낙찰률 예측 모형에 관한 연구)

  • Choi, Bo-Seung;Kang, Hyun-Cheol;Han, Sang-Tae
    • Communications for Statistical Applications and Methods / v.18 no.1 / pp.23-34 / 2011
  • Property auctions have become a new method of real estate investment, as the property auction market has grown in tandem with the real estate market. This study focuses on a statistical model for predicting the bid price rate, which is the main index for participants in the real estate auction market. To estimate the monthly bid price rate, we propose a new method that compensates for the means of regions and periods and reduces the prediction error using decision tree analysis. We also propose a linear regression model to predict the bid price rate of an individual auction property. We applied the proposed model to apartment auction properties to predict the bid price rate and to categorize individual properties into auction grades.

The Creativity Forecasting of Design Idea Sketches According to the Ambiguity of Visual Stimuli and Idea-Sharing Situations (시각자극의 모호함과 아이디어 교류의 유무에 따른 디자인 아이디어의 창의성 예측)

  • Jang, Sun Hee
    • The Journal of the Korea Contents Association / v.16 no.4 / pp.275-288 / 2016
  • A decision tree analysis was performed by categorizing the idea sketches produced in the group environment into three different levels of visual stimuli ambiguity (vague, ambiguous, and definite) and two idea-sharing situations (before and after). We then examined the predicted values for the creativity of each group's idea sketches, the factors that led to high creativity scores, and their standards. The results of the analyses indicated that the Resistance to Premature Closure, Originality, Elaboration, ness, and Similarity represented important predictors of the creativity of design idea sketches according to the level of ambiguity of the visual stimuli used and whether ideas were shared or not. The group presented with vague stimuli after sharing ideas scored the highest predicted creativity value and the group presented with definite stimuli after sharing ideas scored the lowest predicted creativity value.

Pattern Analysis of Core Competency of CEO Using Fuzzy ID3 (퍼지 ID3를 이용한 CEO핵심역량의 패턴분석)

  • Park, Bong-Gyeong;Hwang, Seung-Gook
    • Journal of the Korean Institute of Intelligent Systems / v.20 no.2 / pp.273-278 / 2010
  • A few small and medium-sized enterprises administer their organizations systematically, but most are shaped by the ability and level of the CEO rather than by an organizational system. From this viewpoint, studying the ability and level of CEOs in small and medium-sized enterprises is meaningful. In this paper, the core competencies of CEOs are obtained from CEOs through a questionnaire, and an evaluation model of CEO core competency is proposed. Patterns were also analyzed with ID3 and fuzzy ID3 using data from expert appraisals of CEO core competency and level. The 'if-then' fuzzy rules and the decision tree produced by the pattern analysis proved useful for evaluating CEO core competency in small and medium-sized enterprises.
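As a rough illustration of how ID3 chooses a split, the sketch below computes entropy and information gain for categorical attributes. It shows only the classic crisp ID3 criterion, not fuzzy ID3 and not the authors' implementation; the attribute names (`leadership`, `planning`), the grade labels, and the toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting on one categorical attribute."""
    total = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    # Weighted entropy of the subsets produced by the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data: attribute names, values, and grades are illustrative only.
rows = [
    {"leadership": "high", "planning": "high"},
    {"leadership": "high", "planning": "low"},
    {"leadership": "low", "planning": "high"},
    {"leadership": "low", "planning": "low"},
]
labels = ["A", "A", "B", "B"]

# ID3 splits on the attribute with the largest information gain.
best = max(rows[0], key=lambda a: information_gain(rows, labels, a))
```

In this toy data `leadership` separates the two grades perfectly, so it would become the root of the tree; a rule such as "if leadership is high then grade A" is then read off the resulting branch.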

Analysis and Detection Method for Line-shaped Echoes using Support Vector Machine (Support Vector Machine을 이용한 선에코 특성 분석 및 탐지 방법)

  • Lee, Hansoo;Kim, Eun Kyeong;Kim, Sungshin
    • Journal of the Korean Institute of Intelligent Systems / v.24 no.6 / pp.665-670 / 2014
  • An SVM is a binary classifier that finds the optimal hyperplane separating training data into two groups. Owing to its strong performance, the SVM is applied in various fields such as inductive inference, binary classification, and prediction. It is also a representative black-box model, and there is much active research on analyzing trained SVM classifiers. This paper studies a method for automatically detecting line-shaped echoes, namely sun strobe echoes and radial interference echoes, using the SVM algorithm, because line-shaped echoes appear relatively often and disturb the weather forecasting process. Using a spatial clustering method and corrected reflectivity data from the weather radar, the training data are built from mean reflectivity, size, appearance, centroid altitude, and so forth. The trained SVM classifier is verified with actual occurrence cases of line-shaped echoes, and its characteristics are analyzed using the decision tree method.

Web Document Classification Based on Hangeul Morpheme and Keyword Analyses (한글 형태소 및 키워드 분석에 기반한 웹 문서 분류)

  • Park, Dan-Ho;Choi, Won-Sik;Kim, Hong-Jo;Lee, Seok-Lyong
    • The KIPS Transactions:PartD / v.19D no.4 / pp.263-270 / 2012
  • With the current development of high-speed Internet and massive database technology, the number of web documents is increasing rapidly, and classifying those documents automatically is therefore becoming important. In this study, we propose an effective method to extract document features based on Hangeul morpheme and keyword analyses, and to classify unstructured documents automatically by predicting their subjects. To extract document features, we first select terms using a morpheme analyzer, form the keyword set based on term frequency and subject-discriminating power, and score each keyword using the discriminating power. Then, we generate the classification model using commercial software that implements the decision tree, neural network, and SVM (support vector machine). Experimental results show that the proposed feature extraction method achieved considerable performance in classifying web documents by subject, i.e., an average precision of 0.90 and recall of 0.84 in the case of the decision tree.
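For readers unfamiliar with the reported metrics, the following minimal sketch shows how precision and recall can be computed for one subject class. The function name `precision_recall`, the subject labels, and the toy predictions are assumptions for illustration; the abstract does not describe how the authors averaged the values across subjects.

```python
def precision_recall(actual, predicted, positive):
    """Precision and recall for a single class treated as the positive one."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    # Guard against empty denominators when a class is never predicted or never occurs.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical labels for five documents; not taken from the paper.
actual = ["sports", "sports", "sports", "news", "news"]
predicted = ["sports", "sports", "news", "sports", "news"]
p, r = precision_recall(actual, predicted, "sports")
```

Precision is the share of documents predicted as a subject that truly belong to it, while recall is the share of documents of that subject that the classifier found.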

Case Control Study Identifying the Predictors of Unplanned Intensive Care Unit Readmission After Discharge (집중치료실 퇴실환자의 비계획성 재입실 예측 인자를 규명하기 위한 사례대조군 연구)

  • Park, Myoung Ok;Oh, Hyun Soo
    • Journal of Korean Critical Care Nursing / v.11 no.3 / pp.45-57 / 2018
  • Purpose: This study was performed to identify the factors influencing unplanned intensive care unit (ICU) readmission. Methods: The study adopted a retrospective case control cohort design. Data were collected from the electronic medical records of 844 patients who had been discharged from the ICUs of a university hospital in Incheon from June 2014 to December 2014. Results: The unplanned ICU readmission rate was 6.4% (n=54). Univariate analysis showed that major symptoms at first ICU admission, severity at first ICU admission (CPSCS and APACHE II), duration of ventilator application during first ICU admission, severity at first ICU discharge (CPSCS, APACHE II, and GCS), application of $FiO_2$ with oxygen therapy, implementation of sputum expectoration methods, and length of ICU stay at first ICU discharge were significant; in the decision tree model analysis, only 4 variables (sputum expectoration methods, length of ICU stay, $FiO_2$ with oxygen therapy at first ICU discharge, and major symptoms at first ICU admission) were significant. Conclusions: Since the sputum expectoration method was the most important predictor of unplanned ICU readmission, an assessment tool for patients' capability of sputum expectoration should be developed and implemented, and standardized ICU discharge criteria, including the factors identified from empirical evidence, should be developed to decrease the unplanned ICU readmission rate.

Using Data Mining Techniques to Predict Win-Loss in Korean Professional Baseball Games (데이터마이닝을 활용한 한국프로야구 승패예측모형 수립에 관한 연구)

  • Oh, Younhak;Kim, Han;Yun, Jaesub;Lee, Jong-Seok
    • Journal of Korean Institute of Industrial Engineers / v.40 no.1 / pp.8-17 / 2014
  • In this research, we employed various data mining techniques to build predictive models for win-loss prediction in Korean professional baseball games. The historical data on players and teams were obtained from the official materials provided on the KBO website. From the collected raw data, we prepared two additional types of dataset, in ratio and binary format respectively. The ratio dataset was generated by dividing the away team's records by those of the corresponding home team, while the binary dataset was obtained by comparing the record values. We applied seven classification techniques to the three (raw, ratio, and binary) datasets: decision tree, random forest, logistic regression, neural network, support vector machine, linear discriminant analysis, and quadratic discriminant analysis. Among the 21 (= 3 datasets $\times$ 7 techniques) prediction scenarios, the most accurate model was obtained from the random forest technique on the binary dataset, with a prediction accuracy of 84.14%. We also observed that the ratio and binary datasets helped to build better prediction models than the raw data. From the variable selection capability of decision tree, random forest, and stepwise logistic regression, we found that annual salary, earned run, strikeout, pitcher's winning percentage, and four balls are important winning factors of a game. This research is distinct from existing studies in that we used three different types of data and various data mining techniques for win-loss prediction in Korean professional baseball games.
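The ratio and binary datasets described above can be sketched as follows. This is an illustrative reading of the abstract, not the authors' code: the statistic names (`batting_avg`, `era`), the sample values, and the choice to encode "away record greater than home record" as 1 are all assumptions, and the sketch assumes no home-team value is zero.

```python
def ratio_features(away, home):
    """Ratio dataset: each away-team record divided by the home-team record."""
    # Assumes home values are nonzero; the abstract does not say how zeros were handled.
    return {name: away[name] / home[name] for name in away}


def binary_features(away, home):
    """Binary dataset: 1 if the away-team record exceeds the home-team record, else 0."""
    # The direction of the comparison is an assumption made for this sketch.
    return {name: 1 if away[name] > home[name] else 0 for name in away}


# Hypothetical raw records for one game.
away = {"batting_avg": 0.300, "era": 4.0}
home = {"batting_avg": 0.250, "era": 5.0}

ratio_row = ratio_features(away, home)
binary_row = binary_features(away, home)
```

Each game then yields three alternative feature rows (raw, ratio, binary), and every classifier is trained on each of them, which gives the 3 $\times$ 7 = 21 scenarios mentioned in the abstract.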

Analysis of Factors Influencing Obesity Treatment according to Initial Condition and Compliance with Medication (초기 조건과 복약 순응도에 따른 비만 치료 영향 인자 분석)

  • Han, Ji-Yeon;Park, Young-Jae
    • Journal of Korean Medicine for Obesity Research / v.19 no.1 / pp.31-41 / 2019
  • Objectives: The purpose of this study was to investigate the effects of gender, age, body weight, muscle mass, fat mass, body mass index (BMI), metabolism, and compliance with medication on weight loss in obese adults. Methods: We reviewed the medical records of 178 patients who visited the Korean Oriental Clinic for 3~6 months and received obesity treatment using Gamitaeumjowee-tang from April 2017 to May 2017. We conducted a paired t-test, correlation analysis, and decision tree analysis to identify factors influencing obesity treatment. Results: The correlation analysis showed that initial weight (kg), initial fat mass (kg), BMI ($kg/m^2$), compliance with medication (%), the Original Harris-Benedict Equation, the Revised Harris-Benedict Equation, and the Mifflin St Jeor Equation were significantly correlated with weight loss (kg) (P<0.001). The decision tree model showed that a loss of more than 5% of initial weight (n=154) was related to initial BMI ($kg/m^2$), compliance with medication (%), and initial muscle mass (kg). A loss of more than 5 kg from initial weight (n=131) was related to initial BMI ($kg/m^2$), compliance with medication (%), and final BMI ($kg/m^2$). Conclusions: This study suggests that weight loss may be affected by initial factors and that initial factors can be used for obesity treatment.

Linear Interpolation and Machine Learning Methods for Gas Leakage Prediction Based on Multi-source Data Integration (다중소스 데이터 융합 기반의 가스 누출 예측을 위한 선형 보간 및 머신러닝 기법)

  • Dashdondov, Khongorzul;Jo, Kyuri;Kim, Mi-Hye
    • Journal of the Korea Convergence Society / v.13 no.3 / pp.33-41 / 2022
  • In this article, we propose predicting natural gas (NG) leakage levels through feature selection based on a factor analysis (FA) of integrated Korean Meteorological Agency data and natural gas leakage data, in order to account for complex factors. The method is divided into three modules. First, we fill missing data in the integrated dataset by linear interpolation and select essential features using FA with OrdinalEncoder (OE)-based normalization. Next, the dataset is labeled by K-means clustering. The final module uses four algorithms, K-nearest neighbors (KNN), decision tree (DT), random forest (RF), and Naive Bayes (NB), to predict gas leakage levels. The proposed method is evaluated by accuracy, area under the ROC curve (AUC), and mean standard error (MSE). The test results indicate that the OrdinalEncoder-Factor analysis (OE-F)-based classification method improved the classification results. Moreover, OE-F-based KNN (OE-F-KNN) performed best, with 95.20% accuracy, an AUC of 96.13%, and an MSE of 0.031.
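The missing-data step can be illustrated with a minimal linear interpolation sketch. The function name `fill_linear`, the use of `None` for missing readings, and the decision to leave leading and trailing gaps unfilled are assumptions for this example; the abstract does not specify how the authors handled the series boundaries.

```python
def fill_linear(values):
    """Fill None gaps by linear interpolation between the nearest known neighbours."""
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        # Constant increment per position between two known readings.
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    # Gaps before the first or after the last known value are left as None.
    return filled


# Hypothetical sensor readings with missing entries.
readings = [10.0, None, None, 16.0, None]
completed = fill_linear(readings)
```

Here the two interior gaps are replaced by evenly spaced values between 10.0 and 16.0, while the trailing gap stays missing because it has no right-hand neighbour to interpolate toward.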

Agriculture Big Data Analysis System Based on Korean Market Information

  • Chuluunsaikhan, Tserenpurev;Song, Jin-Hyun;Yoo, Kwan-Hee;Rah, Hyung-Chul;Nasridinov, Aziz
    • Journal of Multimedia Information System / v.6 no.4 / pp.217-224 / 2019
  • As the world's population grows, maintaining the food supply is becoming a bigger problem. Now and in the future, big data will play a major role in decision making in the agriculture industry. The challenge is how to obtain valuable information to support future decisions. Big data helps us to see history more clearly, to uncover hidden value, and to make the right decisions for the government and farmers. To contribute to solving this challenge, we developed the Agriculture Big Data Analysis System. The system consists of agricultural big data collection, big data analysis, and big data visualization. First, we collected structured data, such as price, climate, and yield, and unstructured data, such as news, blogs, and TV programs. Using the collected data, we implemented prediction algorithms such as ARIMA, Decision Tree, LDA, and LSTM and show the results in data visualizations.