• Title/Summary/Keyword: decision tree and system analysis


An Empirical Study of Profiling Model for the SMEs with High Demand for Standards Using Data Mining (데이터마이닝을 이용한 표준정책 수요 중소기업의 프로파일링 연구: R&D 동기와 사업화 지원 정책을 중심으로)

  • Jun, Seung-pyo;Jung, JaeOong;Choi, San
    • Journal of Korea Technology Innovation Society / v.19 no.3 / pp.511-544 / 2016
  • Standards boost technological innovation by promoting information sharing, compatibility, stability and quality. Identifying the groups of companies that particularly benefit from these functions of standards in their technological innovation and commercialization helps to tailor the planning and implementation of standards-related policies to those demand groups. For this purpose, this study profiles SMEs whose R&D objective is to respond to standards, as well as those that need to implement a standards system for technological commercialization, and then suggests a prediction model that can distinguish such companies from others. To this end, decision tree analysis is conducted through data mining to profile the characteristics of the subject SMEs. The subject SMEs are (1) those that engage in R&D to respond to standards (Group 1) and (2) those in need of product standard or technological certification policies for commercialization purposes (Group 2). The study then proposes a prediction model, built with discriminant analysis, that distinguishes Groups 1 and 2 from other companies based on several variables. The practicality of the discriminant formula is statistically verified. The results suggest that Group 1 companies are distinguished by variables such as time spent on R&D planning, Korean Standard Industry Classification (KSIC) category, number of employees and novelty of technologies. The profiling of Group 2 companies suggests that they are differentiated by variables such as KSIC category, major clients, time spent on R&D and ability to test and verify their technologies. The prediction model proposed herein is designed based on the outcomes of the profiling and the discriminant analysis. Its purpose is to support the planning and implementation of standards-related policies by providing objective information on companies in need of relevant support, and thereby to enhance the overall success rate of standards-related projects.
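
The profiling step above can be illustrated with a minimal sketch of how a decision tree picks a split. The abstract does not say which tree algorithm was used, so a CART-style Gini impurity criterion is assumed here; the feature names (`planning_months`, `employees`) and all data values are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels, feature):
    """Return (threshold, weighted_gini) of the best binary split on one numeric feature."""
    best = (None, gini(labels))
    for t in sorted({r[feature] for r in rows}):
        left = [y for r, y in zip(rows, labels) if r[feature] <= t]
        right = [y for r, y in zip(rows, labels) if r[feature] > t]
        if not left or not right:
            continue  # a split must send rows to both sides
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (t, score)
    return best

# Hypothetical toy data (not from the study).
rows = [
    {"planning_months": 2, "employees": 12},
    {"planning_months": 3, "employees": 40},
    {"planning_months": 4, "employees": 25},
    {"planning_months": 8, "employees": 30},
    {"planning_months": 9, "employees": 15},
    {"planning_months": 12, "employees": 50},
]
labels = ["other", "other", "other", "group1", "group1", "group1"]

threshold, impurity = best_split(rows, labels, "planning_months")
print(threshold, impurity)  # 4 0.0
```

A full tree would apply `best_split` recursively over every candidate feature; the resulting rules (for example, "planning time above 4 months") are what make up a group profile.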

A Study on Analyzing Children's Crossing Behaviors on Non-signalized Crosswalk (비신호 횡단보도에서의 어린이 횡단행태 분석 연구)

  • Lee, Deok Whan;Lee, Yun Suk;Kim, Won Ho;Lee, Back Jin
    • Journal of Korean Society of Transportation / v.31 no.3 / pp.19-32 / 2013
  • The study aims to identify the characteristics of children's crossing behavior on crosswalks in school zones, taking into account accident occurrence and the physical form of the school zones. Seven elementary school zones were investigated. Using data collected by field observation and video recording, statistical analysis, CHAID algorithm analysis, and pattern analysis were performed. The results show that children's waiting, attention and distraction were related to accident occurrence. While 69.1% of children showed waiting-before-crossing behavior at low-accident crosswalks, 83.6% of children showed non-waiting-before-crossing behavior at high-accident crosswalks. Moreover, the ratio of waiting and attention behavior was higher when the crosswalk was wide and the distance from the school's entrance to the crosswalk was long. These findings show that a behavior-oriented approach focused on children is required to improve safety in school zones.
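
CHAID chooses splitting variables with chi-square tests of independence between each predictor and the outcome. The sketch below shows only that core test in plain Python. The category merging and Bonferroni-adjusted p-values of full CHAID are omitted, and the counts and the `time_of_day` predictor are hypothetical, not the study's data.

```python
def chi_square(table):
    """Pearson chi-square statistic for an r x c contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical counts: rows are predictor categories,
# columns are (waited before crossing, did not wait).
candidates = {
    "accident_level": [[70, 30], [20, 80]],  # low-accident vs high-accident crosswalk
    "time_of_day": [[45, 55], [45, 55]],     # morning vs afternoon
}

# With equal degrees of freedom, the largest statistic has the smallest p-value.
best = max(candidates, key=lambda name: chi_square(candidates[name]))
print(best)
```

Here `accident_level` is selected because waiting behavior differs sharply between its categories, while `time_of_day` shows no association at all.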

Building battery deterioration prediction model using real field data (머신러닝 기법을 이용한 납축전지 열화 예측 모델 개발)

  • Choi, Keunho;Kim, Gunwoo
    • Journal of Intelligence and Information Systems / v.24 no.2 / pp.243-264 / 2018
  • Although the worldwide battery market has recently been spurring the development of lithium secondary batteries, lead-acid batteries (rechargeable batteries), which perform well and can be reused, are consumed in a wide range of industries. However, lead-acid batteries have a serious problem: the whole battery deteriorates quickly once even one of the several cells packed in it begins to degrade. To overcome this problem, previous studies have attempted to identify the mechanism of battery deterioration in many ways. However, most of them analyzed data obtained in a laboratory rather than data obtained in the real world. Using real data can increase the feasibility and applicability of research findings. Therefore, this study aims to develop a model that predicts battery deterioration using data obtained in the real world. To this end, we collected data on changes in battery state by attaching sensors that monitor battery condition in real time to dozens of golf carts operated on a real golf course. As a result, a total of 16,883 samples were obtained. We then developed a model that predicts a precursor phenomenon of battery deterioration by analyzing the sensor data with machine learning techniques. As initial independent variables, we used 1) inbound time of a cart, 2) outbound time of a cart, 3) duration (from outbound time to charge time), 4) charge amount, 5) used amount, 6) charge efficiency, 7) lowest temperature of battery cells 1 to 6, 8) lowest voltage of battery cells 1 to 6, 9) highest voltage of battery cells 1 to 6, 10) voltage of battery cells 1 to 6 at the beginning of operation, 11) voltage of battery cells 1 to 6 at the end of charge, 12) used amount of battery cells 1 to 6 during operation, 13) used amount of battery during operation (Max-Min), 14) duration of battery use, and 15) highest current during operation. Because the per-cell variables (lowest temperature, lowest voltage, highest voltage, voltage at the beginning of operation, voltage at the end of charge, and used amount during operation of battery cells 1 to 6) take similar values across cells, we conducted principal component analysis with varimax orthogonal rotation to mitigate the multicollinearity problem. Based on the results, we created new variables by averaging the values of the independent variables that clustered together and used them as the final independent variables instead of the original ones, thereby reducing the dimension. We used decision tree, logistic regression, and Bayesian network algorithms to build prediction models. We also built prediction models using bagging and boosting of each of them, as well as RandomForest. Experimental results show that the prediction model using bagging of decision trees yields the best accuracy, 89.3923%. This study has a limitation in that additional variables that affect battery deterioration, such as weather (temperature, humidity) and driving habits, were not considered; we would like to consider them in future research. Nevertheless, the battery deterioration prediction model proposed in this study is expected to enable effective and efficient management of batteries used in the field and, accordingly, to dramatically reduce the cost caused by undetected battery deterioration.
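
The best-performing approach above, bagging of decision trees, can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the authors' setup: each base learner is a one-level tree (a decision stump) on a single hypothetical feature, `voltage_spread`, and the toy data are invented.

```python
import random
from collections import Counter

def fit_stump(xs, ys):
    """Fit a one-feature threshold classifier that predicts 1 when x > threshold."""
    best_t, best_err = None, len(ys) + 1
    for t in sorted(set(xs)):
        err = sum((x > t) != bool(y) for x, y in zip(xs, ys))
        if err < best_err:
            best_t, best_err = t, err
    return best_t

def bagging_fit(xs, ys, n_models=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models

def bagging_predict(models, x):
    """Majority vote over all stumps."""
    votes = Counter(int(x > t) for t in models)
    return votes.most_common(1)[0][0]

# Hypothetical feature: spread between highest and lowest cell voltage (V).
# Label 1 marks a deterioration precursor, 0 marks normal operation.
voltage_spread = [0.02, 0.03, 0.05, 0.04, 0.06, 0.30, 0.35, 0.28, 0.40, 0.33]
precursor = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

models = bagging_fit(voltage_spread, precursor)
print(bagging_predict(models, 0.01), bagging_predict(models, 0.50))
```

Averaging the votes of models trained on resampled data reduces the variance of a single tree, which is one plausible reason bagging helped on noisy field data.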

Research on Change of Heart Rate Variability and Psychological Scale by Sasang Constitution according to before and after of the Meditation Programs (α version) (명상프로그램(α version) 시행 전후의 사상체질별 심리척도 및 HRV 변화 연구)

  • Kim, Geun-Woo;Bae, Hyo-Sang;Son, Han-Bum;Lee, Pil-Won;Kim, Byoung-Soo;Park, Seong-Sik
    • Journal of Oriental Neuropsychiatry / v.25 no.1 / pp.1-12 / 2014
  • Objectives: This study evaluates whether the meditation programs (α version), which coordinate motion, breathing, and relaxation, have positive effects on stress and psychological state. Methods: With approval from the Clinical Trials Deliberation Committee in Oriental Medicine, Dongguk University Ilsan Hospital, this study collected data with the applicants' consent, including demographic information and anthropometry for the Sasang constitutional diagnosis. The Sasang constitution was diagnosed using the beta tools of the Institute of Oriental Medicine, and a decision tree was built for the Sasang constitutional questionnaires. The STAI, STAXI, BDI, and HRV were measured before and after meditation to compare the effects of meditation by Sasang constitution. HRV was measured with a ProComP (KM Tech co.). Results: 1) Positive changes in the time-domain analysis of heart rate variability indicated increased peace of mind. By Sasang constitution, the increased peace of mind of So-eum In was accompanied by physical stability of the autonomic nervous system. 2) In the psychological scale evaluation, the depression scale, trait anger, anger-in, state anxiety and trait anxiety indices all showed significant positive effects. By Sasang constitution, Eum-In, which comprises So-eum In and Tae-eum In, showed positive effects. 3) Among participants diagnosed with depression or anxiety, the psychological scale did not change significantly in the depression group, but the index of the anxiety group was significantly reduced. The program thus had clinical effects for anxious patients and, by Sasang constitution, for Eum-In (Tae-eum In and So-eum In). 4) Correlations of each psychological scale by gender were lower overall for women, but there were no significant changes. Conclusions: The meditation program, developed by adequately combining action, relaxation and breathing, is effective for the overall physical and mental relaxation and concentration of Eum-In. In the future, a meditation program that shows the same effect for all people will need to be developed.

Forecasting of Customer's Purchasing Intention Using Support Vector Machine (Support Vector Machine 기법을 이용한 고객의 구매의도 예측)

  • Kim, Jin-Hwa;Nam, Ki-Chan;Lee, Sang-Jong
    • Information Systems Review / v.10 no.2 / pp.137-158 / 2008
  • The rapid development of various information technologies creates new opportunities in online and offline markets. In this changing market environment, customers have various demands for new products and services, and their power and influence on the markets grow stronger each year. Companies have therefore paid great attention to customer relationship management (CRM). In particular, personalized product recommendation systems, which recommend products and services based on customers' private information or in-store purchasing behavior, are an important asset to most companies. CRM is one of the important business processes in which reliable information is mined from customer databases. Data mining techniques such as artificial intelligence are popular tools for extracting useful information and knowledge from these databases. In this research, we propose a recommendation system that predicts customers' purchase intention. Purchase intention for a specific product is predicted by applying data mining techniques to a receipt data set. The performance of the suggested method is compared with that of other data mining techniques.

Changes and determinants affecting on geographic variations in health behavior, prevalence of hypertension and diabetes in Korean (지역사회 건강행태, 고혈압, 당뇨병 유병률 변화와 변이 요인)

  • Kim, Yoo-Mi;Kang, Sung-Hong
    • Journal of Digital Convergence / v.13 no.11 / pp.241-254 / 2015
  • This study examined changes in health behavior and in the prevalence of hypertension and diabetes over five years and analyzed the determinants of their geographic variation. Data from the Korean Community Health Survey for 2008 and 2013, covering 246 small districts, were analyzed using convergence tools such as a geographic information system tool and a decision tree. During the five-year period, smoking and drinking increased in the southwest regions, and physical activity increased in the western regions. The prevalence of hypertension increased in the west and south regions, and the prevalence of diabetes increased in the east and north regions. The determinants of regional variation in the prevalence of hypertension and diabetes were drinking, physical activity, obesity, arthritis, depressive symptoms and stress. Mental health programs should be developed for non-communicable diseases. Thus, to decrease the prevalence of hypertension and diabetes, our study emphasizes the need to develop customized mental health policies according to region-specific characteristics.

The big data method for flash flood warning (돌발홍수 예보를 위한 빅데이터 분석방법)

  • Park, Dain;Yoon, Sanghoo
    • Journal of Digital Convergence / v.15 no.11 / pp.245-250 / 2017
  • A flash flood is defined as flooding caused by intense rainfall over a relatively small area that flows rapidly through rivers and valleys in a short time with no advance warning, so it can cause property damage and casualties. This study aims to establish a flash-flood warning system using 38 accident records reported by the National Disaster Information Center, together with Land Surface Model (TOPLATS) data, between 2009 and 2012. Three variables from the Land Surface Model were used: precipitation, soil moisture, and surface runoff. The values of these three variables over the 6 hours preceding each flash flood were reduced to 3 factors through factor analysis. Decision tree, random forest, Naive Bayes, Support Vector Machine, and logistic regression models were considered as big data methods. Prediction performance was evaluated by comparing Accuracy, Kappa, TP Rate, FP Rate and F-Measure. The best method was suggested based on a reproducibility evaluation at each point of flash flood occurrence and on predicted versus actual counts over the 4 years of data.
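
The five evaluation measures named above can all be derived from a binary confusion matrix. The sketch below uses the standard textbook definitions, with Kappa taken to be Cohen's kappa. The counts are hypothetical and the function name `binary_metrics` is introduced here for illustration only.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, Cohen's kappa, TP rate, FP rate and F-measure from confusion counts."""
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    tp_rate = tp / (tp + fn)      # recall / sensitivity
    fp_rate = fp / (fp + tn)      # false alarms among true negatives
    precision = tp / (tp + fp)
    f_measure = 2 * precision * tp_rate / (precision + tp_rate)
    # Agreement expected by chance, from the marginal totals.
    p_chance = ((tp + fp) / n) * ((tp + fn) / n) + ((fn + tn) / n) * ((fp + tn) / n)
    kappa = (accuracy - p_chance) / (1 - p_chance)
    return {
        "accuracy": accuracy,
        "kappa": kappa,
        "tp_rate": tp_rate,
        "fp_rate": fp_rate,
        "f_measure": f_measure,
    }

# Hypothetical counts: 40 floods correctly warned, 10 missed,
# 10 false alarms, 40 correct non-warnings.
metrics = binary_metrics(tp=40, fp=10, fn=10, tn=40)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Kappa is useful alongside accuracy here because flash floods are rare events, and a model that never issues a warning can still score a high raw accuracy.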

Performance Comparison of Machine Learning based Prediction Models for University Students Dropout (머신러닝 기반 대학생 중도 탈락 예측 모델의 성능 비교)

  • Seok-Bong Jeong;Du-Yon Kim
    • Journal of the Korea Society for Simulation / v.32 no.4 / pp.19-26 / 2023
  • The increase in the dropout rate of college students nationwide has a serious negative impact on universities and society as well as on individual students. To proactively identify students at risk of dropping out, this study built decision tree, random forest, logistic regression, and deep learning-based dropout prediction models using academic data that can be easily obtained from each university's academic management system. Their performances were subsequently analyzed and compared. The analysis revealed that while the logistic regression-based prediction model exhibited the highest recall, its F1 score and ROC-AUC (Receiver Operating Characteristic - Area Under the Curve) value were comparatively lower. The random forest-based prediction model, on the other hand, demonstrated superior performance on all metrics except recall. In addition, to assess model performance over distinct prediction periods, we divided these periods into short-term (within one semester), medium-term (within two semesters), and long-term (within three semesters). The results showed that long-term prediction yielded the highest predictive performance. Through this study, each university is expected to be able to identify early the students who are likely to drop out, reduce the dropout rate through intensive management, and further contribute to the stabilization of university finances.
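
The observation that a model can have the highest recall but a lower F1 score and ROC-AUC is easier to see with a small worked example. The sketch below computes ROC-AUC by pairwise ranking and shows how a lower decision threshold raises recall while lowering F1. The labels and scores are hypothetical and are not taken from the study.

```python
def roc_auc(y_true, scores):
    """Probability that a random positive is scored above a random negative (ties count half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def recall_and_f1(y_true, scores, threshold):
    """Recall and F1 when every score >= threshold is predicted as dropout."""
    pred = [int(s >= threshold) for s in scores]
    tp = sum(p == 1 and y == 1 for p, y in zip(pred, y_true))
    fp = sum(p == 1 and y == 0 for p, y in zip(pred, y_true))
    fn = sum(p == 0 and y == 1 for p, y in zip(pred, y_true))
    return tp / (tp + fn), 2 * tp / (2 * tp + fp + fn)

# Hypothetical data: 1 = dropped out, scores = predicted dropout risk.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.35, 0.2, 0.1]

print(roc_auc(y_true, scores))
print(recall_and_f1(y_true, scores, 0.5))   # stricter threshold
print(recall_and_f1(y_true, scores, 0.15))  # looser threshold: recall 1.0, F1 0.6
```

Flagging almost everyone catches every at-risk student but produces many false alarms, which is why recall alone is a weak basis for choosing a dropout model.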

A Hybrid Under-sampling Approach for Better Bankruptcy Prediction (부도예측 개선을 위한 하이브리드 언더샘플링 접근법)

  • Kim, Taehoon;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems / v.21 no.2 / pp.173-190 / 2015
  • The purpose of this study is to improve bankruptcy prediction models by using a novel hybrid under-sampling approach. Most prior studies have tried to enhance the accuracy of bankruptcy prediction models by improving the classification methods involved. In contrast, we focus on appropriate data preprocessing as a means of enhancing accuracy. In particular, we aim to develop an effective sampling approach for bankruptcy prediction, since most prediction models suffer from class imbalance problems. The approach proposed in this study is a hybrid under-sampling method that combines the k-Reverse Nearest Neighbor (k-RNN) and one-class support vector machine (OCSVM) approaches. k-RNN can effectively eliminate outliers, while OCSVM contributes to the selection of informative training samples from majority class data. To validate our proposed approach, we have applied it to data from H Bank's non-external auditing companies in Korea, and compared the performances of the classifiers with the proposed under-sampling and random sampling data. The empirical results show that the proposed under-sampling approach generally improves the accuracy of classifiers, such as logistic regression, discriminant analysis, decision tree, and support vector machines. They also show that the proposed under-sampling approach reduces the risk of false negative errors, which lead to higher misclassification costs.
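
The outlier-elimination half of the hybrid approach can be sketched with reverse nearest neighbor counts. The abstract does not give the exact rule, so this sketch assumes a common formulation: a point that appears in few other points' k-nearest-neighbor lists is treated as an outlier. The values of `k` and `min_count` and the toy points are hypothetical. The OCSVM selection step is not shown. It would need a library implementation such as scikit-learn's `OneClassSVM`.

```python
import math

def knn_indices(points, i, k):
    """Indices of the k nearest neighbors of points[i], excluding itself."""
    dists = sorted((math.dist(points[i], p), j) for j, p in enumerate(points) if j != i)
    return [j for _, j in dists[:k]]

def reverse_neighbor_counts(points, k):
    """For each point, count how many other points list it among their k nearest neighbors."""
    counts = [0] * len(points)
    for i in range(len(points)):
        for j in knn_indices(points, i, k):
            counts[j] += 1
    return counts

def drop_outliers(points, k=2, min_count=1):
    """Keep only the points with at least min_count reverse nearest neighbors."""
    counts = reverse_neighbor_counts(points, k)
    return [p for p, c in zip(points, counts) if c >= min_count]

# Hypothetical majority-class samples (e.g. two financial ratios of non-bankrupt firms).
majority = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.2), (8.0, 8.0)]

kept = drop_outliers(majority)
print(kept)  # the isolated point (8.0, 8.0) is removed
```

The isolated point has neighbors of its own, but no other point counts it as a neighbor, so its reverse count is zero and it is dropped before the majority class is under-sampled.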

A Study on Strategy for success of tourism e-marketplace (관광 e-마켓플레이스의 성공전략에 관한 연구)

  • Hong, Ji-Whan;Kim, Keun-Hyung
    • Proceedings of the Korea Contents Association Conference / 2006.11a / pp.333-336 / 2006
  • An e-marketplace is a kind of B2B e-Business system that supports business transactions among companies. If e-marketplaces are revitalized, we can expect not only the development of related industries but also lower transaction costs among companies. From this point of view, the introduction and revitalization of e-marketplaces in the tourism industry is necessary. The participants in a tour e-marketplace are tour-related companies (travel agencies, lodging enterprises, shipping enterprises, etc.). Tourists also want to search a variety of tour products and content, so a tour e-marketplace has the characteristics of both B2C and B2B e-Business systems. The purpose of this study is to classify the success factors that determine the characteristics of a tour e-marketplace through a statistical survey of e-marketplace factors on tourism-related websites. First, we analyze the success factors of B2B and B2C e-marketplaces. Then we set up the influence factors of a tour e-marketplace and conduct a survey on its success factors. We expect to identify the attributes that contribute to tour e-marketplace success by applying logistic regression and decision tree analysis to the source data.
