• Title/Summary/Keyword: decision trees


Analysis of Elementary Students' Smartphone Addiction Level by Demographic Features (인구통계학적 특성에 따른 초등학생의 스마트폰 중독 수준 분석)

  • Lee, Soojung
    • The Journal of Korean Association of Computer Education / v.17 no.6 / pp.1-8 / 2014
  • Smartphone use has recently increased so sharply across all ages that addiction problems have emerged. This study analyzes the factors, chiefly demographic variables, that affect smartphone addiction among elementary students. First, for each variable, it analyzes differences in the distribution of addiction groups and in the distribution of the most frequently used smartphone functions. Grade and academic achievement yield the largest differences in the distribution of addiction groups, while gender, grade, and academic achievement yield differences in the distribution of the most frequently used functions. The distribution of the most frequently used functions also differs significantly across addiction groups. Finally, logistic regression and decision trees are used to analyze the factors affecting smartphone addiction; grade, academic achievement, dual-income parents, and residential area are found to be influential, in that order.


Minimizing the MOLAP/ROLAP Divide: You Can Have Your Performance and Scale It Too

  • Eavis, Todd;Taleb, Ahmad
    • Journal of Computing Science and Engineering / v.7 no.1 / pp.1-20 / 2013
  • Over the past generation, data warehousing and online analytical processing (OLAP) applications have become the cornerstone of contemporary decision support environments. Typically, OLAP servers are implemented either on top of proprietary array-based storage engines (MOLAP) or as extensions to conventional relational DBMSs (ROLAP). While MOLAP systems provide impressive performance on common analytics queries, they tend to have limited scalability. Conversely, ROLAP's table-oriented model scales well but offers mediocre performance at best relative to MOLAP systems. In this paper, we describe a storage and indexing framework that aims to provide both MOLAP-like performance and ROLAP-like scalability by combining some of the best features of both. Based on a combination of R-trees and bitmap indexes, the storage engine has been integrated with a robust OLAP query engine prototype that can fully exploit the efficiency of the proposed storage model. Specifically, it uses an OLAP algebra coupled with a domain-specific query optimizer to map user queries directly to the storage and indexing framework. Experimental results demonstrate not only that the design improves upon more naive approaches, but also that it offers the potential to optimize both query performance and scalability.

IMPROVEMENT OF THE LOCA PSA MODEL USING A BEST-ESTIMATE THERMAL-HYDRAULIC ANALYSIS

  • Lee, Dong Hyun;Lim, Ho-Gon;Yoon, Han Young;Jeong, Jae Jun
    • Nuclear Engineering and Technology / v.46 no.4 / pp.541-546 / 2014
  • Probabilistic Safety Assessment (PSA) has been widely used to estimate the overall safety of nuclear power plants (NPPs), and it provides basic information for risk-informed application (RIA) and risk-informed regulation (RIR). For the effective and correct use of PSA in RIA/RIR-related decision making, the risk estimated by a PSA model should be as realistic as possible. In this work, a best-estimate thermal-hydraulic analysis of loss-of-coolant accidents (LOCAs) for Hanul Nuclear Units 3&4 is first carried out systematically: the behavior of the peak cladding temperature (PCT) is analyzed for various combinations of break size, safety system operating conditions, and operator action time for aggressive secondary cooling. The results of the thermal-hydraulic analysis are then used to improve the PSA model by changing both the accident sequences and the success criteria of the event trees for the LOCA scenarios.

A Feature Analysis of Industrial Accidents Using C4.5 Algorithm (C4.5 알고리즘을 이용한 산업 재해의 특성 분석)

  • Leem, Young-Moon;Kwag, Jun-Koo;Hwang, Young-Seob
    • Journal of the Korean Society of Safety / v.20 no.4 s.72 / pp.130-137 / 2005
  • The decision tree algorithm is a data mining technique that groups a population of interest into sub-groups or makes predictions about them. It can characterize the types within groups and can be used to detect differences among types of industrial accidents. This paper uses the C4.5 algorithm for this feature analysis. The data set consists of 24,887 records selected from a total of 25,159 collected over two years of observation of industrial accidents in Korea. One target variable and eight independent variables are specified by type of industrial accident. The resulting tree has 222 nodes in total, of which 151 are leaf nodes. The paper reports an acceptable level of accuracy (%) and error rate (%) as measures of the quality of the generated trees. Its objective is to assess how efficiently the C4.5 algorithm classifies industrial accident data by type and thereby to identify potential weak points in disaster risk grouping.
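
As an illustration of the split criterion that C4.5 is known for, the gain ratio, here is a minimal Python sketch. The attribute names (`industry`, `shift`), the accident-type labels, and the four toy records are hypothetical and are not taken from the paper's data set; the sketch covers only attribute selection at one node, not tree growing or pruning.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def gain_ratio(rows, labels, attr):
    """Gain ratio of splitting `rows` on the categorical attribute `attr`."""
    total = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    # Weighted entropy of the labels after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalizes attributes with many distinct values.
    split_info = entropy([row[attr] for row in rows])
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy records: accident type depends only on industry here.
rows = [
    {"industry": "construction", "shift": "day"},
    {"industry": "construction", "shift": "night"},
    {"industry": "manufacturing", "shift": "day"},
    {"industry": "manufacturing", "shift": "night"},
]
labels = ["fall", "fall", "caught", "caught"]

best = max(["industry", "shift"], key=lambda a: gain_ratio(rows, labels, a))
print(best)  # prints "industry"
```

In this toy case `industry` separates the two accident types perfectly (gain ratio 1.0), while `shift` carries no information (gain ratio 0.0), so a C4.5-style learner would split on `industry` first.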

Korean Transition-based Dependency Parsing with Recurrent Neural Network (순환 신경망을 이용한 전이 기반 한국어 의존 구문 분석)

  • Li, Jianri;Lee, Jong-Hyeok
    • KIISE Transactions on Computing Practices / v.21 no.8 / pp.567-571 / 2015
  • Transition-based dependency parsing requires considerable time and effort to design and select features from a very large number of possible combinations. Recent studies have successfully applied Multi-Layer Perceptrons (MLP) to address this problem and to reduce data sparseness. However, most of these methods adopt greedy search and can consider only a limited amount of information from the context window. In this study, we use a Recurrent Neural Network to handle long-range dependencies between the dependency subtrees of the current state and the current transition action. The results indicate that our method achieves higher accuracy (UAS) than an MLP-based model.

Development of the Road Weather Detection Algorithm on CCTV Video Images using Double Decision Trees (이중결정트리를 이용한 CCTV영상에서의 도로 날씨정보검출알고리즘 개발)

  • Park, Beung-Raul;NamKoong, Sung;Lim, Joong-Tae
    • The KIPS Transactions:PartB / v.14B no.6 / pp.445-452 / 2007
  • In this paper, we propose a scheme for detecting weather information in CCTV video images. The scheme obtains the RGB distribution of a sunny day and classifies a target image as cloud, rain, snow, or fog by comparing its RGB distribution against the sunny-day distribution. The scheme is designed to systematically detect and separate the distinguishing characteristics of images from complex weather information. Our algorithm incurs less time and space overhead than previous methods that rely on a weather database, and it can be deployed in real-world systems at low implementation cost. It also uses temperature, humidity, date, and time information to detect weather conditions with high quality.

Decision Support System for Arrival/Departure of Ships in Port by using Enhanced Genetic Programming (개선된 유전적 프로그래밍 기법을 이용한 선박 입출항 의사결정 지원 시스템)

  • Lee, K. H.;Rhee, W.
    • Proceedings of the Korea Intelligent Information System Society Conference / 2001.06a / pp.383-389 / 2001
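  • The LG Oil (LG 정유) product pier at Gwangyang Port, the subject of this study, has seven berths and serves a wide range of ships, from 300 to 48,000 deadweight tons (DWT). Setting guidelines for controlling ship arrival and departure according to sea and weather conditions is difficult, and the basis for the guidelines currently in use is unclear, so the current pier operation can be considered inefficient or lacking in safety. Rational operating constraints for the pier were therefore urgently needed. In this paper, we develop a decision support system that quantitatively determines whether ships can enter or leave the pier under given sea and weather conditions (wind, current, and waves), taking into account the characteristics of the pier, the characteristics of the ships, their loading conditions, and the characteristics of the ship operators, and that can suggest ways to improve safety. We verify the system on berths 5 and 7. To determine and present the arrival/departure decision quantitatively, we use a machine learning method based on Genetic Programming (GP). Learning performance is improved by introducing the Weighted Linear Associative Memory (WLAM) method to reduce the heavy computational load of GP and the Group of Additive Genetic Programming Trees (GAGPT) to make the global optimum easier to find.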


A comparative study of machine learning methods for automated identification of radioisotopes using NaI gamma-ray spectra

  • Galib, S.M.;Bhowmik, P.K.;Avachat, A.V.;Lee, H.K.
    • Nuclear Engineering and Technology / v.53 no.12 / pp.4072-4079 / 2021
  • This article presents a study of state-of-the-art methods for automated detection and identification of radioactive material using gamma-ray spectra and modern machine learning methods. The work was inspired by recent developments in deep learning algorithms, and the proposed method performs better than the current state-of-the-art models. Machine learning models such as fully connected, recurrent, and convolutional neural networks and gradient-boosted decision trees are applied under a wide variety of testing conditions, and their advantages and disadvantages are discussed. Furthermore, a hybrid model is developed by combining fully connected and convolutional neural networks; it shows the best performance among the machine learning models tested. The improvement is reflected in the model's test performance metric (F1 score) of 93.33%, which is 2%-12% higher than that of the state-of-the-art model under various conditions. The experimental results show that fusing classical neural networks with modern deep learning architectures is a suitable choice for interpreting gamma spectra where real-time and remote detection is necessary.

Development and Comparison of Data Mining-based Prediction Models of Building Fire Probability

  • Hong, Sung-gwan;Jeong, Seung Ryul
    • Journal of Internet Computing and Services / v.19 no.6 / pp.101-112 / 2018
  • Considerable manpower and budget are devoted to fire prevention, yet only a small portion of the data generated in the process is used for disaster prevention activities. This study develops a data mining-based model for predicting the probability of fire occurrence so that these data can be used more actively for disaster prevention. To this end, variables for predicting the fire occurrence probability of various buildings were selected, and data from the construction administrative system, the national fire information system, and the Korea Fire Insurance Association were collected and integrated into a single data set. After appropriate data cleansing and preprocessing, data mining methods such as artificial neural networks, decision trees, SVM, and Naive Bayes were used to develop prediction models of the fire occurrence probability of buildings. The most accurate of the derived models is the Linear SVM model, which achieves 68.42% on the experimental data and 63.54% on the verification data, making it the best model for predicting the fire occurrence probability of buildings. Because the prediction model developed in this study uses only setting values within specific ranges, future studies may explore a wider range of setting values not covered here.

Correlated variable importance for random forests (랜덤포레스트를 위한 상관예측변수 중요도)

  • Shin, Seung Beom;Cho, Hyung Jun
    • The Korean Journal of Applied Statistics / v.34 no.2 / pp.177-190 / 2021
  • Random forests are a popular ensemble method that reduces the instability and improves the accuracy of decision trees. The gain in accuracy comes at the cost of interpretability; to compensate, variable importance is provided. Variable importance indicates which variables play the most important roles in constructing the random forest. However, when a predictor is correlated with other predictors, the variable importance computed by the existing importance algorithm may be distorted: the downward bias of correlated predictors may understate the importance of truly important predictors. We propose a new algorithm that remedies this downward bias. The performance of the proposed algorithm is demonstrated on simulated data and illustrated with real data.
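
The downward bias mentioned in this abstract can be illustrated with a small Python sketch. It does not train a random forest and is not the algorithm the authors propose. As a stated assumption, two fixed hand-written predictors stand in for fitted models, and permutation importance (the increase in mean squared error after shuffling one column) stands in for the existing importance measure. All names and data are hypothetical.

```python
import random


def mse(model, X, y):
    """Mean squared error of `model` on rows X with targets y."""
    return sum((model(row) - t) ** 2 for row, t in zip(X, y)) / len(y)


def permutation_importance(model, X, y, col, rng):
    """Increase in MSE after shuffling column `col` of X."""
    baseline = mse(model, X, y)
    shuffled = [row[col] for row in X]
    rng.shuffle(shuffled)
    X_perm = [row[:col] + [v] + row[col + 1:] for row, v in zip(X, shuffled)]
    return mse(model, X_perm, y) - baseline


rng = random.Random(0)
x1 = [rng.gauss(0, 1) for _ in range(2000)]
X = [[v, v] for v in x1]  # second predictor is an exact copy of the first
y = list(x1)              # the target depends on x1 only


def single(row):
    """Stand-in model that relies on the first predictor alone."""
    return row[0]


def shared(row):
    """Stand-in model that spreads its weight over both correlated copies."""
    return 0.5 * row[0] + 0.5 * row[1]


imp_single = permutation_importance(single, X, y, 0, rng)
imp_shared = permutation_importance(shared, X, y, 0, rng)
print(imp_single, imp_shared)
```

Both stand-in models predict `y` exactly, yet the importance assigned to the first predictor is much smaller for `shared`, because shuffling one copy leaves the other intact. This is the kind of dilution that can make a truly important predictor look less important when it has correlated companions.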