• Title/Abstract/Keywords: Decision-tree technique

203 search results (processing time: 0.032 s)

Analysis of employee's satisfaction factor in working environment using data mining algorithm

  • 이동열;김태호;이홍철
    • Journal of the Korea Safety Management & Science / Vol. 16, No. 4 / pp. 275-284 / 2014
  • A decision tree is an analysis technique that divides a group of interest into several subgroups for classification and prediction. Because its results are easy to read, researchers can follow and explain the procedure more easily than with other techniques. This paper uses the CART algorithm, a data mining technique. The source data are the 2010-2011 working environment survey conducted by the Korea Occupational Safety and Health Agency (KOSHA), comprising 273 variables and 70,094 records; after refinement, 12 variables and 35,447 records remained. To find satisfaction factors in the working environment, the paper groups employees into three age bands (under 30, 30 to 49, and over 50) and analyzes the factors for each. The CART algorithm is then used to find the best grouping variables in 155 data records. The best grouping factors turned out to be 'comfortable in organization' and 'proper reward'.
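The abstract does not say which split criterion the authors used, so the following is only a minimal sketch of how a CART-style search for the "best grouping variable" is commonly done, assuming the Gini impurity criterion that CART usually applies to classification. The function names (`gini`, `best_split`) and the toy comfort/reward scores are hypothetical and are not taken from the KOSHA survey.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_gini) of the best binary split."""
    best = (None, None, gini(labels))  # start from the unsplit node
    n = len(rows)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue  # a split must send records to both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best[2]:
                best = (f, t, score)
    return best

# Hypothetical toy data: [comfort score, reward score] -> satisfied?
rows = [[1, 2], [2, 1], [4, 1], [5, 2], [4, 5], [1, 5]]
labels = ["no", "no", "yes", "yes", "yes", "no"]
feature, threshold, score = best_split(rows, labels)
```

In this toy data the comfort score (feature 0) separates the two classes perfectly at a threshold of 2, so it is selected as the grouping variable.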

Indoor positioning system using Xgboosting

  • 황치곤;윤창표;김대진
    • Proceedings of the Korea Institute of Information and Communication Engineering Conference / 2021 Fall Conference / pp. 492-494 / 2021
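  • In machine learning, decision trees are used as a classification technique. However, decision trees suffer degraded performance because of overfitting. This problem can be addressed with techniques such as bagging, which generates multiple bootstrap samples and trains a model on each, and boosting, which models sampled data and adjusts weights to reduce overfitting. More recently, the Xgboost technique has emerged. In this paper, we collect Wi-Fi signal data for indoor positioning, apply it to existing methods and to Xgboost, and evaluate the resulting performance.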
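To make the bagging idea in this abstract concrete, here is a minimal sketch that trains one-feature threshold classifiers (decision stumps) on bootstrap samples and combines them by majority vote. It is not the authors' method and it is not Xgboost; the helper names (`fit_stump`, `bagging_fit`, `bagging_predict`) and the signal-strength values are hypothetical.

```python
import random
from collections import Counter

def fit_stump(xs, ys):
    """Fit a threshold classifier: predict hi_label if x > t, else lo_label."""
    best = None
    for t in sorted(set(xs)):
        lo = [y for x, y in zip(xs, ys) if x <= t]
        hi = [y for x, y in zip(xs, ys) if x > t]
        if not lo or not hi:
            continue
        lo_label = Counter(lo).most_common(1)[0][0]
        hi_label = Counter(hi).most_common(1)[0][0]
        errors = sum(y != lo_label for y in lo) + sum(y != hi_label for y in hi)
        if best is None or errors < best[0]:
            best = (errors, t, lo_label, hi_label)
    if best is None:  # the sample held a single distinct x value
        label = Counter(ys).most_common(1)[0][0]
        return (max(xs), label, label)
    return best[1:]

def predict_stump(stump, x):
    t, lo_label, hi_label = stump
    return hi_label if x > t else lo_label

def bagging_fit(xs, ys, n_models=25, seed=0):
    """Train one stump per bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(len(xs)) for _ in xs]
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models

def bagging_predict(models, x):
    """Majority vote over all stumps."""
    votes = Counter(predict_stump(m, x) for m in models)
    return votes.most_common(1)[0][0]

# Hypothetical Wi-Fi signal strengths (dBm), each labelled with a room.
rssi = [-80, -78, -75, -72, -55, -52, -50, -48]
room = ["B", "B", "B", "B", "A", "A", "A", "A"]
models = bagging_fit(rssi, room)
```

Calling `bagging_predict(models, -85)` asks the ensemble for the room of a weak signal. Boosting differs in that each new model is fitted with more weight on the examples the previous models got wrong, rather than on an independent resample.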

Industrial Waste Database Analysis Using Data Mining Techniques

  • Cho, Kwang-Hyun;Park, Hee-Chang
    • Journal of the Korean Data and Information Science Society / Vol. 17, No. 2 / pp. 455-465 / 2006
  • Data mining is a method for finding useful information in large amounts of data in a database. It is used to discover hidden knowledge, unexpected patterns, and new rules in massive data. Data mining methods include decision trees, association rules, clustering, neural networks, and so on. We analyze an industrial waste database using data mining techniques: the k-means algorithm for clustering, the C5.0 algorithm for decision trees, and the Apriori algorithm for association rules. These outputs can be used for environmental preservation and improvement.

Industrial Waste Database Analysis Using Data Mining

  • 조광현;박희창
    • Proceedings of the Korean Data and Information Science Society Conference / Korean Data and Information Science Society, 2006 Proceedings of Joint Conference of KDISS and KDAS / pp. 241-251 / 2006
  • Data mining is a method for finding useful information in large amounts of data in a database. It is used to discover hidden knowledge, unexpected patterns, and new rules in massive data. Data mining methods include decision trees, association rules, clustering, neural networks, and so on. We analyze an industrial waste database using data mining techniques: the k-means algorithm for clustering, the C5.0 algorithm for decision trees, and the Apriori algorithm for association rules. These analysis outputs can be used for environmental preservation and improvement.

A Study on the Exploration of Factors Influencing Media Device Addiction in Third Grade Students: Application of Decision Tree Analysis Method

  • 이경진;권연희;황아람
    • Korean Journal of Childcare and Education / Vol. 18, No. 5 / pp. 79-99 / 2022
  • Objective: This study examined the significant factors affecting media device addiction by applying a data mining technique to large-scale data from the Panel Study on Korean Children (PSKC). The data came from the 10th PSKC survey and covered 1,286 third-grade elementary school students. Methods: The SPSS 21.0 program was used for the data mining decision tree analysis. Results: First, the most important predictor of media device addiction was planning-organization, one of the sub-factors of executive function. Second, according to the decision tree analysis, the children with the highest probability of addiction to media devices were those who had difficulties in planning and organizing, had mothers with a permissive parenting attitude, had difficulties in controlling their behavior, and were alone at home for more than two hours a day without any adult supervision. Conclusion/Implications: By exploring and analyzing the factors that significantly affect children's addiction to media devices, the results of this study can help guide the direction of future research on the topic.

Selection of an Optimal Algorithm among Decision Tree Techniques for Feature Analysis of Industrial Accidents in Construction Industries

  • 임영문;최요한
    • Journal of the Korea Safety Management & Science / Vol. 7, No. 5 / pp. 1-8 / 2005
  • Rapid industrial advancement, diversified types of business, and unexpected industrial accidents have caused great human and material damage to many unspecified persons. Although various previous studies have sought to prevent industrial accidents, they only provide managerial and educational policies based on frequency analysis and comparative analysis of past accident data. The main objective of this study is to find an optimal algorithm for analyzing industrial accident data, and to that end the paper provides a comparative analysis of four algorithms: CHAID, CART, C4.5, and QUEST. The decision tree algorithm, a typical data mining technique, is used to predict results from objective, quantified data. Enterprise Miner of SAS and AnswerTree of SPSS are used to evaluate the validity of the results of the four algorithms. The sample was chosen from 19,574 records related to the construction industry in Korea over three years (2002-2004).

Analysis of the Characteristics of the Older Adults with Depression Using Data Mining Decision Tree Analysis

  • 박명화;최소라;신아미;구철회
    • Journal of Korean Academy of Nursing / Vol. 43, No. 1 / pp. 1-10 / 2013
  • Purpose: The purpose of this study was to develop a prediction model for the characteristics of older adults with depression using the decision tree method. Methods: A large dataset from the 2008 Korean Elderly Survey was used, and data from 14,970 older adults were analyzed. The target variable was depression, and the 53 input variables covered general characteristics, family and social relationships, economic status, health status, health behavior, functional status, leisure and social activity, quality of life, and living environment. Data were analyzed by decision tree analysis, a data mining technique, using the SPSS Window 19.0 and Clementine 12.0 programs. Results: The decision trees were classified into five different rules to define the characteristics of older adults with depression. Classification & Regression Tree (C&RT) showed the best prediction among the data mining models, with an accuracy of 80.81%. Factors in the rules were life satisfaction, nutritional status, daily activity difficulty due to pain, functional limitation for basic or instrumental daily activities, number of chronic diseases, and daily activity difficulty due to disease. Conclusion: The different rules classified by the decision tree model in this study should contribute as baseline data for discovering informative knowledge and developing interventions tailored to these individual characteristics.

Predictors of intentional intoxication using decision tree modeling analysis: a retrospective study

  • Oh, Eun Seok;Choi, Jae Hyung;Lee, Jung Won;Park, Su Yeon
    • Clinical and Experimental Emergency Medicine / Vol. 5, No. 4 / pp. 230-239 / 2018
  • Objective: The suicide rate in South Korea is very high and is expected to increase in coming years. Intoxication is the most common suicide attempt method as well as one of the common reasons for presenting to an emergency medical center. We used decision tree modeling analysis to identify predictors of risk for suicide by intentional intoxication. Methods: A single-center, retrospective study was conducted at our hospital using a 4-year registry of the institute from January 1, 2013 to December 31, 2016. Demographic factors, such as sex, age, intentionality, therapeutic adherence, alcohol consumption, smoking status, physical disease, cancer, and psychiatric disease, and toxicological factors, such as type of intoxicant and poisoning severity score, were collected. Candidate risk factors based on the decision tree were used to select variables for multiple logistic regression analysis. Results: In total, 4,023 patients with intoxication were enrolled as study participants, with 2,247 (55.9%) identified as cases of intentional intoxication. Reported annual percentages of intentional intoxication among patients were 628/937 (67.0%), 608/1,082 (56.2%), 536/1,017 (52.7%), and 475/987 (48.1%) from 2013 to 2016. Significant predictors identified based on decision tree analysis were alcohol consumption, old age, psychiatric disease, smoking, and male sex; those identified based on multiple regression analysis were alcohol consumption, smoking, male sex, psychiatric disease, old age, poor therapeutic adherence, and physical disease. Conclusion: We identified important predictors of suicide risk by intentional intoxication. A specific and realistic approach to analysis using the decision tree modeling technique is an effective method to determine those groups at risk of suicide by intentional intoxication.
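The yearly figures in this abstract can be cross-checked with a few lines of arithmetic. The sketch below recomputes each percentage and the overall share from the counts quoted above; the variable names are illustrative, and mapping the four figures to 2013 through 2016 in order is an assumption based on the phrase "from 2013 to 2016".

```python
# Yearly counts from the abstract: (intentional cases, all intoxication patients).
yearly = {2013: (628, 937), 2014: (608, 1082), 2015: (536, 1017), 2016: (475, 987)}

def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

rates = {year: pct(k, n) for year, (k, n) in yearly.items()}
total_intentional = sum(k for k, _ in yearly.values())
total_patients = sum(n for _, n in yearly.values())
overall = pct(total_intentional, total_patients)
```

The yearly counts add up to the reported totals of 2,247 intentional cases among 4,023 patients, and the recomputed rates match the percentages given in the abstract.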

Incomplete data handling technique using decision trees

  • 이종찬
    • Journal of the Korea Convergence Society / Vol. 12, No. 8 / pp. 39-45 / 2021
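  • This paper discusses how to handle incomplete data containing missing values. Handling missing values optimally means deriving, from the information in the training data, the estimate closest to the original value and substituting it for the missing value. To achieve this, we use the decision tree that is built up as a classifier classifies the information. In other words, this decision tree is obtained by training a C4.5 classifier on only the complete records in the training data, that is, those without missing values. The nodes of this decision tree hold information about the classification variables: the closer a node is to the root, the more information it contains, and each leaf node forms a classification region through its path from the root. In addition, the mean of the data events classified into each region is recorded. Events containing missing values are fed into this decision tree and, traversing it according to the information at each node, arrive at the region closest to the event. The mean recorded in that region is taken as the estimate of the missing value, which completes the compensation process.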
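The procedure described in this abstract can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the author's implementation: the tree is grown with plain entropy-based splits rather than C4.5's gain ratio, and when a node tests a feature that is missing, the sketch follows both branches and keeps the leaf closest on the observed features. The abstract does not specify either detail. All names (`build`, `leaves_for`, `impute`) and the toy data are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def leaf(rows):
    """A leaf records the per-feature means of the complete rows it holds."""
    return {"means": [sum(col) / len(col) for col in zip(*rows)]}

def build(rows, labels, depth=0, max_depth=3):
    """Grow a small tree on complete rows only."""
    if depth == max_depth or len(set(labels)) == 1:
        return leaf(rows)
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows})[:-1]:  # keep both sides non-empty
            left = [i for i, r in enumerate(rows) if r[f] <= t]
            right = [i for i, r in enumerate(rows) if r[f] > t]
            score = sum(len(s) * entropy([labels[i] for i in s]) for s in (left, right))
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:  # no feature has two distinct values
        return leaf(rows)
    _, f, t, left, right = best
    return {"feature": f, "threshold": t,
            "left": build([rows[i] for i in left], [labels[i] for i in left], depth + 1, max_depth),
            "right": build([rows[i] for i in right], [labels[i] for i in right], depth + 1, max_depth)}

def leaves_for(node, row):
    """Follow the tree; where the tested value is missing, follow both branches."""
    if "means" in node:
        return [node["means"]]
    v = row[node["feature"]]
    if v is None:
        return leaves_for(node["left"], row) + leaves_for(node["right"], row)
    return leaves_for(node["left"] if v <= node["threshold"] else node["right"], row)

def impute(tree, row):
    """Replace None entries with the means of the closest reachable leaf."""
    def dist(means):
        return sum((v - m) ** 2 for v, m in zip(row, means) if v is not None)
    means = min(leaves_for(tree, row), key=dist)
    return [m if v is None else v for v, m in zip(row, means)]

# Hypothetical complete training records and their classes.
rows = [[1.0, 10.0], [2.0, 12.0], [3.0, 11.0], [8.0, 30.0], [9.0, 32.0], [10.0, 31.0]]
labels = ["low", "low", "low", "high", "high", "high"]
tree = build(rows, labels)
filled = impute(tree, [None, 29.0])
```

Here the record with a missing first feature lands in the "high" region, so the missing value is replaced by that region's mean for the first feature.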

Adaptive Decision Algorithm for an Improvement of RFID Anti-Collision

  • 고영은;오경욱;방성일
    • Journal of the Institute of Electronics Engineers of Korea TC / Vol. 44, No. 4 / pp. 1-9 / 2007
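  • This paper studies an Adaptive Decision algorithm for RFID tag anti-collision. To this end, we first compared and analyzed the existing RFID tag anti-collision techniques: ALOHA-based schemes and algorithms based on binary-search anti-collision. Existing algorithms were limited in how far they could reduce the number of searches needed to identify tags and the amount of data transmitted. To distinguish the tags within the recognition range, the proposed Adaptive Decision algorithm counts the number of '1's at each ID bit position across all tags that responded to the query, and identifies the tags in the group with the smaller count first. The number of '1's at each tag ID bit is stored in the reader's memory, and the '1' counts of each identified tag's ID bits are subtracted from it. Repeating this process identifies all tags within the recognition range. The active tag selection criterion and the simple addition and subtraction process proposed in the paper reduce unnecessary searches. Performance was evaluated in terms of the number of reader iterations needed to identify the tags and the amount of transmitted data. The evaluation showed that, compared with existing algorithms, the Adaptive Decision algorithm reduced the number of iterations by 16.8% and also reduced the amount of transmitted data by ¼.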
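The bookkeeping described in this abstract (counting '1's per ID bit, choosing the smaller group, then subtracting an identified tag from the stored counts) can be illustrated as below. The abstract does not spell out how groups are formed or how the reader queries them, so this sketch assumes tags are split by their value at a single bit position. The function names and the 4-bit tag IDs are hypothetical, and the sketch is not the full anti-collision protocol.

```python
def ones_per_bit(tag_ids):
    """Count how many responding tags have a '1' at each ID bit position."""
    width = len(tag_ids[0])
    return [sum(tag[i] == "1" for tag in tag_ids) for i in range(width)]

def subtract_tag(counts, tag_id):
    """Remove an identified tag's bits from the stored per-bit counts."""
    return [c - (bit == "1") for c, bit in zip(counts, tag_id)]

def smaller_group(tag_ids, bit):
    """Split tags by their value at one bit position and return the smaller group."""
    ones = [t for t in tag_ids if t[bit] == "1"]
    zeros = [t for t in tag_ids if t[bit] == "0"]
    return ones if len(ones) <= len(zeros) else zeros

# Hypothetical 4-bit tag IDs inside the reader's range.
tags = ["0011", "0101", "1101", "0110"]
counts = ones_per_bit(tags)
first = smaller_group(tags, 0)
remaining_counts = subtract_tag(counts, first[0])
```

After the single tag in the smaller group is identified, subtracting its bits leaves exactly the per-bit counts of the three tags still to be read, so the reader does not need to recount from scratch.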