• Title/Summary/Keyword: decision tree algorithm (의사결정나무알고리즘)

Analyzing vocational outcomes of people with hearing impairments : A data mining approach (청각장애인의 취업결정요인 분석 연구 -데이터마이닝 기법(Exhaustive CHAID)의 적용)

  • Shin, Hyun-Uk
    • Journal of Digital Convergence / v.13 no.11 / pp.449-459 / 2015
  • The purpose of this study was to examine the demographic, human capital and service factors affecting the employment outcomes of people with hearing impairments. Data on a total of 422 individuals with hearing impairments (aged 20 to 65 years) were drawn from the Panel Survey of Employment for the Disabled conducted by the Korea Employment Agency for the Disabled. The dependent variable was employment outcome. The predictor variables included a set of personal history, human capital and rehabilitation service variables. The chi-squared automatic interaction detector (CHAID) analysis revealed that national basic livelihood security status played a determining role in predicting the employment of people with hearing impairments. It was also found that three factors (national basic livelihood security status, need for help with activities of daily living, and licenses and employment services) had a greater synergistic effect when they complemented one another.
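
CHAID ranks candidate predictors by a chi-squared test of independence between each predictor and the outcome. The following is a minimal sketch of that ranking step only, not the study's procedure: the predictor names and counts are hypothetical, and full CHAID additionally merges categories and compares adjusted p-values rather than raw statistics. With equal table sizes, as here, a larger statistic corresponds to a smaller p-value.

```python
def chi_squared(table):
    """Pearson chi-squared statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of predictor and outcome.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical counts: rows are predictor categories,
# columns are (employed, not employed). Not data from the study.
candidates = {
    "predictor_a": [[30, 10], [10, 30]],
    "predictor_b": [[22, 18], [18, 22]],
}

scores = {name: chi_squared(table) for name, table in candidates.items()}
# Both tables are 2x2, so the largest statistic marks the strongest association.
best = max(scores, key=scores.get)
print(scores, best)
```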

Development of Land Compensation Cost Estimation Model : The Use of the Construction CALS Data and Linked Open Data (토지 보상비 추정 모델 개발 - 건설CALS데이터와 공공데이터 중심으로)

  • Lee, Sang-Gyu;Kim, Jin-Wook;Seo, Myeong-Bae
    • Proceedings of the Korean Society of Computer Information Conference / 2020.07a / pp.375-378 / 2020
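  • To develop a land compensation cost estimation model, this study used three types of data: internal data from the construction CALS (Continuous Acquisition & Life-cycle Support) system; external data such as officially assessed individual land prices and officially assessed standard land prices; and data generated from the officially assessed individual land prices to refine the developed model. To analyze these three types of data, the study adopted XGBoost, a decision-tree-based algorithm that is very useful for removing the overfitting errors of conventional linear or decision tree models, as the analysis method for developing the estimation model. After hyperparameter tuning was applied to refine the XGBoost algorithm, the MAPE (Mean Absolute Percentage Error) between the actual compensation costs and the model's estimates was found to be 19.5%.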
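
For reference, MAPE averages the absolute error of each estimate relative to the actual value. A minimal sketch of the standard definition follows; the cost figures are hypothetical and are not taken from the study.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, returned in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    # Each term is |actual - predicted| / |actual|; actual values must be non-zero.
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)

# Hypothetical compensation costs in arbitrary units (illustration only).
actual_costs = [100.0, 200.0, 400.0]
estimated_costs = [110.0, 180.0, 400.0]

mape_value = mape(actual_costs, estimated_costs)
print(f"MAPE = {mape_value:.2f}%")
```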

Effective Diagnostic Method Of Breast Cancer Data Using Decision Tree (Decision Tree를 이용한 효과적인 유방암 진단)

  • Jung, Yong-Gyu;Lee, Seung-Ho;Sung, Ho-Joong
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.10 no.5 / pp.57-62 / 2010
  • Recently, decision tree techniques have been studied in the medical field for quickly searching and extracting information from massive data. Although many techniques have been developed, such as CART, C4.5 and CHAID, which belong to the same family of decision tree classification algorithms, these methods can put the remaining data at risk because of the binary method used during the procedure. In brief, the C4.5 method builds a decision tree by entropy levels. In contrast, the CART method does so by an entropy matrix over categorical or continuous data. We therefore compared the C4.5 and CART methods, which belong to the same family, on breast cancer data to evaluate their respective performance. To confirm the accuracy of the results, we performed cross-validation.
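
As background for the comparison, the sketch below computes Shannon entropy and information gain, the basis of C4.5-style split selection, alongside the Gini impurity that CART implementations commonly use. It is an illustration under assumed toy labels, not the authors' code; C4.5 itself normalizes the gain into a gain ratio, which is omitted here.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def information_gain(parent, children):
    """Entropy of the parent minus the size-weighted entropy of the children."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted

# Hypothetical labels for one candidate split (illustration only).
parent = ["benign"] * 4 + ["malignant"] * 4
left = ["benign"] * 3 + ["malignant"] * 1
right = ["benign"] * 1 + ["malignant"] * 3

gain = information_gain(parent, [left, right])
print(entropy(parent), gini(parent), gain)
```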

Development of a Game Content Based on Metaverse Providing Decision Tree Algorithm Education for Middle School Students (중학생을 위한 의사결정나무 알고리즘 교육을 제공하는 메타버스 기반 게임 콘텐츠 개발)

  • Hyun, Subin;Kim, Yujin;Park, Chan Jung
    • The Journal of the Korea Contents Association / v.22 no.4 / pp.106-117 / 2022
  • In 2021, AI basics were introduced into the high school curriculum. There are many concerns that the introduction of artificial intelligence education will repeat the problem that occurred when ICT was applied to education in the past: education oriented toward utilization rather than principles. Most existing AI education platforms focus only on the use of AI. In artificial intelligence education for middle school students, it is difficult to teach the process by which artificial intelligence derives results and the principles of artificial intelligence algorithms. Recently, as the educational application of the metaverse has become a hot topic, research has begun on improving learning achievement by arousing students' immersion and interest. This research developed educational game content about the decision tree algorithm, using the metaverse, that can be used in middle school AI education. Applying games to education was intended to increase students' interest and immersion in artificial intelligence and to increase educational effectiveness. In this paper, pre-service teachers evaluated the educational effectiveness, difficulty, and level of interest of the developed game content. Based on the results, a future principle-oriented artificial intelligence education method is suggested.

Extraction of Blood Velocity Using FCM and Fuzzy Decision Trees in Doppler Ultrasound Images of Brachial Artery (상완동맥 색조 도플러 초음파 영상에서 FCM과 퍼지 의사 결정 트리를 이용한 혈류 속도 추출)

  • Kim, Kwang Baek;Jung, Young Jin;Nam, Youn Man;Lee, Jae Yeol
    • Proceedings of the Korean Society of Computer Information Conference / 2019.07a / pp.19-22 / 2019
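  • The brachial artery runs along the medial side of the humerus from the shoulder to the elbow and is the vessel used when measuring blood pressure. This vessel can be torn by a fracture, or become blocked because of circulation problems. In such cases, color Doppler ultrasonography is used to check the condition of the vessel, but the criteria for judging the images differ from one operator to another. This paper therefore proposes a blood flow velocity, obtained through image processing with FCM and a fuzzy decision tree, in order to establish consistent criteria for judgment. The brachial artery is extracted from the color Doppler ultrasound image, membership degrees are extracted by a slope-based FCM algorithm and applied to fuzzy rules, grades are classified by a decision tree, and the blood flow velocity is finally extracted. To protect the patient's personal information, the personal information area is removed from the color Doppler ultrasound image to extract the ROI, and the ROI is binarized to extract the region containing the brachial artery. In the binarized ROI, a center-of-gravity line is set along the direction of blood flow in the vessel image, membership degrees are extracted from the distance between each pixel and the center-of-gravity line, and FCM is then used to select the optimal slope. The final membership degrees extracted by FCM are applied to the fuzzy rules, and a decision tree is built from the computed T-norm and the variance of the membership degrees; the terminal nodes of the tree classify each pixel. The mean membership degree of the classified data is computed for each node, and the COG (Center of Gravity) is calculated through defuzzification. Finally, this value is used to calculate the degree of influence on the blood flow velocity, and the final blood flow velocity is proposed.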
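
Two building blocks named in this abstract have standard textbook forms: the fuzzy c-means (FCM) membership of a point given its distances to the cluster centers, and center-of-gravity (COG) defuzzification as a membership-weighted mean. The sketch below shows those generic forms only; the paper's slope-based distance measure, its fuzzy rules and its tree construction are not reproduced, and all numbers are hypothetical.

```python
def fcm_memberships(distances, m=2.0):
    """Standard FCM membership of one point to each cluster center.

    distances: distance from the point to every center; m > 1 is the fuzzifier.
    """
    zero = [d == 0 for d in distances]
    if any(zero):
        # The point coincides with a center: share full membership among hits.
        hits = sum(zero)
        return [1.0 / hits if z else 0.0 for z in zero]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** power for d_k in distances)
            for d_i in distances]

def center_of_gravity(values, memberships):
    """Discrete COG defuzzification: membership-weighted mean of the values."""
    total = sum(memberships)
    if total == 0:
        raise ValueError("memberships must not all be zero")
    return sum(v * mu for v, mu in zip(values, memberships)) / total

# Hypothetical distances from one pixel to two cluster centers.
u = fcm_memberships([1.0, 3.0])

# Hypothetical grade values and their mean memberships per tree node.
cog = center_of_gravity([10.0, 20.0, 30.0], [0.2, 0.5, 0.3])
print(u, cog)
```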

The evaluation of Distributed Data Mining System using USA census Database (미국 인구통계 데이터를 이용한 분산형 데이터마이닝 시스템 성능평가)

  • Kim, Choong-Gon;Woo, Jung-Geun;Kim, Sung-Guk;Baik, Sung-Wook
    • Proceedings of the Korean Information Science Society Conference / 2007.10c / pp.191-194 / 2007
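  • This paper proposes a new decision tree algorithm suited to distributed environments and implements a distributed data mining system to verify its practicality. To evaluate the implemented system, we used a large volume of highly reliable U.S. demographic data (Census Bureau database). We tested reliability using the implemented system, and the results showed reliability similar to that of the algorithms of other systems.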

Smart Farm Expert System for Paprika using Decision Tree Technique (의사결정트리 기법을 이용한 파프리카용 스마트팜 전문가 시스템)

  • Jeong, Hye-sun;Lee, In-yong;Lim, Joong-seon
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2018.10a / pp.373-376 / 2018
  • Traditional paprika smart farm systems are often harmful to paprika growth because they are set to drive the values of several sensors toward reference values, so the system is often unable to make optimal judgments. Using decision tree techniques, the expert system for the paprika smart farm is designed as a control system with a decision-making structure similar to that of farmers, based on data generated by factors that depend on the surroundings. With current smart farm control systems, farmers must intervene in the surrounding environment, because the systems are designed to drive sensor values toward reference values set by the farmer. To mitigate this problem, this study obtains environmental data and designs controllers that apply the decision tree method. The expert system is built for complex control by selecting the most influential environmental factors before controlling the paprika smart farm equipment, and it includes the criteria farmers use in making decisions. The study predicts that, because of the interrelationships in the data and the many surrounding environmental factors that affect growth, each environmental element will serve as a standard when smart farms are built for professionals.

Developing data quality management algorithm for Hypertension Patients accompanied with Diabetes Mellitus By Data Mining (데이터 마이닝을 이용한 고혈압환자의 당뇨질환 동반에 관한 데이터 질 관리 알고리즘 개발)

  • Hwang, Kyu-Yeon;Lee, Eun-Sook;Kim, Go-Won;Hong, Sung-Ok;Park, Jong-Son;Kwak, Mi-Sook;Lee, Ye-Jin;Im, Chae-Hyuk;Park, Tae-Hyun;Park, Jong-Ho;Kang, Sung-Hong
    • Journal of Digital Convergence / v.14 no.7 / pp.309-319 / 2016
  • A data quality management algorithm is needed to improve the quality of health care data. In this study, we developed a data quality management algorithm for diabetes accompanying hypertension. To build the algorithm, we extracted hypertension patients from the 2011 and 2012 discharge damage survey data. The significant factors for diabetes in hypertension patients were gender, age, glomerular disorders in diabetes mellitus, diabetic retinopathy, diabetic polyneuropathy, and closed [percutaneous] [needle] biopsy of kidney. Based on the decision tree results, we defined outliers as groups in which the probability of a hypertension patient also having diabetes was more than 80% or not more than 20%, and found six groups with extreme values. The actual data in these outlier (extreme value) groups therefore need to be checked to improve the quality of the data.
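
The outlier rule described above can be read as a filter over the leaf nodes of a fitted decision tree. The sketch below shows one way to apply it, taking the thresholds from the abstract; the leaf names and probabilities are hypothetical, and the boundary handling (strictly above 80%, at or below 20%) follows the abstract's wording literally.

```python
def is_outlier_group(probability, low=0.20, high=0.80):
    """True when a leaf's diabetes probability is above `high` or at most `low`."""
    return probability > high or probability <= low

# Hypothetical leaf nodes with the predicted probability that a
# hypertension patient in that leaf also has diabetes (illustration only).
leaf_probabilities = {
    "leaf_1": 0.95,
    "leaf_2": 0.55,
    "leaf_3": 0.20,
    "leaf_4": 0.80,
    "leaf_5": 0.03,
}

# Leaves flagged for a manual check of the underlying records.
outlier_leaves = sorted(
    name for name, p in leaf_probabilities.items() if is_outlier_group(p)
)
print(outlier_leaves)
```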

Development of Prediction Model for Prevalence of Metabolic Syndrome Using Data Mining: Korea National Health and Nutrition Examination Study (국민건강영양조사를 활용한 대사증후군 유병 예측모형 개발을 위한 융복합 연구: 데이터마이닝을 활용하여)

  • Kim, Han-Kyoul;Choi, Keun-Ho;Lim, Sung-Won;Rhee, Hyun-Sill
    • Journal of Digital Convergence / v.14 no.2 / pp.325-332 / 2016
  • The purpose of this study is to investigate the attributes influencing the prevalence of metabolic syndrome and to develop a prediction model for metabolic syndrome in people aged 40 and over, using the Korea Health and Nutrition Examination Study 2012. We chose the attributes for the prediction model through a literature review, and applied the decision tree, logistic regression and artificial neural network data mining algorithms in Weka 3.6. As a result, socioeconomic status factors ranked higher among the input attributes than health-related factors. In addition, the prediction model using the decision tree algorithm showed the highest accuracy. This study suggests, first of all, that the prevention and management of metabolic syndrome should be approached from the perspective of both socioeconomic status and health-related factors. Also, decision tree algorithms, as reported in other research, are useful in the field of public health because their results are easy to interpret.

Development of Advanced TB Case Classification Model Using NHI Claims Data (국민건강보험 청구자료 기반의 결핵환자 분류 고도화 모형 개발)

  • Park, Il-Su;Kim, Yoo-Mi;Choi, Youn-Hee;Kim, Sung-Soo;Kim, Eun-Ju;Won, Si-Yeon;Kang, Sung-Hong
    • Journal of Digital Convergence / v.11 no.9 / pp.289-299 / 2013
  • The aim of this study was to enhance the NHI claims data-based tuberculosis (TB) classification rule of the KCDC (Korea Centers for Disease Control and Prevention) for an effective TB surveillance system. A 10% sample (8,118 cases) of the 81,199 TB cases in the 2009 NHI claims data was subjected to a medical record survey to determine whether the cases were real TB patients. The final study population was the 7,132 cases whose medical records were surveyed. The decision tree model was evaluated as the best TB patient detection model. Its main independent variables were age, the number of anti-tuberculosis drugs, type of medical institution, tuberculosis tests, prescription days, and type of TB. This model had a sensitivity of 90.6%, a PPV of 96.1%, and a correct classification rate of 93.8%, which was better than KCDC's TB detection model based on two or more NHI claims for TB and TB drugs (sensitivity of 82.6%, PPV of 95%, and correct classification rate of 80%).
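
The three measures reported here follow directly from a confusion matrix. A minimal sketch of their standard definitions is given below; the counts are hypothetical and do not reproduce the study's figures, and "correct classification rate" is taken to mean overall accuracy.

```python
def classification_metrics(tp, fp, fn, tn):
    """Sensitivity, positive predictive value and accuracy from confusion counts."""
    return {
        # Share of real TB patients that the model detects.
        "sensitivity": tp / (tp + fn),
        # Share of model-flagged cases that are real TB patients.
        "ppv": tp / (tp + fp),
        # Share of all cases classified correctly.
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }

# Hypothetical confusion-matrix counts (illustration only).
metrics = classification_metrics(tp=90, fp=5, fn=10, tn=95)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```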