• Title/Summary/Keyword: decision tree technique (의사결정 트리 기법)

An OpenPose-based Child Abuse Decision System using Surveillance Video (감시 영상을 활용한 OpenPose 기반 아동 학대 판단시스템)

  • Yoo, Hye-Rim;Lee, Bong-Hwan
    • Journal of the Korea Institute of Information and Communication Engineering / v.23 no.3 / pp.282-290 / 2019
  • Child abuse has recently occurred frequently in educational institutions such as daycare centers and kindergartens. The government therefore made CCTV installation mandatory, but inspecting the CCTV footage is not easy. In this paper, we propose a model for judging child abuse from CCTV images. Because child abuse here means physical abuse of children by adults, a model for distinguishing adults from children is needed first. The existing Haar scheme relies on frontal images to classify adults and children, whereas OpenPose can classify them regardless of whether the image is frontal or side-view. In this research, a child abuse judgment model was designed and implemented by applying the posture characteristics that adults and children show when a child is abused. Because the implemented system uses CCTV that is already installed, it can monitor child abuse in real time without additional installation, which makes a prompt response to abuse possible.

Spectral Band Selection for Detecting Fire Blight Disease in Pear Trees by Narrowband Hyperspectral Imagery (초분광 이미지를 이용한 배나무 화상병에 대한 최적 분광 밴드 선정)

  • Kang, Ye-Seong;Park, Jun-Woo;Jang, Si-Hyeong;Song, Hye-Young;Kang, Kyung-Suk;Ryu, Chan-Seok;Kim, Seong-Heon;Jun, Sae-Rom;Kang, Tae-Hwan;Kim, Gul-Hwan
    • Korean Journal of Agricultural and Forest Meteorology / v.23 no.1 / pp.15-33 / 2021
  • In this study, the possibility of discriminating fire blight (FB) infection was tested using hyperspectral imagery. The reflectance of healthy and infected leaves and branches was acquired at 5 nm full width at half maximum (FWHM) and then standardized to 10 nm, 25 nm, 50 nm, and 80 nm FWHM. The standardized samples were divided into training and test sets at ratios of 7:3, 5:5, and 3:7 to find the optimal bands for each FWHM by decision tree analysis. Classification accuracy was evaluated using overall accuracy (OA) and the kappa coefficient (KC). The hyperspectral reflectance of infected leaves and branches was significantly lower than that of healthy ones in the green, red-edge (RE), and near-infrared (NIR) regions. The bands selected for the first node were generally 750 and 800 nm; these were used to identify infection of leaves and branches, respectively. The accuracy of the classifier was higher at the 7:3 ratio. Four bands with 50 nm FWHM (450, 650, 750, and 950 nm) might be reasonable, because the difference in recalculated accuracy between 8 bands with 10 nm FWHM (440, 580, 640, 660, 680, 710, 730, and 740 nm) and the 4 bands was only 1.8% for OA and 4.1% for KC. Finally, adding two bands (550 nm and 800 nm with 25 nm FWHM) to the four bands with 50 nm FWHM is proposed to improve the usability of multispectral image sensors, so that they can serve various roles in agriculture as well as detect FB with other combinations of spectral bands.
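
The two accuracy measures named in this abstract, overall accuracy (OA) and the kappa coefficient (KC), can both be computed from a confusion matrix. The following is a minimal sketch using only the Python standard library; the function names and the confusion-matrix counts are hypothetical illustrations, not values from the paper.

```python
def overall_accuracy(cm):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def kappa_coefficient(cm):
    """Cohen's kappa: agreement beyond what chance alone would give."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    p_observed = overall_accuracy(cm)
    # Chance agreement from row (actual) and column (predicted) totals.
    p_expected = sum(
        sum(cm[i]) * sum(cm[r][i] for r in range(n)) for i in range(n)
    ) / (total * total)
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical counts: rows = actual (healthy, infected), columns = predicted.
cm = [[45, 5],
      [10, 40]]
oa = overall_accuracy(cm)
kc = kappa_coefficient(cm)
print(f"OA = {oa:.3f}, KC = {kc:.3f}")
```

KC is usually lower than OA on the same matrix because it discounts the agreement a random classifier would reach, which may be why the abstract reports a larger drop in KC than in OA when the band count is reduced.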

Classification of Very High Concerns HRCT Images using Extended Bayesian Networks (확장 베이지안망을 적용한 고위험성 HRCT 영상 분류)

  • Lim, Chae-Gyun;Jung, Yong-Gyu
    • Journal of the Institute of Electronics Engineers of Korea CI / v.49 no.2 / pp.7-12 / 2012
  • Recently, the medical field has been investigating the application of various data mining techniques, including decision trees, neural networks, and Bayesian networks, to process vast amounts of information efficiently. It has also become common to improve diagnostic accuracy by collecting and using additional information such as MRI and HRCT images, beyond basic personal information, patient history, and family history. In real-world situations, however, many variables affect the results, so the information obtainable through any one data mining technique is fairly limited. Medical images can also contribute considerably to diagnosis, but they increase the share of subjective judgment, which makes them difficult for automated systems to handle. To deal with this complex reality, extended models have been proposed that build on Bayesian networks, a multivariate probabilistic model relatively well suited to such situations, by improving the search algorithm, for example with K2 or TAN. Because the performance of an extended Bayesian network depends heavily on the type of search algorithm applied, the performance and suitability of each technique need to be evaluated. In this paper, we apply extended Bayesian networks to disease diagnosis on the same data and measure how classification accuracy changes with the search algorithm, such as K2 and TAN. Performance was compared using 10-fold cross-validation, and the analysis indicates that high-risk data can be identified when classifying HRCT images of patients at high risk of disease onset.
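
The 10-fold cross-validation mentioned in this abstract splits the data into ten folds and tests on each fold once while training on the other nine. A minimal sketch of the index bookkeeping follows; the function name `k_fold_indices` and the sample count are hypothetical, and the sketch does not shuffle or stratify the data, which a real experiment would normally do.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


# Hypothetical dataset of 25 samples split into 10 folds.
folds = list(k_fold_indices(25, k=10))
for fold_no, (train_idx, test_idx) in enumerate(folds, start=1):
    print(fold_no, len(train_idx), len(test_idx))
```

The per-fold accuracies would then be averaged to compare search algorithms on equal footing.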

Managing the Reverse Extrapolation Model of Radar Threats Based Upon an Incremental Machine Learning Technique (점진적 기계학습 기반의 레이더 위협체 역추정 모델 생성 및 갱신)

  • Kim, Chulpyo;Noh, Sanguk
    • The Journal of Korean Institute of Next Generation Computing / v.13 no.4 / pp.29-39 / 2017
  • Various electronic warfare situations drive the need for an integrated electronic warfare simulator that can model and simulate radar threats. In this paper, we analyze the components of a simulation system that reversely models radar threats emitting electromagnetic signals from the parameters of electronic information, and we propose a method for incrementally maintaining the reverse extrapolation model of RF threats. In the experiments, we evaluate the effectiveness of the incremental model update and assess methods for integrating the reverse extrapolation models. The individual models of RF threats are constructed using a decision tree, a naive Bayesian classifier, an artificial neural network, and clustering algorithms based on Euclidean distance and cosine similarity, respectively. Experimental results show that the accuracy of the reverse extrapolation models improves as the size of the threat sample increases. In addition, we use voting, weighted voting, and the Dempster-Shafer algorithm to integrate the results of the five different models of RF threats. The final reverse extrapolation decision made with the Dempster-Shafer algorithm shows the best accuracy.
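
The integration step described in this abstract can be illustrated with a small sketch of weighted voting and Dempster's rule of combination. This is a generic textbook formulation, not the authors' implementation; the threat labels `T1` and `T2`, the weights, and the mass values are hypothetical.

```python
from itertools import product


def weighted_vote(predictions, weights):
    """Return the label with the largest total weight (plain voting: all weights 1)."""
    scores = {}
    for label, weight in zip(predictions, weights):
        scores[label] = scores.get(label, 0.0) + weight
    return max(scores, key=scores.get)


def dempster_combine(m1, m2):
    """Combine two mass functions (dict: frozenset of labels -> mass)."""
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            combined[common] = combined.get(common, 0.0) + mass_a * mass_b
        else:
            conflict += mass_a * mass_b  # mass on contradictory hypotheses
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # Renormalize so the remaining masses sum to 1.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


# Hypothetical outputs of two models; THETA means "either threat".
T1, T2 = frozenset({"T1"}), frozenset({"T2"})
THETA = T1 | T2
m_tree = {T1: 0.7, THETA: 0.3}
m_bayes = {T1: 0.6, T2: 0.3, THETA: 0.1}
fused = dempster_combine(m_tree, m_bayes)
print({"/".join(sorted(k)): round(v, 3) for k, v in fused.items()})
print(weighted_vote(["T1", "T2", "T2"], [0.9, 0.3, 0.4]))
```

Unlike voting, Dempster's rule lets a model assign mass to "unknown" (the full label set), so an uncertain model dilutes the result less than a confidently wrong one would.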

Estimation of River Flow Data Using Machine Learning (머신러닝 기법을 이용한 유량 자료 생산 방법)

  • Kang, Noel;Lee, Ji Hun;Lee, Jung Hoon;Lee, Chungdae
    • Proceedings of the Korea Water Resources Association Conference / 2020.06a / pp.261-261 / 2020
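  • Developing an accurate stage-discharge rating curve is essential for securing the continuous flow data that underpin water management. The rating curve is the basis for the design of all hydraulic structures and is also important for responding to water-related disasters such as floods and droughts. In general, however, discharge measurement is costly and time-consuming, and as control characteristics change through vegetation growth, cross-section change, and similar factors, nonlinear behavior such as segmentation by stage range and by period appears, which makes the data difficult to interpret. Domestic rivers in particular undergo a wide variety of natural and artificial environmental changes, so detailed analysis by station and by period is required. Machine learning refers to the process by which a computer learns from data on its own, builds a model, and improves its performance. Whereas a conventional rating curve derives the parameters of a regression equation from data types and periods chosen at the developer's discretion, machine learning can learn from all valid data, find correlations among the data, build a model, and keep improving its performance. Provided that sufficient hydrological data are available, machine learning has the advantage of reflecting a complex and variable water-resources environment and continuously improving the accuracy of discharge estimation. This study built discharge estimation models using representative machine learning algorithms and compared their performance. The study site is the Geoun Bridge (Geoungyo) station in the Han River basin, which has a stable flow, and the data used are time, stage, discharge, water-surface width, and related variables for 2010-2018. The models were implemented with scikit-learn (sklearn), a Python-based machine learning library, using random forest regression, decision tree, KNN (K-Nearest Neighbor), and rgboost. The training data were combined by input type into six sets to build the models, which were then evaluated on the test data using RMSE (Root Mean Square Error). The resulting values ranged fairly widely, from 3.67 to 171.46, depending on the combination of model and input data. The best case combined three inputs (stage, year, and water-surface width) with the random forest regression model. For comparison, discharge for the same test data was estimated using the stage-specific rating curves for the Geoun Bridge station in the Korea hydrological survey yearbook (2018), giving an RMSE of 3.76, which confirms that machine learning performs at a level comparable to a segmented rating curve. This is a basic study examining the applicability of machine learning techniques to existing hydrological data for producing high-quality flow data, and it is expected to help make hydrological measurement and rating-curve derivation in Korea more efficient. In future work, input data such as meteorological data and water intake volumes need to be applied to identify the variables that affect the water-resources environment and control characteristics, and further research on more sophisticated models such as deep learning should also be carried out.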
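
The comparison in this abstract ranks models by RMSE (Root Mean Square Error) on held-out test data. A minimal sketch of that metric and of picking the lowest-error candidate follows; the discharge values and the candidate names are hypothetical and are not the study's data.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(observed) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical observed discharges and two sets of estimates.
observed = [10.0, 20.0, 30.0]
candidates = {
    "random_forest": [12.0, 18.0, 33.0],
    "rating_curve": [11.0, 24.0, 27.0],
}
scores = {name: rmse(observed, est) for name, est in candidates.items()}
best = min(scores, key=scores.get)  # lower RMSE is better
print(scores, best)
```

Because RMSE carries the unit of the target variable, scores are only comparable when every model is evaluated on the same test set, as the abstract describes.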

Prediction of golf scores on the PGA tour using statistical models (PGA 투어의 골프 스코어 예측 및 분석)

  • Lim, Jungeun;Lim, Youngin;Song, Jongwoo
    • The Korean Journal of Applied Statistics / v.30 no.1 / pp.41-55 / 2017
  • This study predicts the average scores of the top 150 PGA golf players in 132 PGA Tour tournaments (2013-2015) using data mining techniques and statistical analysis. It also aims to predict the Top 10 and Top 25 players in 4 different playoffs. Linear and nonlinear regression methods were used to predict average scores. The linear methods were stepwise regression, best subset selection, LASSO, ridge regression, and principal component regression. The nonlinear methods were tree, bagging, gradient boosting, neural network, random forests, and KNN. We found that the average score increases as fairway firmness, green height, or average maximum wind speed increases, and decreases as the number of one-putts, the scrambling variable, or the longest driving distance increases. All 11 models have low prediction error when predicting the average scores of the 2015 PGA tournaments, which were not included in the training set. However, the bagging and random forest models perform best among all models, and these two models have the highest prediction accuracy when predicting the Top 10 and Top 25 players in the 4 playoffs.

A Study on the Classification of Unstructured Data through Morpheme Analysis

  • Kim, SungJin;Choi, NakJin;Lee, JunDong
    • Journal of the Korea Society of Computer and Information / v.26 no.4 / pp.105-112 / 2021
  • In the era of big data, interest in data is exploding. In particular, the development of the Internet and social media has led to the creation of new data, enabling the era of big data and artificial intelligence and opening a new chapter in convergence technology. There is also growing demand for analyzing data that programs could not handle in the past. In this paper, an analysis model was designed and verified for the classification of unstructured data, which is often required in the era of big data. Thesis abstracts, main keywords, and sub-keywords were crawled from DBPia, a database was created using KoNLP's data dictionary, and words were tokenized through morpheme analysis. In addition, nouns were extracted using KAIST's 9 part-of-speech classification system, TF-IDF values were generated, and an analysis dataset was created by combining the training data with the Y values. Finally, the adequacy of classification was measured by applying three analysis algorithms (random forest, SVM, decision tree) to the generated analysis dataset. The classification model proposed in this paper can be useful in various fields, such as civil complaint classification and other text-related analysis, in addition to thesis classification.
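
The TF-IDF step in this abstract turns each document's extracted nouns into numeric features for the classifiers. The sketch below uses the plain textbook weighting (term frequency times log of inverse document frequency) with the standard library only; the paper does not state which variant it used, libraries often apply smoothed versions, and the token lists here are hypothetical stand-ins for nouns already extracted by morpheme analysis.

```python
import math
from collections import Counter


def tf_idf(docs):
    """docs: list of token lists. Returns one {token: weight} dict per document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each token appears.
    doc_freq = Counter(token for doc in docs for token in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            token: (count / len(doc)) * math.log(n_docs / doc_freq[token])
            for token, count in counts.items()
        })
    return weights


# Hypothetical noun lists standing in for tokenized thesis abstracts.
docs = [
    ["decision", "tree", "model"],
    ["random", "forest", "model"],
    ["text", "mining", "model"],
]
weights = tf_idf(docs)
print(weights[0])
```

A token that appears in every document, such as "model" here, receives weight 0 and so contributes nothing to separating the classes.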