• Title/Summary/Keyword: Decision Tree

A binary adaptive arithmetic coding algorithm based on adaptive symbol changes for lossless medical image compression (무손실 의료 영상 압축을 위한 적응적 심볼 교환에 기반을 둔 이진 적응 산술 부호화 방법)

  • 지창우;박성한
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.22 no.12
    • /
    • pp.2714-2726
    • /
    • 1997
  • This paper presents a medical image compression method based on adaptive symbol changes. First, the differential image domain is obtained by applying differentiation rules or adaptive predictors to the original medical image, and the algorithm determines the context associated with the differential image from this domain. Then the prediction symbols, which are taken to be the most probable differential image values, are maintained at a high value through the adaptive symbol change procedure. This procedure is based on estimates of the symbols with polarity coincidence between the differential image values to be coded under the context and the differential image values in the model template. At the coding step, the differential image values are encoded as "predicted" or "non-predicted" by a binary adaptive arithmetic encoder that employs a binary decision tree. Simulation results indicate that the prediction hit ratios of differential image values obtained with the proposed algorithm improve the coding gain by 25% and 23% over an arithmetic coder with the ISO JPEG lossless predictor and an arithmetic coder with differentiation rules or adaptive predictors, respectively. The method can be used in the compression part of a medical PACS because the encoder can be applied directly to full bit-plane medical images, without decomposing them into a series of binary bit-planes, and because encoder complexity is lowered by using only additions when recursively subdividing unit intervals.

Group Classification on Management Behavior of Diabetic Mellitus (당뇨 환자의 관리행태에 대한 군집 분류)

  • Kang, Sung-Hong;Choi, Soon-Ho
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.12 no.2
    • /
    • pp.765-774
    • /
    • 2011
  • The purpose of this study is to provide informative statistics that can be used for effective diabetes management programs. We collected and analyzed data on 666 people with diabetes who participated in the Korean National Health and Nutrition Examination Survey in 2007 and 2008. Group classification of diabetes mellitus management behavior was based on the K-means clustering method. The decision tree method and multiple regression analysis were used to study the factors behind management behavior. People with diabetes were broadly classified into three groups: the Health Behavior Program Group, the Focused Management Program Group, and the Complication Test Program Group. First, the Health Behavior Program Group comprises people who carry out drug therapy and complication tests well but still need to improve their health behavior, such as exercising regularly and avoiding drinking and smoking. Second, the Focused Management Program Group comprises people who are uncooperative about treatment and complication tests and are also passive about improving their health behavior. Third, the Complication Test Program Group comprises people who have a positive attitude toward treatment and improving their health behavior but pay no attention to complication tests for the early detection of acute and chronic disease. The main factor in group classification was whether or not the person had hyperlipidemia. Group membership also varied widely with an individual's gender, income, age, occupation, and self-rated health. To improve the rate of diabetes management, specialized diabetes management programs should be applied according to each group's characteristics.

Committee Learning Classifier based on Attribute Value Frequency (속성 값 빈도 기반의 전문가 다수결 분류기)

  • Lee, Chang-Hwan;Jung, In-Chul;Kwon, Young-S.
    • Journal of KIISE:Databases
    • /
    • v.37 no.4
    • /
    • pp.177-184
    • /
    • 2010
  • These days, many kinds of data, including sensor, delivery, credit, and stock data, are generated continuously and in massive quantities. It is difficult to learn from these data because they are large in volume and their underlying concepts change quickly. To handle these problems, learning methods based on sliding windows over time have been used. However, these approaches must rebuild the model every time new data arrive, which costs considerable time and resources. We therefore need very simple incremental learning methods. The Bayesian method is one example, but it has the disadvantage that it requires prior knowledge (probability) of the data. In this study, we propose a learning method based on attribute values that can be applied even when the prior knowledge (probability) of the data is unknown. The main concept of this method is that each attribute value is regarded as an expert learner, and summing up the expert learners leads to better results. Experimental results show that our method learns from data very quickly and performs well compared to current learning methods (decision tree and Bayesian).
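The idea of treating each attribute value as a voter can be illustrated with a minimal sketch. This is not the authors' algorithm: the class name `AttributeValueCommittee`, the frequency-based vote, and the tie-breaking rule are all assumptions made for illustration. It shows only the two properties the abstract stresses, incremental updates without a rebuild and no prior probabilities.

```python
from collections import defaultdict


class AttributeValueCommittee:
    """Hypothetical sketch: every (attribute, value) pair acts as one voter."""

    def __init__(self):
        # counts[(attribute_index, value)][label] -> times seen together
        self.counts = defaultdict(lambda: defaultdict(int))
        self.labels = set()

    def update(self, features, label):
        """Absorb one labelled example incrementally (no model rebuild)."""
        self.labels.add(label)
        for i, value in enumerate(features):
            self.counts[(i, value)][label] += 1

    def predict(self, features):
        """Sum the label frequencies voted by each attribute value."""
        scores = {label: 0.0 for label in self.labels}
        for i, value in enumerate(features):
            seen = self.counts.get((i, value))
            if not seen:
                continue  # unseen value: this voter abstains
            total = sum(seen.values())
            for label, n in seen.items():
                scores[label] += n / total
        # Ties are broken by label order (an arbitrary, assumed rule).
        return max(sorted(scores), key=scores.get)


clf = AttributeValueCommittee()
clf.update(("sunny", "hot"), "no")
clf.update(("sunny", "mild"), "no")
clf.update(("rainy", "mild"), "yes")
clf.update(("rainy", "cool"), "yes")
print(clf.predict(("rainy", "cool")))
```

Because each update only increments counters, new examples can be absorbed one at a time as a stream arrives.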

Dependency-based Framework of Combining Multiple Experts for Recognizing Unconstrained Handwritten Numerals (무제약 필기 숫자를 인식하기 위한 다수 인식기를 결합하는 의존관계 기반의 프레임워크)

  • Kang, Hee-Joong;Lee, Seong-Whan
    • Journal of KIISE:Software and Applications
    • /
    • v.27 no.8
    • /
    • pp.855-863
    • /
    • 2000
  • Although the Behavior-Knowledge Space (BKS) method, one of the well-known decision combination methods, needs no assumptions when combining multiple experts, in theory it must build exponentially large storage spaces to store and manage the jointly observed K decisions from K experts. That is, combining K experts requires a (K+1)st-order probability distribution. However, it is well known that storing and estimating this distribution becomes unmanageable even for a small K. To overcome this weakness, previous studies have decomposed the probability distribution into a number of component distributions and approximated it with a product of those components. One such approach applies a conditional independence assumption to the distribution. Another approximates the distribution with a product of only first-order tree dependencies or second-order distributions, as shown in [1]. This paper considers dependencies of higher order than first order in approximating the distribution. It proposes a dependency-based framework that optimally approximates the (K+1)st-order probability distribution with a product set of dth-order dependencies, where $1 \le d \le K$, and combines multiple experts based on the product set using the Bayesian formalism. The framework was tested and evaluated on the standardized CENPARMI database.
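The product approximations mentioned in the abstract can be written out as follows. The notation is an assumption introduced here, not taken from the paper: $x_1,\dots,x_{K+1}$ stand for the K expert decisions plus the class label, $(m_1,\dots,m_{K+1})$ is an unknown permutation of the indices, and conditioning on index 0 means no conditioning.

```latex
% First-order (tree) dependency approximation
P(x_1,\dots,x_{K+1}) \approx \prod_{j=1}^{K+1} P\left(x_{m_j} \mid x_{m_{i(j)}}\right),
\qquad 0 \le i(j) < j

% d-th order dependency approximation, 1 \le d \le K
P(x_1,\dots,x_{K+1}) \approx \prod_{j=1}^{K+1}
P\left(x_{m_j} \mid x_{m_{i_d(j)}},\dots,x_{m_{i_1(j)}}\right),
\qquad 0 \le i_d(j),\dots,i_1(j) < j
```

Each factor conditions on at most d earlier variables, so the tables to be stored and estimated are (d+1)-dimensional instead of (K+1)-dimensional.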

A Study on the Analysis Effect Factors of Illegal Parking Using Data Mining Techniques (데이터마이닝 기법을 활용한 불법주차 영향요인 분석)

  • Lee, Chang-Hee;Kim, Myung-Soo;Seo, So-Min
    • The Journal of The Korea Institute of Intelligent Transport Systems
    • /
    • v.13 no.4
    • /
    • pp.63-72
    • /
    • 2014
  • With rapid development in the economy and other fields, the standard of living in South Korea has improved, and the demand for automobiles has consequently increased quickly. This has led to various traffic issues such as congestion, accidents, and parking problems. In particular, illegal parking caused by the growing number of automobiles is considered one of the main causes of traffic congestion, and it intensifies disputes between neighbors over parking spaces, which has also emerged as a social issue. This study therefore examined Daejeon Metropolitan City, which is understood to have the highest automobile sharing rate in South Korea but relatively few illegal parking crackdowns. To investigate the theoretical problems of illegal parking, the study conducted a decision tree-based Exhaustive CHAID analysis to identify what makes drivers park illegally and which factors tempt them into doing so, and then proposes solutions. The analysis showed that the factors influencing a driver's decision to park illegally are, in order, distance, the driver's experience of being caught, occupation, and use time. The resulting prediction model yielded four final nodes. Based on these results, the proposed solutions are to build additional public parking lots, to secure parking space first for vehicles used for living and working, and to promote campaigns that strengthen illegal parking crackdowns and encourage civic consciousness.
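CHAID-style trees choose splits by testing the association between each categorical predictor and the outcome with a chi-square test. The sketch below computes only the Pearson chi-square statistic for one predictor; it is an illustration under assumptions, not the study's procedure. The category names are invented, and a full Exhaustive CHAID would additionally merge categories and compare Bonferroni-adjusted p-values rather than raw statistics.

```python
from collections import Counter


def chi_square_statistic(predictor, outcome):
    """Pearson chi-square statistic for two categorical sequences."""
    n = len(outcome)
    row_totals = Counter(predictor)
    col_totals = Counter(outcome)
    observed = Counter(zip(predictor, outcome))
    stat = 0.0
    for row, row_n in row_totals.items():
        for col, col_n in col_totals.items():
            expected = row_n * col_n / n  # count expected under independence
            stat += (observed[(row, col)] - expected) ** 2 / expected
    return stat


# Hypothetical toy data: distance to destination vs. parking choice.
distance = ["far", "far", "near", "near"]
parking = ["illegal", "illegal", "legal", "legal"]
print(chi_square_statistic(distance, parking))
```

A larger statistic (for the same degrees of freedom) indicates a stronger association, which is why such a predictor would be preferred as the split variable.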

Affected Model of Indoor Radon Concentrations Based on Lifestyle, Greenery Ratio, and Radon Levels in Groundwater (생활 습관, 주거지 주변 녹지 비율 및 지하수 내 라돈 농도 따른 실내 라돈 농도 영향 모델)

  • Lee, Hyun Young;Park, Ji Hyun;Lee, Cheol-Min;Kang, Dae Ryong
    • Journal of health informatics and statistics
    • /
    • v.42 no.4
    • /
    • pp.309-316
    • /
    • 2017
  • Objectives: Radon and its progeny pose environmental risks as carcinogens, especially to the lungs. Investigating the factors that affect indoor radon concentrations, and building models of them, is needed to prevent exposure to radon and to reduce indoor radon concentrations. The purpose of this study was to identify factors affecting indoor radon concentration and to construct a comprehensive model thereof. Methods: Questionnaires were administered to obtain data on residential environments, including building materials and lifestyle. Decision tree and structural equation modeling were applied to predict residences at risk for higher radon concentrations and to develop the comprehensive model. Results: Greenery ratio, impermeable layer ratio, residence at ground level, daily ventilation, long-term heating, crack around the measuring device, and bedroom were shown to be significant predictive factors of higher indoor radon concentrations. Daily ventilation reduced the probability of homes having indoor radon concentrations $\geq 200\ \mathrm{Bq/m^3}$ by 11.6%. Meanwhile, a greenery ratio $\geq 65\%$ without daily ventilation increased this probability by 15.3% compared to daily ventilation. The constructed model indicated that greenery ratio and ventilation rate directly affect indoor radon concentrations. Conclusions: Our model highlights the combined influences of geographical properties, groundwater, and lifestyle factors of an individual resident on indoor radon concentrations in Korea.

Factor analysis of cyanobacterial dominance in the four weirs installed in the Nakdong River (낙동강의 중·하류 4개보에서 남조류 우점 환경 요인 분석)

  • Kim, Sung jin;Chung, Se woong;Park, Hyung seok;Cho, Young cheol;Lee, Hee suk
    • Proceedings of the Korea Water Resources Association Conference
    • /
    • 2019.05a
    • /
    • pp.413-413
    • /
    • 2019
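  • Abnormal overgrowth of cyanobacteria in rivers and lakes (hereafter, the algal bloom problem) reduces the biodiversity of freshwater ecosystems and impairs water use by producing taste-and-odor compounds in drinking water. In addition, when toxin-producing harmful cyanobacteria proliferate massively, they can cause fatal harm to the health of livestock and humans. In Korea, algal blooms previously caused intermittent problems in stagnant waters such as dam reservoirs and estuarine lakes. Since 16 weirs were installed under the Four Major Rivers Project (2010-2011), however, blooms have occurred widely in large rivers such as the Nakdong, Geum, and Yeongsan Rivers and have emerged as an important social and environmental issue. Various explanations have been proposed for the frequent blooms in the weir sections of large rivers, including climate change accompanying the global rise in temperature, excessive nutrient inflow from the watershed, reduced flow due to drought, and increased residence time due to weir installation. In practice, a universal interpretation is difficult because the causes differ with the characteristics of the watershed and water body, or because multiple factors act in combination. Accurate cause analysis and effective countermeasures for each river system and weir therefore require a more scientific and objective approach based on intensive experimental data and data mining techniques. This study targets four weirs on the Nakdong River (Gangjeong-Goryeong, Dalseong, Hapcheon-Changnyeong, and Changnyeong-Haman weirs), where cyanobacterial blooms have occurred frequently since the weirs were installed in 2012. Its purpose is to carry out intensive field surveys and experimental analyses, and to apply statistical analysis and various data modeling techniques to the collected meteorological, hydrological, water quality, and algal data, in order to identify the environmental conditions under which cyanobacteria dominate at each weir and the main control variables for managing them. Qualitative and quantitative experiments on water quality and phytoplankton at each weir were conducted over two years, from May 2017 to November 2018. We analyzed the correlation between cyanobacterial cell density and environmental factors, and comprehensively interpreted the environmental factors of cyanobacterial dominance at each weir based on the results of various parametric and nonparametric data mining methods: Step-wise Multiple Linear Regression (SMLR), variable importance evaluation using Random Forests (RF) and Recursive Feature Elimination using Random Forest (RFE-RF), Decision Tree (DT), and Principal Component Analysis (PCA).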

Exploring Feature Selection Methods for Effective Emotion Mining (효과적 이모션마이닝을 위한 속성선택 방법에 관한 연구)

  • Eo, Kyun Sun;Lee, Kun Chang
    • Journal of Digital Convergence
    • /
    • v.17 no.3
    • /
    • pp.107-117
    • /
    • 2019
  • In the era of SNS, many people rely on it to express their emotions about various kinds of products and services. For companies eager to investigate how their products and services are perceived in the market, emotion mining tasks using datasets from SNSs have therefore become more important than ever. Basically, emotion mining is a branch of sentiment analysis that is based on BOW (bag-of-words) and TF-IDF. However, few emotion mining studies adopt feature selection (FS) methods to look for an optimal set of features that ensures better results. This study therefore proposes FS methods for conducting emotion mining tasks more effectively and with better outcomes. It uses the Twitter and SemEval2007 datasets for the emotion mining experiments. We applied three FS methods: CFS (Correlation-based FS), IG (Information Gain), and ReliefF. Emotion mining results were obtained by applying the selected features to nine classifiers. When DT (decision tree) is applied to the Twitter dataset, accuracy increases with the CFS, IG, and ReliefF methods. When LR (logistic regression) is applied to the SemEval2007 dataset, accuracy increases with the ReliefF method.
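Of the three FS methods named, Information Gain is the simplest to illustrate: it scores a feature by how much knowing its value reduces the entropy of the class labels. The following is a minimal sketch for discrete features only; the function names and the toy bag-of-words columns are assumptions for illustration and do not reproduce the study's pipeline.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus their entropy conditioned on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def select_top_k(columns, labels, k):
    """Return the names of the k features with the highest information gain."""
    ranked = sorted(columns,
                    key=lambda name: information_gain(columns[name], labels),
                    reverse=True)
    return ranked[:k]


# Hypothetical toy data: word-presence features for four short texts.
emotions = ["joy", "joy", "anger", "anger"]
columns = {"has_happy": [1, 1, 0, 0], "has_the": [1, 0, 1, 0]}
print(select_top_k(columns, emotions, 1))
```

In this toy case `has_happy` separates the two emotions perfectly (gain of 1 bit), while `has_the` carries no information about the label.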

Convergence Research on Relationships among the inhibiting factors of Dying Well (웰다잉 저해 요인의 관련성에 관한 융합 연구)

  • Lee, Chong Hyung;Ahn, Sang-Yoon;Kim, Yong-Ha;Kim, Kwang-Hwan
    • Journal of the Korea Convergence Society
    • /
    • v.10 no.8
    • /
    • pp.37-44
    • /
    • 2019
  • The purpose of this study is to determine the inhibiting factors of dying well for people who want to have a good death. The respondents were selected by stratified random sampling with proportional allocation, yielding a final sample of 1,000 adults aged between 19 and 75 years. The questionnaire consisted of four items on general characteristics and 20 items on the inhibiting factors of dying well, scored on a 7-point Likert scale. Analysis was conducted using descriptive statistics, correlation analysis, and decision tree analysis. Results showed that, among the inhibiting factors of dying well, "degenerative diseases (such as dementia)" and "loss of control (mental / physical)" scored 5.502 and 5.268 points, respectively. The highest significant positive correlation was found between "bad marital relationship" and "bad relationship with children," followed by "did not receive death education" and "lack of medical policy promotion (dying well)," and then "bad relationship with children" and "indifference of others." These findings suggest that, if death education for dying well is organized and implemented, society as a whole will make efforts to improve the perception and practice of a good death, and life and death education will be expanded.

A Study on the Development of Readmission Predictive Model (재입원 예측 모형 개발에 관한 연구)

  • Cho, Yun-Jung;Kim, Yoo-Mi;Han, Seung-Woo;Choe, Jun-Yeong;Baek, Seol-Gyeong;Kang, Sung-Hong
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.20 no.4
    • /
    • pp.435-447
    • /
    • 2019
  • To prevent unnecessary re-admission, groups with a high probability of re-admission need to be managed intensively, which requires a re-admission prediction model. Two years of discharge summary data (2016 to 2017) from one university hospital were collected to develop such a model. Re-admitted patients were defined as those who were discharged more than once during the study period. We conducted descriptive statistics and crosstab analysis to identify the characteristics of rehospitalized patients. The re-admission prediction model was developed using logistic regression, neural network, and decision tree methods, and AUC (Area Under Curve) was used for model evaluation. The logistic regression model was selected as the final re-admission prediction model because its AUC was the best, at 0.81. The main variables affecting rehospitalization in the selected logistic regression model were Residential regions, Age, CCS, Charlson Index Score, Discharge Dept., Via ER, LOS, Operation, Sex, Total payment, and Insurance. The model developed in this study has limited generalizability because it is based on two years of data from a single hospital. In the future, long-term data should be collected from various hospitals to develop a model that generalizes. Furthermore, a model that can predict unplanned re-admission needs to be developed.
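The AUC used for model evaluation can be read as the probability that a randomly chosen re-admitted patient receives a higher predicted score than a randomly chosen patient who was not re-admitted. The following is a minimal sketch of that pairwise definition; the function name and the toy labels and scores are assumptions for illustration and are unrelated to the study's data.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = re-admitted, 0 = not re-admitted.
readmitted = [1, 1, 0, 0]
predicted = [0.9, 0.4, 0.5, 0.1]
print(auc(readmitted, predicted))
```

The pairwise loop is quadratic in the number of patients, which is acceptable for a sketch; rank-based formulas give the same value more efficiently on large datasets.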