• Title/Summary/Keyword: Decision-tree model

Search Result 731, Processing Time 0.028 seconds

Key Food Selection for Assessement of Oral Health Related Quality of Life among Some Korean Elderly (일부 한국 노인 구강건강 관련 삶의 질 평가를 위한 핵심 음식 선택)

  • Hwang, Soo-Jeong
    • Journal of dental hygiene science
    • /
    • v.16 no.5
    • /
    • pp.361-369
    • /
    • 2016
  • Oral health can influence on diverse food intake, and food intake affect oral health related quality of life. The aim of this study was to select key foods to be able to represent oral health related quality of life in Korea. We used the data of 503 Korean older persons to participate in the oral health promotion programme in 2009. The low consumption or low intake foods with criteria in 2012 National Nutrition Statistics were eliminated among 30 foods of food intake ability (FIA) at first. Decision tree model, correlation analysis, factor analysis, and internal reliablity test were used for oral health related quailty of life (OHRQoL) key food selection. We selected 13 foods-hard persimmon, dried peanut, pickled radish, caramel, rib of pork, glutinous rice cake, cabbage kimchi, apple, yellow melon, boiled chicken meat, boiled fish, mandarin, noodles as OHRQoL Key Foods 13. Thirty foods of FIA and OHRQoL Key Foods 13 displayed the same pattern of variation among sociodemographic groups. In a regression model, both of 30 foods of FIA and OHRQoL Key Foods 13 influenced on oral health impact profile-14. The findings suggest that OHRQoL Key Foods 13 have good reliability and validity and be able to use in oral health survey.

A Study on the Analysis Effect Factors of Illegal Parking Using Data Mining Techniques (데이터마이닝 기법을 활용한 불법주차 영향요인 분석)

  • Lee, Chang-Hee;Kim, Myung-Soo;Seo, So-Min
    • The Journal of The Korea Institute of Intelligent Transport Systems
    • /
    • v.13 no.4
    • /
    • pp.63-72
    • /
    • 2014
  • With the rapid development in the economy and other fields as well, the standard of living in South Korea has been improved, and consequently, the demand of automobiles has quickly increased. It leads to various traffic issues such as traffic congestion, traffic accident, and parking problem. In particular, this illegal parking caused by the increase in the number of automobiles has been considered one of the main reasons to bring about traffic congestion as intensifying any dispute between neighbors in relation to a parking space, which has been also coming to the fore as a social issue. Therefore, this study looked into Daejeon Metropolitan City, the city that is understood to have the highest automobile sharing rate in South Korea but with relatively few cases of illegal parking crackdowns. In order to investigate the theoretical problems of the illegal parking, this study conducted a decision-making tree model-based Exhaustive CHAID analysis to figure out not only what makes drivers park illegally when they try to park vehicles but also those factors that would tempt the drivers into the illegal parking. The study, then, comes up with solutions to the problem. According to the analysis, in terms of the influential factors that encourage the drivers to park at some illegal areas, it was learned that these factors, the distance, a driver's experience of getting caught, the occupation and the use time in order, have an effect on the drivers' deciding to park illegally. After working on the prediction model, four nodes were finally extracted. Given the analysis result, as a solution to the illegal parking, it is necessary to establish public parking lots additionally and first secure the parking space for the vehicles used for living and working, and to activate the campaign for enhancing illegal parking crackdown and encouraging civic consciousness.

Evaluation of a Thermal Conductivity Prediction Model for Compacted Clay Based on a Machine Learning Method (기계학습법을 통한 압축 벤토나이트의 열전도도 추정 모델 평가)

  • Yoon, Seok;Bang, Hyun-Tae;Kim, Geon-Young;Jeon, Haemin
    • KSCE Journal of Civil and Environmental Engineering Research
    • /
    • v.41 no.2
    • /
    • pp.123-131
    • /
    • 2021
  • The buffer is a key component of an engineered barrier system that safeguards the disposal of high-level radioactive waste. Buffers are located between disposal canisters and host rock, and they can restrain the release of radionuclides and protect canisters from the inflow of ground water. Since considerable heat is released from a disposal canister to the surrounding buffer, the thermal conductivity of the buffer is a very important parameter in the entire disposal safety. For this reason, a lot of research has been conducted on thermal conductivity prediction models that consider various factors. In this study, the thermal conductivity of a buffer is estimated using the machine learning methods of: linear regression, decision tree, support vector machine (SVM), ensemble, Gaussian process regression (GPR), neural network, deep belief network, and genetic programming. In the results, the machine learning methods such as ensemble, genetic programming, SVM with cubic parameter, and GPR showed better performance compared with the regression model, with the ensemble with XGBoost and Gaussian process regression models showing best performance.

A Study on the Classification of Unstructured Data through Morpheme Analysis

  • Kim, SungJin;Choi, NakJin;Lee, JunDong
    • Journal of the Korea Society of Computer and Information
    • /
    • v.26 no.4
    • /
    • pp.105-112
    • /
    • 2021
  • In the era of big data, interest in data is exploding. In particular, the development of the Internet and social media has led to the creation of new data, enabling the realization of the era of big data and artificial intelligence and opening a new chapter in convergence technology. Also, in the past, there are many demands for analysis of data that could not be handled by programs. In this paper, an analysis model was designed and verified for classification of unstructured data, which is often required in the era of big data. Data crawled DBPia's thesis summary, main words, and sub-keyword, and created a database using KoNLP's data dictionary, and tokenized words through morpheme analysis. In addition, nouns were extracted using KAIST's 9 part-of-speech classification system, TF-IDF values were generated, and an analysis dataset was created by combining training data and Y values. Finally, The adequacy of classification was measured by applying three analysis algorithms(random forest, SVM, decision tree) to the generated analysis dataset. The classification model technique proposed in this paper can be usefully used in various fields such as civil complaint classification analysis and text-related analysis in addition to thesis classification.

Improving Efficiency of Food Hygiene Surveillance System by Using Machine Learning-Based Approaches (기계학습을 이용한 식품위생점검 체계의 효율성 개선 연구)

  • Cho, Sanggoo;Cho, Seung Yong
    • The Journal of Bigdata
    • /
    • v.5 no.2
    • /
    • pp.53-67
    • /
    • 2020
  • This study employees a supervised learning prediction model to detect nonconformity in advance of processed food manufacturing and processing businesses. The study was conducted according to the standard procedure of machine learning, such as definition of objective function, data preprocessing and feature engineering and model selection and evaluation. The dependent variable was set as the number of supervised inspection detections over the past five years from 2014 to 2018, and the objective function was to maximize the probability of detecting the nonconforming companies. The data was preprocessed by reflecting not only basic attributes such as revenues, operating duration, number of employees, but also the inspections track records and extraneous climate data. After applying the feature variable extraction method, the machine learning algorithm was applied to the data by deriving the company's risk, item risk, environmental risk, and past violation history as feature variables that affect the determination of nonconformity. The f1-score of the decision tree, one of ensemble models, was much higher than those of other models. Based on the results of this study, it is expected that the official food control for food safety management will be enhanced and geared into the data-evidence based management as well as scientific administrative system.

Analysis of the Impact of Satellite Remote Sensing Information on the Prediction Performance of Ungauged Basin Stream Flow Using Data-driven Models (인공위성 원격 탐사 정보가 자료 기반 모형의 미계측 유역 하천유출 예측성능에 미치는 영향 분석)

  • Seo, Jiyu;Jung, Haeun;Won, Jeongeun;Choi, Sijung;Kim, Sangdan
    • Journal of Wetlands Research
    • /
    • v.26 no.2
    • /
    • pp.147-159
    • /
    • 2024
  • Lack of streamflow observations makes model calibration difficult and limits model performance improvement. Satellite-based remote sensing products offer a new alternative as they can be actively utilized to obtain hydrological data. Recently, several studies have shown that artificial intelligence-based solutions are more appropriate than traditional conceptual and physical models. In this study, a data-driven approach combining various recurrent neural networks and decision tree-based algorithms is proposed, and the utilization of satellite remote sensing information for AI training is investigated. The satellite imagery used in this study is from MODIS and SMAP. The proposed approach is validated using publicly available data from 25 watersheds. Inspired by the traditional regionalization approach, a strategy is adopted to learn one data-driven model by integrating data from all basins, and the potential of the proposed approach is evaluated by using a leave-one-out cross-validation regionalization setting to predict streamflow from different basins with one model. The GRU + Light GBM model was found to be a suitable model combination for target basins and showed good streamflow prediction performance in ungauged basins (The average model efficiency coefficient for predicting daily streamflow in 25 ungauged basins is 0.7187) except for the period when streamflow is very small. The influence of satellite remote sensing information was found to be up to 10%, with the additional application of satellite information having a greater impact on streamflow prediction during low or dry seasons than during wet or normal seasons.

Prioritization of Species Selection Criteria for Urban Fine Dust Reduction Planting (도시 미세먼지 저감 식재를 위한 수종 선정 기준의 우선순위 도출)

  • Cho, Dong-Gil
    • Korean Journal of Environment and Ecology
    • /
    • v.33 no.4
    • /
    • pp.472-480
    • /
    • 2019
  • Selection of the plant material for planting to reduce fine dust should comprehensively consider the visual characteristics, such as the shape and texture of the plant leaves and form of bark, which affect the adsorption function of the plant. However, previous studies on reduction of fine dust through plants have focused on the absorption function rather than the adsorption function of plants and on foliage plants, which are indoor plants, rather than the outdoor plants. In particular, the criterion for selection of fine dust reduction species is not specific, so research on the selection criteria for plant materials for fine dust reduction in urban areas is needed. The purpose of this study is to identify the priorities of eight indicators that affect the fine dust reduction by using the fuzzy multi-criteria decision-making model (MCDM) and establish the tree selection criteria for the urban planting to reduce fine dust. For the purpose, we conducted a questionnaire survey of those who majored in fine dust-related academic fields and those with experience of researching fine dust. A result of the survey showed that the area of leaf and the tree species received the highest score as the factors that affect the fine dust reduction. They were followed by the surface roughness of leaves, tree height, growth rate, complexity of leaves, edge shape of leaves, and bark feature in that order. When selecting the species that have leaves with the coarse surface, it is better to select the trees with wooly, glossy, and waxy layers on the leaves. When considering the shape of the leaves, it is better to select the two-type or three-type leaves and palm-shaped leaves than the single-type leaves and to select the serrated leaves than the smooth edged leaves to increase the surface area for adsorbing fine dust in the air on the surface of the leaves. When considering the characteristics of the bark, it is better to select trees that have cork layers or show or are likely to show the bark loosening or cracks than to select those with lenticel or patterned barks. This study is significant in that it presents the priorities of the selection criteria of plant material based on the visual characteristics that affect the adsorption of fine dust for the planning of planting to reduce fine dust in the urban area. The results of this study can be used as basic data for the selection of trees for plantation planning in the urban area.

Development of 1ST-Model for 1 hour-heavy rain damage scale prediction based on AI models (1시간 호우피해 규모 예측을 위한 AI 기반의 1ST-모형 개발)

  • Lee, Joonhak;Lee, Haneul;Kang, Narae;Hwang, Seokhwan;Kim, Hung Soo;Kim, Soojun
    • Journal of Korea Water Resources Association
    • /
    • v.56 no.5
    • /
    • pp.311-323
    • /
    • 2023
  • In order to reduce disaster damage by localized heavy rains, floods, and urban inundation, it is important to know in advance whether natural disasters occur. Currently, heavy rain watch and heavy rain warning by the criteria of the Korea Meteorological Administration are being issued in Korea. However, since this one criterion is applied to the whole country, we can not clearly recognize heavy rain damage for a specific region in advance. Therefore, in this paper, we tried to reset the current criteria for a special weather report which considers the regional characteristics and to predict the damage caused by rainfall after 1 hour. The study area was selected as Gyeonggi-province, where has more frequent heavy rain damage than other regions. Then, the rainfall inducing disaster or hazard-triggering rainfall was set by utilizing hourly rainfall and heavy rain damage data, considering the local characteristics. The heavy rain damage prediction model was developed by a decision tree model and a random forest model, which are machine learning technique and by rainfall inducing disaster and rainfall data. In addition, long short-term memory and deep neural network models were used for predicting rainfall after 1 hour. The predicted rainfall by a developed prediction model was applied to the trained classification model and we predicted whether the rain damage after 1 hour will be occurred or not and we called this as 1ST-Model. The 1ST-Model can be used for preventing and preparing heavy rain disaster and it is judged to be of great contribution in reducing damage caused by heavy rain.

Comparison of Hospital Standardized Mortality Ratio Using National Hospital Discharge Injury Data (퇴원손상심층조사 자료를 이용한 의료기관 중증도 보정 사망비 비교)

  • Park, Jong-Ho;Kim, Yoo-Mi;Kim, Sung-Soo;Kim, Won-Joong;Kang, Sung-Hong
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.13 no.4
    • /
    • pp.1739-1750
    • /
    • 2012
  • This study was to develop the assessment of medical service outcome using administration data through compared with hospital standardized mortality ratios(HSMR) in various hospitals. This study analyzed 63,664 cases of Hospital Discharge Injury Data of 2007 and 2008, provided by Korea Centers for Disease Control and Prevention. We used data mining technique and compared decision tree and logistic regression for developing risk-adjustment model of in-hospital mortality. Our Analysis shows that gender, length of stay, Elixhauser comorbidity index, hospitalization path, and primary diagnosis are main variables which influence mortality ratio. By comparing hospital standardized mortality ratios(HSMR) with standardized variables, we found concrete differences (55.6-201.6) of hospital standardized mortality ratios(HSMR) among hospitals. This proves that there are quality-gaps of medical service among hospitals. This study outcome should be utilized more to achieve the improvement of the quality of medical service.

Implementation of an Efficient Microbial Medical Image Retrieval System Applying Knowledge Databases (지식 데이타베이스를 적용한 효율적인 세균 의료영상 검색 시스템의 구현)

  • Shin Yong Won;Koo Bong Oh
    • Journal of the Korea Society of Computer and Information
    • /
    • v.10 no.1 s.33
    • /
    • pp.93-100
    • /
    • 2005
  • This study is to desist and implement an efficient microbial medical image retrieval system based on knowledge and content of them which can make use of more accurate decision on colony as doll as efficient education for new techicians. For this. re first address overall inference to set up flexible search path using rule-base in order U redure time required original microbial identification by searching the fastest path of microbial identification phase based on heuristics knowledge. Next, we propose a color ffature gfraction mtU, which is able to extract color feature vectors of visual contents from a inn microbial image based on especially bacteria image using HSV color model. In addition, for better retrieval performance based on large microbial databases, we present an integrated indexing technique that combines with B+-tree for indexing simple attributes, inverted file structure for text medical keywords list, and scan-based filtering method for high dimensional color feature vectors. Finally. the implemented system shows the possibility to manage and retrieve the complex microbial images using knowledge and visual contents itself effectively. We expect to decrease rapidly Loaming time for elementary technicians by tell organizing knowledge of clinical fields through proposed system.

  • PDF