• Title/Summary/Keyword: Decision-Tree-Model


Developing Library Tour Course Recommendation Model based on a Traveler Persona: Focused on facilities and routes for library trips in J City (여행자 페르소나 기반 도서관 여행 코스 추천 모델 개발 - J시 도서관 여행을 위한 시설 및 동선 중심으로 -)

  • Suhyeon Lee;Hyunsoo Kim;Jiwon Baek;Hyo-Jung Oh
    • Journal of Korean Library and Information Science Society / v.54 no.2 / pp.23-42 / 2023
  • The library tour program is a new type of cultural program first introduced and operated by J City, in which library tourists visit specialized libraries in the city along a set course and take part in various activities. This study aims to build a customized course recommendation model that considers the characteristics of individual participants, in addition to the existing fixed group travel format, so that more users can participate in library tours. To this end, the characteristics of library travelers were categorized to establish traveler personas, and library evaluation items and criteria were established accordingly. We selected 22 libraries targeted by the library travel program and collected library data through on-site visits. Based on the collected data, we derived the characteristics of suitable libraries and developed a persona-based library tour course recommendation model using a decision tree algorithm. To demonstrate the feasibility of the proposed recommendation model, we built a mobile application mockup and conducted user evaluations with actual library users to assess satisfaction with the developed model and identify improvements.

Verification Test of High-Stability SMEs Using Technology Appraisal Items (기술력 평가항목을 이용한 고안정성 중소기업 판별력 검증)

  • Jun-won Lee
    • Information Systems Review / v.20 no.4 / pp.79-96 / 2018
  • This study focuses on internalizing the technology appraisal model into the credit rating model, reflecting those technology appraisal items that relate to the financial stability of enterprises, in order to increase the discriminative power of the credit rating model not only for SMEs but for all companies. It therefore aims to verify whether the technology appraisal model can be applied to identify high-stability SMEs in advance. We classified companies by industry (manufacturing vs. non-manufacturing) and by company age (initial vs. non-initial), and defined a high-stability company as one that achieved an average debt ratio of less than half that of its group over three years. The C5.0 algorithm was applied to verify the discriminant power of the model. The analysis shows that importance differs by industry type and company age at the sub-item level, but at the mid-item level R&D capability was a key variable for discriminating high-stability SMEs. In the early stage of establishment, funding capacity (diversification of funding methods, capital structure, and capital cost that takes profitability into account) is an important variable for financial stability. However, we concluded that technology development infrastructure, which enables continuous performance as company age increases, becomes an important variable affecting financial stability. The classification accuracy of the model by company age and industry is 71-91%, which confirms that high-stability SMEs can be identified using technology appraisal items.

Development of Diameter Distribution Change and Site Index in a Stand of Robinia pseudoacacia, a Major Honey Plant (꿀샘식물 아까시나무의 지위지수 도출 및 직경분포 변화)

  • Kim, Sora;Song, Jungeun;Park, Chunhee;Min, Suhui;Hong, Sunghee;Yun, Junhyuk;Son, Yeongmo
    • Journal of Korean Society of Forest Science / v.111 no.2 / pp.311-318 / 2022
  • We conducted this study to derive the site index, which is a criterion for the planting of Robinia pseudoacacia, a honey plant, and to investigate the change in diameter distribution by the derived site index. We applied the Chapman-Richards equation model to estimate the site index of the Robinia pseudoacacia stand. The site index was distributed within the range of 16-22 when the base age was 30 years. The fitness index of the site index estimation model was low, but we judged that there was no problem in applying it because the residual distribution of the equation was not skewed to one side. We used the Weibull diameter distribution function to determine the diameter distribution of the Robinia pseudoacacia stand by site index. We used the mean diameter and the dominant tree height as independent variables to present the diameter distribution, and our analysis procedure was to estimate and recover the parameters of the Weibull diameter distribution function. We used the mean diameter and the dominant tree height of the Robinia pseudoacacia stand to show the distribution by diameter class, and the fitness index for dbh distribution estimation was about 80.5%. After plotting the diameter distribution by site index at age 30, we found that the higher the site index, the more the curve of the diameter distribution moved to the right. This suggests that if a plantation were established on a high site index stand, considering the trees suitable for the site, the growth of Robinia pseudoacacia would become active, and not only the production of wood but also the production of honey would increase. We therefore anticipate that the site index classification table and curve of this Robinia pseudoacacia stand will become the standard for decision making in the plantation and management of this tree.
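
For readers unfamiliar with the two functions named in this abstract, their textbook forms are sketched below. The abstract does not state which parameterization the authors fitted, so the symbols ($a$, $b$, $c$ for the height curve; $\alpha$, $\beta$, $\gamma$ for the Weibull function) and the anamorphic site-index form are assumptions, not the paper's fitted model.

```latex
% Chapman-Richards height-age curve: dominant height H at stand age A
H(A) = a \left(1 - e^{-bA}\right)^{c}

% Anamorphic site index at base age A_0 (30 years in the abstract)
SI = H \left( \frac{1 - e^{-bA_0}}{1 - e^{-bA}} \right)^{c}

% Three-parameter Weibull density for diameter x
% (location \alpha, scale \beta, shape \gamma)
f(x) = \frac{\gamma}{\beta}
       \left( \frac{x - \alpha}{\beta} \right)^{\gamma - 1}
       \exp\!\left[ -\left( \frac{x - \alpha}{\beta} \right)^{\gamma} \right],
       \qquad x \ge \alpha
```

In the parameter-recovery approach the abstract mentions, the Weibull parameters are typically solved from predicted stand attributes such as mean diameter rather than fitted directly to each plot.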

The big data method for flash flood warning (돌발홍수 예보를 위한 빅데이터 분석방법)

  • Park, Dain;Yoon, Sanghoo
    • Journal of Digital Convergence / v.15 no.11 / pp.245-250 / 2017
  • A flash flood is defined as flooding caused by intense rainfall over a relatively small area that flows rapidly through rivers and valleys in a short time with no advance warning. It can therefore cause property damage and casualties. This study aims to establish a flash-flood warning system using 38 accident records reported by the National Disaster Information Center and data from a Land Surface Model (TOPLATS) between 2009 and 2012. Three variables from the Land Surface Model were used: precipitation, soil moisture, and surface runoff. The values of these three variables over the 6 hours preceding each flash flood were reduced to 3 factors through factor analysis. Decision tree, random forest, Naive Bayes, Support Vector Machine, and logistic regression models were considered as big data methods. Prediction performance was evaluated by comparing Accuracy, Kappa, TP Rate, FP Rate, and F-Measure. The best method was suggested based on a reproducibility evaluation at each point of flash flood occurrence and on predicted versus actual counts using 4 years of data.
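
The five evaluation measures named in this abstract can all be computed from a binary confusion matrix. The following is a minimal sketch using their standard definitions. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
def binary_metrics(tp, fp, fn, tn):
    """Standard measures from a 2x2 confusion matrix.

    tp/fn: flood events predicted / missed; fp/tn: non-events flagged / cleared.
    Assumes every row and column of the matrix is non-empty (no zero division).
    """
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    tp_rate = tp / (tp + fn)          # recall / sensitivity
    fp_rate = fp / (fp + tn)          # false-alarm rate
    precision = tp / (tp + fp)
    f_measure = 2 * precision * tp_rate / (precision + tp_rate)
    # Cohen's kappa: agreement beyond what the marginals predict by chance
    p_expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (accuracy - p_expected) / (1 - p_expected)
    return {
        "accuracy": accuracy,
        "kappa": kappa,
        "tp_rate": tp_rate,
        "fp_rate": fp_rate,
        "f_measure": f_measure,
    }


# Hypothetical counts, for illustration only (not results from the study)
for name, value in binary_metrics(tp=30, fp=10, fn=8, tn=52).items():
    print(f"{name}: {value:.3f}")
```

Kappa is worth reporting alongside accuracy here because flash floods are rare events, so a classifier can reach high accuracy by chance agreement alone.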

An Empirical Study of Profiling Model for the SMEs with High Demand for Standards Using Data Mining (데이터마이닝을 이용한 표준정책 수요 중소기업의 프로파일링 연구: R&D 동기와 사업화 지원 정책을 중심으로)

  • Jun, Seung-pyo;Jung, JaeOong;Choi, San
    • Journal of Korea Technology Innovation Society / v.19 no.3 / pp.511-544 / 2016
  • Standards boost technological innovation by promoting information sharing, compatibility, stability, and quality. Identifying groups of companies that particularly benefit from these functions of standards in their technological innovation and commercialization helps to customize the planning and implementation of standards-related policies for demand groups. For this purpose, this study profiles SMEs whose R&D objective is to respond to standards, as well as those that need to implement a standards system for technological commercialization. To this end, decision tree analysis is conducted through data mining to profile the characteristics of the subject SMEs. Subject SMEs include (1) those that engage in R&D to respond to standards (Group 1) and (2) those in need of product standard or technological certification policies for commercialization purposes (Group 2). The study then proposes a prediction model, based on discriminant analysis, that can distinguish Groups 1 and 2 from other companies using several variables. The practicality of the discriminant formula is statistically verified. The study suggests that Group 1 companies are distinguished by variables such as time spent on R&D planning, Korean Standard Industry Classification (KSIC) category, number of employees, and novelty of technologies. The profiling result for Group 2 companies suggests that they are differentiated by variables such as KSIC category, major clients of the companies, time spent on R&D, and ability to test and verify their technologies. The prediction model proposed herein is designed based on the outcomes of the profiling and discriminant analysis. Its purpose is to support the planning and implementation of standards-related policies by providing objective information on companies in need of relevant support, and thereby to enhance the overall success rate of standards-related projects.

Determinants of employee's wage using hierarchical linear model (위계적 선형모형을 이용한 대졸 신규취업자 임금 결정요인 분석)

  • Park, Sungik;Cho, Jangsik
    • Journal of the Korean Data and Information Science Society / v.26 no.1 / pp.65-75 / 2015
  • This paper analyzes the determinants of wages for college and university graduates using both individual-level and industry-level variables. We note that wage determination has a multi-level structure in the sense that individual wages are influenced by individual-level (level-1) and industry-level (level-2) variables. The assumption in classical regression that individual wages are independent is therefore violated, so this paper uses the hierarchical linear model (HLM). The major results are as follows. First, the multiple correspondence analysis including level-1 and level-2 variables reveals that both affect individual wages, judging from the fact that the values of the level-1 and level-2 variables differ across individual wage groups. Second, the decision tree analysis including level-1 and level-2 variables shows that the most influential variable in wage determination is industry-level wage, followed by industry-level working hours, age, and sex, in declining order. This suggests that using the HLM is appropriate, since the characteristics of the industry are important in determining individual wages. Third, the HLM is shown to be the best model compared with the other models, which do not take level-1 and level-2 variables into account simultaneously.

A Study on Empirical Model for the Prevention and Protection of Technology Leakage through SME Profiling Analysis (중소기업 프로파일링 분석을 통한 기술유출 방지 및 보호 모형 연구)

  • Yoo, In-Jin;Park, Do-Hyung
    • The Journal of Information Systems / v.27 no.1 / pp.171-191 / 2018
  • Purpose: Corporate technology leakage not only causes monetary loss but also harms the corporate image and undermines sustainable growth. In particular, since SMEs depend more heavily on core technologies than large corporations do, losses from technology leakage threaten corporate survival. It is therefore important for SMEs to "prevent and protect technology leakage". With the recent development of data analysis technology and the opening of public data, it has become possible to proactively detect companies with a high probability of technology leakage based on actual company data. In this study, we construct profiles of enterprises with and without technology leakage experience through profiling analysis using data mining techniques. Based on this, we propose a classification model that distinguishes companies that are likely to leak technology. Design/methodology/approach: This study develops an empirical model for the prevention and protection of technology leakage through a profiling method that analyzes each SME individually. Based on previous research, we classified the characteristics of SMEs into six categories to identify the factors influencing technology leakage from the enterprise point of view. Specifically, we divided 29 SME characteristics into the following six categories: 'firm characteristics', 'organizational characteristics', 'technical characteristics', 'relational characteristics', 'financial characteristics', and 'enterprise core competencies'. Each characteristic was extracted from the questionnaire data of the 'Survey of Small and Medium Enterprises Technology' carried out annually by the Government of the Republic of Korea. Since the number of SMEs with technology leakage experience in the questionnaire data was significantly smaller than the number without, we matched the samples 1:1 through mixed sampling. We profiled companies with and without technology leakage experience using the decision-tree technique and derived meaningful variables that can distinguish the two. An empirical model for the prevention and protection of technology leakage was then developed through discriminant analysis and logistic regression analysis. Findings: Profiling analysis shows that technology novelty, enterprise technology group, number of intellectual property registrations, product life cycle, technology development infrastructure level (absence of a dedicated organization), enterprise core competency (design), and enterprise core competency (process design) help identify technology leakage in SMEs. We developed two empirical models for the prevention and protection of technology leakage in SMEs using discriminant analysis and logistic regression analysis, with hit ratios of 65% (discriminant analysis) and 67% (logistic regression analysis).

Development of severity-adjusted length of stay in knee replacement surgery (무릎관절치환술 환자의 중증도 보정 재원일수 모형 개발)

  • Hong, Sung-Ok;Kim, Young-Teak;Choi, Youn-Hee;Park, Jong-Ho;Kang, Sung-Hong
    • Journal of Digital Convergence / v.13 no.2 / pp.215-225 / 2015
  • This study was conducted to develop a severity-adjusted length of stay (LOS) model for knee replacement patients and to identify factors that influence LOS, using data from the Korean National Hospital Discharge in-depth Injury Survey. Comorbidity scoring systems and data-mining methods were used to design a severity-adjusted LOS model covering 4,102 knee replacement patients. A decision tree model using the CCS comorbidity scoring index produced superior results and was chosen as the final model. Factors such as the presence of arthritis, patient sex, and admission route influenced patient length of stay. There was a statistically significant difference between real LOS and adjusted LOS by health-insurance type, bed size, and hospital location. Therefore, policy alternatives addressing excessive medical utilization are needed to reduce variation in length of hospital stay among patients who undergo knee replacement.

Affected Model of Indoor Radon Concentrations Based on Lifestyle, Greenery Ratio, and Radon Levels in Groundwater (생활 습관, 주거지 주변 녹지 비율 및 지하수 내 라돈 농도 따른 실내 라돈 농도 영향 모델)

  • Lee, Hyun Young;Park, Ji Hyun;Lee, Cheol-Min;Kang, Dae Ryong
    • Journal of health informatics and statistics / v.42 no.4 / pp.309-316 / 2017
  • Objectives: Radon and its progeny pose environmental risks as a carcinogen, especially to the lungs. Investigating the factors that affect indoor radon concentrations, and modeling them, is needed to prevent exposure to radon and to reduce indoor radon concentrations. The purpose of this study was to identify factors affecting indoor radon concentration and to construct a comprehensive model thereof. Methods: Questionnaires were administered to obtain data on residential environments, including building materials and lifestyle. Decision tree and structural equation modeling were applied to predict residences at risk for higher radon concentrations and to develop the comprehensive model. Results: Greenery ratio, impermeable layer ratio, residence at ground level, daily ventilation, long-term heating, crack around the measuring device, and bedroom were shown to be significant predictive factors of higher indoor radon concentrations. Daily ventilation reduced the probability of homes having indoor radon concentrations ≥ 200 Bq/m³ by 11.6%. Meanwhile, a greenery ratio ≥ 65% without daily ventilation increased this probability by 15.3% compared to daily ventilation. The constructed model indicated that greenery ratio and ventilation rate directly affect indoor radon concentrations. Conclusions: Our model highlights the combined influences of geographical properties, groundwater, and lifestyle factors of an individual resident on indoor radon concentrations in Korea.

Development of prediction model identifying high-risk older persons in need of long-term care (장기요양 필요 발생의 고위험 대상자 발굴을 위한 예측모형 개발)

  • Song, Mi Kyung;Park, Yeongwoo;Han, Eun-Jeong
    • The Korean Journal of Applied Statistics / v.35 no.4 / pp.457-468 / 2022
  • In an aged society, it is important to prevent older people from developing disabilities that require long-term care. The purpose of this study is to develop a prediction model to identify high-risk groups who are likely to become beneficiaries of Long-Term Care Insurance. This is a retrospective study using past records of the study subjects from the National Health Insurance Service (NHIS) database. The study subjects are the 7,724,101 people over 65 years of age registered for medical insurance. To develop the prediction model, we used logistic regression, decision tree, random forest, and multi-layer perceptron neural network. Random forest was ultimately selected as the prediction model based on the performance of the models in internal and external validation. Random forest could predict about 90% of the older people in need of long-term care using the DB alone, without any information from the assessment of eligibility for long-term care. The findings might be useful in evidence-based health management for prevention services and can contribute to preemptively identifying older people who need preventive services.