• Title/Summary/Keyword: Tree classification

A Study on the Walkability Scores in Jeonju City Using Multiple Regression Models (다중 회귀 모델을 이용한 전주시 보행 환경 점수 예측에 관한 연구)

  • Lee, KiChun;Nam, KwangWoo;Lee, ChangWoo
    • Journal of Korea Society of Industrial Information Systems / v.27 no.4 / pp.1-10 / 2022
  • Attempts to interpret human perspectives using computer vision have been made in various fields. In this paper, we propose a method for evaluating the walking environment from semantic segmentation results of road images. First, the Kakao Map API was used to collect four-way road images from about 50,000 points in Jeonju. A dataset was built from 20% of the collected images through crowdsourcing-based paired comparisons, and various regression models were trained on the paired comparison data. To derive the walkability score of each image, a ranking score is calculated using the TrueSkill ranking algorithm, and walkability is then analyzed with various regression models using the constructed data. This study shows that the walkability of Jeonju can be evaluated, and scores derived, from the correlation between pixel-distribution classification information rather than from human vision.

Probability Estimation Method for Imputing Missing Values in Data Expansion Technique (데이터 확장 기법에서 손실값을 대치하는 확률 추정 방법)

  • Lee, Jong Chan
    • Journal of the Korea Convergence Society / v.12 no.11 / pp.91-97 / 2021
  • This paper applies a data extension technique, originally designed for the rule refinement problem, to the handling of incomplete data. The technique is characterized in that each event can have a weight indicating its importance, and each variable can be expressed as a probability value. Since the key problem in this paper is to find the probability closest to the missing value and replace the missing value with it, three different algorithms are used to find the probability for the missing value, which is then stored in this data structure format. To evaluate each probability structure, an SVM classification algorithm is trained to classify each information area, and the result is compared with the original information to measure how closely they match. The three algorithms for imputing the probability of a missing value use the same data structure but differ in approach, so they are expected to be usable for various purposes depending on the application field.

Development of a Model for Calculating the Negligence Ratio Using Traffic Accident Information (교통사고 정보를 이용한 과실비율 산정 모델 개발)

  • Eum Han;Giok Park;Heejin Kang;Yoseph Lee;Ilsoo Yun
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.21 no.6 / pp.36-56 / 2022
  • In Korea, the negligence ratio for traffic accidents is calculated using the 「Automobile Accident Negligence Ratio Certification Standard」 prepared by the General Insurance Association of Korea, after which the insurance companies reach an agreement or a judgment is made. However, disputes frequently arise in calculating the negligence ratio. A more effective response would therefore be possible if the accident type defined in the standard could be quickly identified from the traffic accident information prepared by the police. This study aims to develop a model that learns from the accident information prepared by the police and classifies it to match the accident types in the standard. In particular, the keywords needed to classify the accident types of the standard were extracted from the police accident data through data mining. Models were then developed to derive the accident types by learning the extracted keywords with decision tree and random forest models.

Estimation of unused forest biomass potential resource amount in Korea

  • Sangho Yun;Sung-Min Choi;Joon-Woo Lee;Sung-Min Park
    • Korean Journal of Agricultural Science / v.49 no.2 / pp.317-330 / 2022
  • Recent climate change policy in Korea and overseas has promoted the utilization of forest biomass to achieve net-zero emissions. In addition, since the unused forest biomass system was implemented in 2018, the Korean market for manufacturing wood pellets and wood chips from unused forest biomass has been expanding rapidly. It is therefore necessary to estimate the total amount of unused forest biomass that can be used as an energy source and to identify the capacity that can be produced continuously each year. In this study, we estimated the actual forest area from which logging residue can be produced and the potential amount of unused forest biomass resources in green tons (GT). Using a forest functions classification map (1:25,000), the 5th digital forest type map (1:25,000), and a digital elevation model (DEM), the forest area with a slope of 30° or less and mountain ridges of 70% or less was estimated for production forests of age class IV or higher. The total forest area where unused forest biomass can be produced was estimated to be 1,453,047 ha. Based on GT, the total potential amount of unused forest biomass resources in Korea was estimated to be 117,741,436 tons. By forest type, coniferous forests accounted for an estimated 48,513,580 tons (41.2%), broad-leaved forests for 27,419,391 tons (23.3%), and mixed forests for 41,808,465 tons (35.5%). The results of this analysis can serve as basic data for estimating the commercial use of unused forest biomass.
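
The forest-type shares quoted in this abstract can be checked with a few lines of arithmetic. The sketch below is only a consistency check on the published figures; the variable and function names are illustrative and do not come from the paper.

```python
# Tonnage by forest type, in green tons (GT), as quoted in the abstract.
biomass_gt = {
    "coniferous": 48_513_580,
    "broad-leaved": 27_419_391,
    "mixed": 41_808_465,
}
# Total potential resource amount quoted in the abstract.
reported_total_gt = 117_741_436


def shares_percent(amounts):
    """Return each category's share of the total, rounded to one decimal place."""
    total = sum(amounts.values())
    return {name: round(100 * value / total, 1) for name, value in amounts.items()}


shares = shares_percent(biomass_gt)
for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

The three categories sum to the reported total, and the rounded shares match the percentages given in the abstract.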

Feasibility on Statistical Process Control Analysis of Delivery Quality Assurance in Helical Tomotherapy (토모테라피에서 선량품질보증 분석을 위한 통계적공정관리의 타당성)

  • Kyung Hwan, Chang
    • Journal of radiological science and technology / v.45 no.6 / pp.491-502 / 2022
  • The purpose of this study was to retrospectively investigate the upper and lower control limits of treatment planning parameters using EBT film-based delivery quality assurance (DQA) results, and to analyze the results of statistical process control (SPC) in helical tomotherapy (HT). A total of 152 patients whose DQA results passed or failed were retrospectively included. Prostate (n = 66), rectal (n = 51), and large-field cancer patients, including lymph nodes (n = 35), were randomly selected. The absolute point dose difference (DD) and global gamma passing rate (GPR) were analyzed for all patients. Control charts were used to evaluate the upper and lower control limits (UCL and LCL) for all the assessed treatment planning parameters. Treatment planning parameters such as gantry period, leaf open time (LOT), pitch, field width, actual and planning modulation factor, treatment time, couch speed, and couch travel were analyzed against the DQA results to determine their optimal ranges. A classification and regression tree (CART) was used to estimate the relative importance of the treatment planning parameters for the DQA results. We confirmed that the proportion of patients with an LOT below 100 ms was higher in the failure group than in the passing group. SPC can detect QA failures before dosimetric QA tolerance levels are exceeded. The acceptable tolerance range of each planning parameter may assist in predicting DQA failures using the SPC tool in the future.
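
The abstract does not state which type of control chart was used. As a minimal sketch, assuming an individuals (I-MR) chart, the control limits can be computed from the mean and the average moving range with the standard constant 2.66 (3/d2, with d2 = 1.128 for a moving range of two points). The function name and the sample values are hypothetical and are not data from the study.

```python
def individuals_control_limits(values):
    """Return (LCL, center line, UCL) for an individuals (I-MR) control chart.

    Assumption: the chart type is not given in the abstract; an I-MR chart
    is used here purely for illustration.
    """
    if len(values) < 2:
        raise ValueError("at least two observations are required")
    center = sum(values) / len(values)
    # Moving ranges: absolute differences between consecutive observations.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_mr = sum(moving_ranges) / len(moving_ranges)
    # 2.66 = 3 / d2, where d2 = 1.128 for a moving range of two points.
    half_width = 2.66 * mean_mr
    return center - half_width, center, center + half_width


# Hypothetical gamma passing rates (%), not values from the paper.
sample_gpr = [98.0, 97.0, 99.0, 98.0, 96.0]
lcl, center, ucl = individuals_control_limits(sample_gpr)
print(f"LCL={lcl:.2f}, CL={center:.2f}, UCL={ucl:.2f}")
```

A point falling outside the interval from LCL to UCL would be flagged for review under this assumed chart type.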

Exploring On-line Consumption Tendency of Sports 4.0 Market Consumer: Focused on Sports Goods Consumption by Generation of Working Age Population (스포츠 4.0 시장 소비자의 온라인 소비성향 탐색: 생산 가능인구의 세대별 스포츠 용품 소비를 중심으로)

  • Jin-Ho Shin
    • Journal of the Korean Applied Science and Technology / v.40 no.1 / pp.24-34 / 2023
  • This study explored the online consumption propensity for sports goods by generation of the working-age population, and sought to provide basic data for predicting the future consumption market by segmenting online consumers in the sports 4.0 market. A survey was conducted on people who had purchased sports goods in generation-specific groups (Generation Y and older, and Generation Z) of the working-age population, and data from a total of 478 people were used in the final analysis. Data were processed with SPSS Statistics (ver. 21.0) using frequency analysis, exploratory factor analysis, test-retest reliability correlation analysis, reliability analysis, and decision tree analysis. The results show that a consumer is highly likely to be classified into the Generation Z group when the leisure, joy, and environment factors are high. The classification accuracy of the model was 69.7%.

A Comparative Study of Prediction Models for College Student Dropout Risk Using Machine Learning: Focusing on the case of N university (머신러닝을 활용한 대학생 중도탈락 위험군의 예측모델 비교 연구 : N대학 사례를 중심으로)

  • So-Hyun Kim;Sung-Hyoun Cho
    • Journal of The Korean Society of Integrative Medicine / v.12 no.2 / pp.155-166 / 2024
  • Purpose: This study aims to identify key factors for predicting dropout risk at the university level and to provide a foundation for policy development aimed at dropout prevention. It explores the optimal machine learning algorithm by comparing the performance of various algorithms on data about college students' dropout risk. Methods: Data on factors influencing dropout risk and propensity were collected from N University. The collected data were applied to several machine learning algorithms, including random forest, decision tree, artificial neural network, logistic regression, support vector machine (SVM), k-nearest neighbor (k-NN) classification, and Naive Bayes. The performance of these models was compared and evaluated, with a focus on predictive validity and on identifying significant dropout factors through the information gain index. Results: The binary logistic regression analysis showed that the year of the program, department, grades, and year of entry had a statistically significant effect on dropout risk. Among the machine learning algorithms, random forest performed best. The relative importance of the predictor variables was highest for department, age, grades, and whether the student's residence matched the school location, in that order. Conclusion: Machine learning-based prediction of dropout risk focuses on the early identification of students at risk. The types and causes of dropout crises vary significantly among students, so it is important to identify them in order to take appropriate action and provide support that removes risk factors and increases protective factors. The relative importance of the factors affecting dropout risk found in this study will help guide educational interventions for preventing college student dropout.
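
The abstract mentions an information gain index without defining it. The following is a minimal sketch, assuming the usual entropy-based definition: the entropy of the class labels minus the weighted entropy of the groups produced by a categorical feature. The function names and the toy records are hypothetical and are not data from N University.

```python
from collections import Counter, defaultdict
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy reduction obtained by grouping labels on a categorical feature."""
    groups = defaultdict(list)
    for value, label in zip(feature_values, labels):
        groups[value].append(label)
    weighted = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - weighted


# Toy example (hypothetical): department versus a dropout flag.
department = ["A", "A", "B", "B", "B", "A"]
dropout = [1, 1, 0, 0, 1, 0]
print(round(information_gain(department, dropout), 3))
```

A feature that separates the classes perfectly yields a gain equal to the entropy of the labels, while an uninformative feature yields a gain of zero.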

Prediction of Correct Answer Rate and Identification of Significant Factors for CSAT English Test Based on Data Mining Techniques (데이터마이닝 기법을 활용한 대학수학능력시험 영어영역 정답률 예측 및 주요 요인 분석)

  • Park, Hee Jin;Jang, Kyoung Ye;Lee, Youn Ho;Kim, Woo Je;Kang, Pil Sung
    • KIPS Transactions on Software and Data Engineering / v.4 no.11 / pp.509-520 / 2015
  • The College Scholastic Ability Test (CSAT) is the primary test for evaluating the academic achievement of high-school students and is used by most universities in South Korea for admission decisions. Because its level of difficulty is a significant issue for both students and universities, the government makes a great effort to keep the difficulty level consistent every year. However, the actual level of difficulty has fluctuated significantly, which causes many problems for university admission. In this paper, unlike traditional methods that depend on experts' judgments, we build two types of data-driven prediction models from accumulated CSAT data to predict the correct answer rate and to identify significant factors for the CSAT English test. First, we derive candidate question-specific factors that can influence the correct answer rate, such as position, EBS relation, and readability, from the annual CSAT practice tests and the CSAT over 10 years. In addition, we derive context-specific factors by employing topic modeling, which identifies the underlying topics in the text. The correct answer rate is then predicted by multiple linear regression, and the level of difficulty is predicted by a classification tree. The experimental results show that the level of difficulty (difficult/easy) classification model achieves 90% accuracy, whereas the error rate for the correct answer rate is below 16%. Points and problem category are found to be critical for predicting the correct answer rate. The correct answer rate is also influenced by some of the topics discovered by topic modeling. Based on our study, it will be possible to predict the range of the expected correct answer rate at both the question level and the entire-test level, which will help CSAT examiners control the level of difficulty.

A Study on the Changes of Land Use and Stand Volume around Mt. Kuem-O using Aerial Photographs (항공사진(航空寫眞)을 이용(利用)한 금오산(金烏山) 지역(地域)의 토지이용(土地利用) 및 임분재적(林分材積)의 변화(變化)에 관(關)한 연구(硏究))

  • Oh, Dong Ha;Kim, Kap Duk
    • Journal of Korean Society of Forest Science / v.79 no.4 / pp.388-397 / 1990
  • This study was conducted to investigate the changes in land use and stand volume around Mt. Kuem-O using B/W aerial photographs from 1979 and B/W infrared aerial photographs from 1988. The results obtained in this study were as follows: 1. In the classification of forest type on aerial photographs, coniferous stands appeared in dark tones, while hardwood stands appeared in light tones with irregularly rounded crowns. 2. In the classification of coniferous stands, Pinus densiflora showed narrow conical crowns with rounded tips and a rough texture, while Pinus rigida showed irregularly rounded, broadly conical crowns. 3. Regarding changes in forest land area, mixed forest changed into P. densiflora (687 ha), P. rigida (130 ha), and hardwood stands (219 ha). 4. The regression equations between crown diameter (CD) and DBH were significant at the 1% level by F-test in all stands, so the equation $D=a+b\,CD$ was used to estimate DBH. 5. The tree height curve equations were significant at the 1% level by F-test in all stands. To estimate tree height, the equation $\log H=\log a+b\log D$ was adopted for P. densiflora and L. leptolepis, and $H=a-bD+cD^2$ was adopted for P. rigida, hardwood stands, and mixed forest. 6. The highest volume per hectare was observed in L. leptolepis, and mixed forest showed the greatest growth percentage, while the lowest volume per hectare and growth percentage were observed in hardwood stands.
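
Result 4 of this abstract uses a simple linear regression of DBH on crown diameter, D = a + b·CD. The abstract does not describe the fitting procedure, so the following is only a minimal sketch of how such coefficients could be estimated by ordinary least squares. The function name and the sample measurements are hypothetical and are not data from the study.

```python
def fit_linear(x, y):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (>= 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x and cross-deviations of x and y.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Hypothetical crown diameters (m) and DBH values (cm), not data from the study.
crown_diameter = [2.0, 3.0, 4.0, 5.0]
dbh = [12.0, 17.0, 21.0, 27.0]
a, b = fit_linear(crown_diameter, dbh)
print(f"D = {a:.2f} + {b:.2f} * CD")
```

The logarithmic height curve in result 5 could be fitted with the same function by passing log-transformed diameters and heights, in which case the returned intercept corresponds to log a.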

Interpretation Method of Eco-Cultural Resources from the Perspective of Landscape Ecology in Jeju Olle Trail (제주 올레길 생태문화자원 경관생태학적 해석기법 연구)

  • Hur, Myung-Jin;Han, Bong-Ho;Park, Seok-Cheol
    • Journal of the Korean Institute of Landscape Architecture / v.49 no.2 / pp.128-140 / 2021
  • This study applied the theory of landscape ecology to the representative resources of the Jeju Olle Trail, a leading destination for walking tourism, in order to identify their ecological characteristics and to establish a technique for the landscape ecological analysis of Olle Trail resources. Based on biotope type, major land use, and the vegetation status around the trail and roads, the Jeju Olle Trail was divided into 12 types. Based on the classification of ecological tourism resources, the walking tourism resources of the Jeju Olle Trail were divided into seven types of natural resources and seven types of humanities resources, and each resource was characterized as Geotope, Biotope, or Anthropotope, following the landscape ecology system. Geotope resources have strong landscape characteristics and include coasts and beaches, rocks, bedrock, waterfalls, geology and the Jusangjeolli Cliff, Oreum and craters, water resources, and landscape viewpoints. Biotope resources have strong ecological characteristics and include large and protected trees, Gotjawal, forest roads and vegetation communities, biological habitats, and vegetation landscape viewpoints. Anthropotope resources include the culture of the Jeju haenyeo and traditional culture, ports and lighthouses, experience facilities, temples and churches, military and beacon facilities, other historical and cultural facilities, and cultural landscape views. The representative resources for each type of Jeju Olle Trail are the coast, Oreum, Gotjawal, fields and stone-walled farmland, and Jeju villages and their stone walls. A landscape ecological interpretation technique was applied to understand the components and various functions of the resources that represent the ecological culture of the Olle Trail. In terms of ecological and cultural characteristics, the coast includes black basalt rocks, coastal vegetation, coastal grasslands, coastal rock vegetation, winter migratory birds, and the Jeju haenyeo. Oreum is a unique volcanic topography that includes circular and oval mountain bodies, Oreum vegetation, crater wetlands, the origin and legend of the name of each Oreum, the culture of grazing horses, military use, folk belief, and the view from the summit. Gotjawal features rocky bumps, a unique microclimate, Gotjawal vegetation, geographical names, the past culture of charcoal baking, and bizarre shapes of trees and vines. Field walls include the structure and shape of the walls, cultivated field crops, field wall habitats, and Jeju agricultural culture. The village includes stone walls and roof structures built from basalt, a pavilion at the village entrance, yards and gardens inside the houses, views of local people's lives, and alleyway views. These resources have changed slowly over the long course of human life there and are now unique to Jeju Island. By providing content specialized for each type of Olle Trail, tourists who walk the trail will be able to experience it in depth as they learn the stories of its resources, which should increase the sustainable use of the Jeju Olle Trail and the satisfaction of its users.