Search | Korea Science

Tree Based Cluster Analysis Using Reference Data (배경자료를 이용한 나무구조의 군집분석)

최대우;구자용;최용석
The Korean Journal of Applied Statistics, v.17 no.3, pp.535-545, 2004
The clustering method proposed in this paper merges the training data with identically structured reference data, fits a tree classification model to the merged set, and filters the result to obtain clusters of the training data that are described by rules on the variables. The reference dataset is generated by the 'reverse arcing' algorithm so that it contrasts spatially with the training data, which helps identify the clusters effectively. A strength of the method is that it can be applied to training data that mix continuous and discrete variables. Its performance is illustrated on simulated data as well as on real data.
https://doi.org/10.5351/KJAS.2004.17.3.535

Drivers Detour Decision Factor Analysis with Combined Method of Decision Tree and Neural Network Algorithm (의사결정나무와 신경망 모형 결합에 의한 운전자 우회결정요인 분석)

Kang, Jin-Woong;Kum, Ki-Jung;Son, Seung-Neo
International Journal of Highway Engineering, v.13 no.3, pp.167-176, 2011
This study analyzes the factors that determine drivers' detour decisions, with the aim of building a standard model that accounts for the disutility and uncertainty faced by unspecified individual recipients when they decide whether to detour. To this end, we conducted a stated preference (SP) survey on whether to detour, targeting drivers who use expressways and national highways. Based on the results, we analyzed the detour decision factors by building a model that combines a decision tree with a neural network. The results show that the factors influencing drivers' detour decisions are, in order, awareness of alternative routes, reliability and frequency of use of traffic information, frequency of route changes, and age. Moreover, in a comparison with existing models and in prediction on undistributed data, the combined model's rate of 8.7% indicates the best predictive performance, compared with 12.8% for the logit model and 13.8% for the standalone decision tree model. This indicates that the analysis of drivers' detour decision factors is valid to apply. The study is therefore expected to serve as a practical foundation for effective detour strategies that increase the utility of the route network and disperse traffic volume.
https://doi.org/10.7855/IJHE.2011.13.3.167

Symbolic tree based model for HCC using SNP data (악성간암환자의 유전체자료 심볼릭 나무구조 모형연구)

Lee, Tae Rim
Journal of the Korean Data and Information Science Society, v.25 no.5, pp.1095-1106, 2014
Symbolic data analysis (SDA) extends data mining and exploratory data analysis to knowledge mining. Using this knowledge mining approach, we propose an SDA tree model for clinical and genomic data. Applying SDA to large genomic SNP data yields correlations and shows that SDA is useful for understanding the hidden structure of HCC data. We confirm the validity of applying SDA to a tree-structured progression model and of quantifying clinical laboratory data and SNP data for early diagnosis of HCC. The proposed model provides a representative model of HCC survival time and its causal association with SNP gene data. The aim is to fit a simple, easily interpreted tree-structured survival model, reduced from large clinical and genomic data, under the new statistical theory of knowledge mining with SDA.
https://doi.org/10.7465/jkdi.2014.25.5.1095

Recent Changes in Bloom Dates of Robinia pseudoacacia and Bloom Date Predictions Using a Process-Based Model in South Korea (최근 12년간 아까시나무 만개일의 변화와 과정기반모형을 활용한 지역별 만개일 예측)

Kim, Sukyung;Kim, Tae Kyung;Yoon, Sukhee;Jang, Keunchang;Lim, Hyemin;Lee, Wi Young;Won, Myoungsoo;Lim, Jong-Hwan;Kim, Hyun Seok
Journal of Korean Society of Forest Science, v.110 no.3, pp.322-340, 2021
Due to climate change and the resulting rise in spring temperature, the flowering time of Robinia pseudoacacia has advanced and simultaneous blooming has occurred across different regions of South Korea. These changes in flowering time have become a major crisis for the domestic beekeeping industry, and demand for accurate prediction of the flowering time of R. pseudoacacia is increasing. In this study, we developed and compared the performance of four models that predict the flowering time of R. pseudoacacia for the entire country: a Single Model for the country (SM), a Modified Single Model (MSM) using correction factors derived from SM, a Group Model (GM) estimating parameters for each region, and a Local Model (LM) estimating parameters for each site. To this end, bloom date data observed at 26 points across the country over the past 12 years (2006-2017) and daily temperature data were used. Bloom dates in the north central region, where the spring temperature increase was more than two-fold higher than in the southern regions, have advanced, and the difference from the southwest region decreased by 0.7098 days per year (p-value=0.0417). Model comparisons showed that MSM and LM performed better than the other models, with 24% and 15% lower RMSE than SM, respectively. Furthermore, validation with 16 additional sites over 4 years revealed that co-kriging of LM performed better than expansion of MSM for the entire nation (RMSE: p-value=0.0118, Bias: p-value=0.0471). This study improved predictions of bloom dates for R. pseudoacacia and proposed methods for reliable expansion to the entire nation.
https://doi.org/10.14578/jkfs.2021.110.3.322
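The abstract above compares models by RMSE and bias. As a minimal sketch of how these two metrics and a relative RMSE reduction could be computed, assuming bloom dates are expressed as day of year: the numbers and function names below are invented for illustration and are not data or code from the study.

```python
import math

def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)

def bias(observed, predicted):
    """Mean signed error (predicted minus observed)."""
    return sum(p - o for o, p in zip(observed, predicted)) / len(observed)

def relative_reduction(baseline, candidate):
    """Fractional reduction of a candidate's RMSE relative to a baseline RMSE."""
    return (baseline - candidate) / baseline

# Hypothetical bloom dates (day of year); not taken from the paper.
observed = [130, 128, 135, 140]
model_a = [134, 124, 139, 136]  # errors of +4, -4, +4, -4 days
model_b = [132, 126, 137, 143]  # errors of +2, -2, +2, +3 days

print(rmse(observed, model_a), bias(observed, model_a))
print(rmse(observed, model_b), bias(observed, model_b))
print(relative_reduction(rmse(observed, model_a), rmse(observed, model_b)))
```

A model can have zero bias and still a large RMSE, as `model_a` does here, which is one reason to report both.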

Study on Development of Classification Model and Implementation for Diagnosis System of Sasang Constitution (사상체질 분류모형 개발 및 진단시스템의 구현에 관한 연구)

Beum, Soo-Gyun;Jeon, Mi-Ran;Oh, Am-Suk
Proceedings of the Korean Institute of Information and Communication Sciences Conference, 2008.08a, pp.155-159, 2008
In this study, various data-mining classification methods, including discriminant analysis, decision tree analysis, neural network analysis, logistic regression analysis, and clustering analysis, were applied to questionnaires for medical type classification in order to develop a new classification model of Sasang Constitutional medical types that helps improve diagnostic accuracy. In this manner, a model that scientifically classifies constitutional medical types in Sasang Constitutional Medicine, one of the traditional Korean medicines, was developed. The analysis models were also systematically compared and analyzed. The classification of Sasang constitutional medical types was based on the discriminant analysis and decision tree models, which are relatively accurate, easy to understand and explain, and easy to implement. A diagnosis system for Sasang constitution was also implemented using these two models.

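A Machine Learning-Based Study on Predicting Administrative Issue Designation in the KOSDAQ Market (머신러닝 기반 KOSDAQ 시장의 관리종목 지정 예측 연구)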

Yun, Yang-Hyeon;Kim, Tae-Gyeong;Kim, Su-Yeong;Park, Yong-Gyun
한국벤처창업학회:학술대회논문집 (Korean Society of Business Venturing: Conference Proceedings), 2021.11a, pp.185-187, 2021
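The administrative issue designation system is a market regulation that warns of financial distress among listed companies, giving the companies a chance to recover and alerting investors to investment risk. This study investigates the prediction of administrative issue designation using financial data from firms designated as administrative issues and firms not so designated. The methods used were logistic regression, decision tree, support vector machine, soft voting, random forest, and LightGBM. LightGBM was the best predictive model, with a classification accuracy of 82.73%, and the decision tree was the least accurate, with an accuracy of 71.94%. Overall, ensemble learning models showed higher predictive performance than single learning models.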
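Soft voting, one of the ensemble methods named in the abstract above, averages the class probabilities produced by several classifiers and predicts the class with the highest average. The sketch below illustrates only that general idea and contrasts it with majority (hard) voting; the probabilities and function names are hypothetical and do not come from the study.

```python
from collections import Counter

def soft_vote(probabilities):
    """Average per-class probabilities across models.

    Returns (predicted class index, list of averaged probabilities).
    """
    n_models = len(probabilities)
    n_classes = len(probabilities[0])
    avg = [sum(p[c] for p in probabilities) / n_models for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: avg[c]), avg

def hard_vote(probabilities):
    """Majority vote over each model's most probable class."""
    votes = [max(range(len(p)), key=lambda c: p[c]) for p in probabilities]
    return Counter(votes).most_common(1)[0][0]

# Hypothetical outputs of three classifiers for one firm:
# index 0 = not designated, index 1 = designated.
firm_probs = [[0.90, 0.10], [0.45, 0.55], [0.45, 0.55]]

soft_label, averaged = soft_vote(firm_probs)
hard_label = hard_vote(firm_probs)
print(soft_label, averaged, hard_label)
```

In this toy case the two rules disagree: one confident model outweighs two marginal ones under soft voting, while hard voting follows the majority.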

A study on removal of unnecessary input variables using multiple external association rule (다중외적연관성규칙을 이용한 불필요한 입력변수 제거에 관한 연구)

Cho, Kwang-Hyun;Park, Hee-Chang
Journal of the Korean Data and Information Science Society, v.22 no.5, pp.877-884, 2011
The decision tree is a representative data mining algorithm used in many domains, such as retail target marketing, fraud detection, data reduction, variable screening, and category merging. It is most useful in classification problems and for making predictions for a target group after dividing it into several small groups. When a decision tree model is built with a large number of input variables, the resulting tree is complex, which makes the model difficult to explore and analyze. In addition, an association between input variables can often arise through external variables even when there is no intrinsic association. In this paper, we study a method for removing unnecessary input variables using multiple external association rules, and we apply the method to real data to assess its efficiency.

The impact of the change in the splitting method of decision trees on the prediction power (의사결정나무의 분기법 변화가 예측력에 미치는 영향)

Chang, Youngjae
The Korean Journal of Applied Statistics, v.35 no.4, pp.517-525, 2022
In the era of big data, various data mining techniques have been proposed as major analysis methodologies. As complex and diverse data are mass-produced, data mining techniques have attracted attention as methods that form the foundation of data science. In this paper, we focus on the decision tree, one of the representative data mining methods, which is frequently used in practice and easy to understand. Specifically, we analyze the effect of the splitting method of decision trees on model performance. We compare the prediction power and structure of decision tree models with different split methods on various simulated data. The results show that the linear combination split method can improve the prediction accuracy of decision trees for data simulated from nonlinear models with complex structure.
https://doi.org/10.5351/KJAS.2022.35.4.517
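To make the distinction between split methods in the abstract above concrete, here is a minimal sketch that scores a single-variable split and a linear combination split by weighted Gini impurity. The toy data, weights, and thresholds are assumptions chosen for illustration; they are not the simulation design or the algorithm used in the paper.

```python
def gini(labels):
    """Gini impurity of a list of binary (0/1) labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2 * p * (1 - p)

def split_impurity(points, labels, weights, threshold):
    """Weighted Gini impurity after splitting on sum(w_i * x_i) <= threshold.

    weights = (1, 0) gives an ordinary split on the first variable;
    weights = (1, 1) gives a linear combination split on x1 + x2.
    """
    left, right = [], []
    for x, y in zip(points, labels):
        score = sum(w * xi for w, xi in zip(weights, x))
        (left if score <= threshold else right).append(y)
    n = len(labels)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

# Toy grid whose true class boundary is oblique: class 1 when x1 + x2 >= 2.
points = [(a, b) for a in range(3) for b in range(3)]
labels = [1 if a + b >= 2 else 0 for a, b in points]

# Best impurity achievable with a single-variable split on x1.
axis_best = min(split_impurity(points, labels, (1, 0), t) for t in (0.5, 1.5))
# One linear combination split on x1 + x2.
oblique = split_impurity(points, labels, (1, 1), 1.5)
print(axis_best, oblique)
```

On this grid one linear combination split separates the classes exactly, while any single split on `x1` leaves mixed groups, so a single-variable tree would need further splits to approximate the same boundary.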

Predicting Site Quality by Partial Least Squares Regression Using Site and Soil Attributes in Quercus mongolica Stands (신갈나무 임분의 입지 및 토양 속성을 이용한 부분최소제곱 회귀의 지위추정 모형)

Choonsig Kim;Gyeongwon Baek;Sang Hoon Chung;Jaehong Hwang;Sang Tae Lee
Journal of Korean Society of Forest Science, v.112 no.1, pp.23-31, 2023
Predicting forest productivity is essential to evaluate sustainable forest management or to enhance forest ecosystem services. Ordinary least squares (OLS) and partial least squares (PLS) regression models were used to develop predictive models for forest productivity (site index) from the site characteristics and soil profile, along with soil physical and chemical properties, of 112 Quercus mongolica stands. The adjusted coefficients of determination (adjusted R²) in the regression models were higher for the site characteristics and soil profile of B horizon (R²=0.32) and of A horizon (R²=0.29) than for the soil physical and chemical properties of B horizon (R²=0.21) and A horizon (R²=0.09). The PLS models (R²=0.20-0.32) were better predictors of site index than the OLS models (R²=0.09-0.31). These results suggest that the regression models for Q. mongolica can be applied to predict the forest productivity, but new variables may need to be developed to enhance the explanatory power of regression models.
https://doi.org/10.14578/jkfs.2023.112.1.23

Design and Evaluation of ANFIS-based Classification Model (ANFIS 기반 분류모형의 설계 및 성능평가)

Song, Hee-Seok;Kim, Jae-Kyeong
Journal of Intelligence and Information Systems, v.15 no.3, pp.151-165, 2009
A fuzzy neural network is an integrated model of an artificial neural network and a fuzzy system, and it has been successfully applied in control and forecasting. Recently, ANFIS (Adaptive Network-based Fuzzy Inference System) has attracted wide attention among fuzzy neural network models because of its outstanding accuracy in control and forecasting. We design a new classification model based on ANFIS and evaluate its classification accuracy. An experimental comparison shows that the ANFIS-based classification model has higher classification accuracy than an existing classification model, the C5.0 decision tree.
