Search | Korea Science

Kim, Min-Ho;Kim, Jin-Heum
- Proceedings of the Korean Statistical Society Conference
- 2003.10a
- pp.263-268
- 2003
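Regression trees built by the exhaustive search method of Breiman, Friedman, Olshen and Stone (1984) exhibit a bias: variables that allow relatively many possible splits tend to be chosen as the splitting criterion. In this study we propose an algorithm that addresses this problem, with the aim of building regression trees free of variable selection bias. The proposed algorithm consists of two steps: selecting the split variable for a node, and then finding the split point for a binary split on the selected variable. To identify the predictor most closely associated with the target variable, a test based on Spearman's rank correlation coefficient or on the Kruskal-Wallis statistic is performed, depending on the predictor's data type, and the most statistically significant variable is selected. The exhaustive search of Breiman et al. (1984) is then applied only to the selected variable to determine the split point. Through simulation, we compare the proposed algorithm with the CART (Classification and Regression Trees) of Breiman et al. (1984) in terms of variable selection bias, variable selection power, and mean squared error. We also apply both algorithms to real data and compare their efficiency.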
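The two-step procedure described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it handles numeric predictors only (so the Kruskal-Wallis branch for categorical predictors is omitted), it uses a large-sample normal approximation for the Spearman test, and the function names `spearman_p_value`, `best_split_point`, and `choose_split` are hypothetical.

```python
import math


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman_p_value(x, y):
    """Two-sided p-value for Spearman's rho (normal approximation, an assumption)."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        return 1.0  # a constant variable carries no rank information
    rho = sxy / math.sqrt(sxx * syy)
    z = abs(rho) * math.sqrt(n - 1)
    return math.erfc(z / math.sqrt(2))


def sse(ys):
    """Sum of squared deviations from the mean."""
    mean = sum(ys) / len(ys)
    return sum((v - mean) ** 2 for v in ys)


def best_split_point(x, y):
    """Step 2: exhaustive search over midpoints of one variable only."""
    pairs = sorted(zip(x, y))
    best_cost, best_point = math.inf, None
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no split point between equal values
        cost = sse([p[1] for p in pairs[:i]]) + sse([p[1] for p in pairs[i:]])
        if cost < best_cost:
            best_cost = cost
            best_point = (pairs[i - 1][0] + pairs[i][0]) / 2
    return best_point


def choose_split(predictors, y):
    """Step 1: pick the most significant predictor, then search its split point."""
    name = min(predictors, key=lambda k: spearman_p_value(predictors[k], y))
    return name, best_split_point(predictors[name], y)
```

Because the split variable is chosen by a significance test rather than by comparing the best split of every variable, a predictor does not gain an advantage merely from offering more candidate split points.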

Lee, Ju-Young
- Journal of the Korea Society of Computer and Information
- v.19 no.7
- pp.131-140
- 2014
In this paper, we present an efficient implementation of Kruskal's algorithm for finding a minimum spanning tree. The proposed method uses the union-find data structure and reduces the depth of each node-set tree by making every node on the path to the root a child of the root of the combined tree. Shortening the path to the root lowers the level of each node, which in turn shortens the time needed to find the root of the tree to which a node belongs. The method is evaluated on randomly generated graphs. The results show that it outperforms the conventional method in terms of tree depth.
https://doi.org/10.9708/jksci.2014.19.7.131
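For orientation, the general technique the abstract builds on can be sketched as below: Kruskal's algorithm over a union-find structure in which every node visited during a find is re-attached directly to the root. This is the textbook path-compression form, shown under the assumption that it is close in spirit to the paper's method; it is not the paper's exact variant, and the names `find` and `kruskal` are illustrative.

```python
def find(parent, v):
    """Return the root of v's set, re-attaching each node on the path to it."""
    root = v
    while parent[root] != root:
        root = parent[root]
    # Second pass: make every node on the path a direct child of the root.
    while parent[v] != root:
        parent[v], v = root, parent[v]
    return root


def kruskal(num_nodes, edges):
    """Return (mst_edges, total_weight) for edges given as (weight, u, v)."""
    parent = list(range(num_nodes))
    mst_edges = []
    total_weight = 0
    for weight, u, v in sorted(edges):
        root_u = find(parent, u)
        root_v = find(parent, v)
        if root_u == root_v:
            continue  # the edge would close a cycle
        parent[root_u] = root_v  # merge the two sets
        mst_edges.append((weight, u, v))
        total_weight += weight
        if len(mst_edges) == num_nodes - 1:
            break  # a spanning tree is complete
    return mst_edges, total_weight
```

For example, `kruskal(4, [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3)])` skips the weight-3 edge, since nodes 0 and 2 are already connected, and keeps the other three.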

Kim, Jin-Heum;Kim, Min-Ho
- The Korean Journal of Applied Statistics
- v.17 no.3
- pp.459-473
- 2004
It is well known that the exhaustive search algorithm suggested by Breiman et al. (1984) tends to select variables with relatively many possible splits as the splitting rule. We propose an algorithm that overcomes this variable selection bias and construct unbiased regression trees based on it. The proposed algorithm runs in two steps: selecting a split variable, then determining a binary split rule on that variable. Simulation studies were performed to compare the proposed algorithm with the CART (Classification and Regression Trees) of Breiman et al. (1984) in terms of degree of variable selection bias, variable selection power, and MSE (mean squared error). We also illustrate the proposed algorithm with real data sets.
https://doi.org/10.5351/KJAS.2004.17.3.459

Park Mee-Jeong;Heo Hyun;Kim Tae-Gon;Suh Kyo;Lee Jeong-Jae
- Journal of The Korean Society of Agricultural Engineers
- v.48 no.4
- pp.3-12
- 2006
A watershed is the land area that contributes runoff to an outlet point. The most common way to delineate a watershed is to use a GIS with a grid data structure. Because it is difficult to delineate watershed boundaries accurately, some researchers have implemented algorithms that revise the TIN topography. In this study, Kruskal's greedy algorithm and a triangulated irregular network (TIN) were used to delineate a watershed. This method does not require conversion to a grid DEM and generates the outlet points automatically. The delineation algorithm was tested in Geosan-gun, Chung-cheongbuk-do, where it produced small watershed areas. Finally, Kruskal's algorithm could operate more precisely when combined with the revision algorithm.
https://doi.org/10.5389/KSAE.2006.48.4.003