• Title/Summary/Keyword: classification trees


Plant Community Structure of Abies holophylla Community from Sinseongam to Jungdaesa in Odaesan National Park (오대산국립공원 신성암~중대사 전나무림 식물군집구조 특성)

  • Kim, Dong-Wook;Han, Bong-Ho;Kim, Jong-Yup;Yeum, Jung-Hun
    • Korean Journal of Environment and Ecology / v.29 no.6 / pp.895-906 / 2015
  • This study examined the structure of the plant community from Sinseongam to Jungdaesa in Odaesan National Park in order to provide basic data for planning the management of the Abies holophylla forest in the park. To identify the current ecological environment, the actual vegetation was surveyed as primary research, and twenty plots ($400\,\mathrm{m}^2$) were set up to analyse the detailed structure of the plant communities. The methodology was a qualitative analysis based on TWINSPAN and DCA. TWINSPAN has performed well in several comparisons of classification techniques, and DCA is an ordination technique that was used to display the plant communities. In addition to classification and ordination by TWINSPAN and DCA, the community structure was analysed in terms of the importance percentage of woody species, DBH class distribution, the diversity index, and the growth rate of sample trees. In the communities at low altitude and in valleys, the main vegetation was A. holophylla-Quercus mongolica forest and deciduous broad-leaved forest, whereas at high altitude and on slopes it was Q. mongolica forest. The plant communities of the research site were classified into four groups. In all communities, A. holophylla was the dominant species in the main canopy layer, and the next generation of A. holophylla is growing in communities I, II, and III but not in community IV. Communities I, II, and III can therefore maintain their current status as A. holophylla-dominated communities, although deciduous broad-leaved communities of Carpinus cordata, Betula schmidtii, and other species may expand at the same time. In community IV the dominance of A. holophylla tended to weaken, so this community may shift to a C. cordata-deciduous broad-leaved community in the future. 
The sample trees were 79-128 years old (A. holophylla), 75-87 years old (Pinus koraiensis), and 190 years old (Ulmus davidiana var. japonica). Shannon's species diversity index (H') ranged from 0.3889 to 1.3332 across the communities.
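The Shannon diversity index (H') reported above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the abstract does not state the logarithm base or the abundance measure used, so the natural logarithm and the example counts are assumptions.

```python
import math

def shannon_index(abundances, base=math.e):
    """Shannon diversity H' = -sum(p_i * log(p_i)) over species proportions.

    `base` is an assumption; the abstract does not say which logarithm was used.
    """
    total = sum(abundances)
    proportions = [a / total for a in abundances if a > 0]
    return -sum(p * math.log(p, base) for p in proportions)

# Hypothetical plot: two species with equal abundance.
even_plot = shannon_index([10, 10])
# Hypothetical plot dominated by a single species.
dominated_plot = shannon_index([18, 1, 1])
```

An evenly mixed plot scores higher than a plot dominated by one species, which is why a low H' is often read as a sign of strong dominance.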

Development of Decision Tree Software and Protein Profiling using Surface Enhanced Laser Desorption/Ionization - Time of Flight - Mass Spectrometry (SELDI-TOF-MS) in Papillary Thyroid Cancer (의사결정트리 프로그램 개발 및 갑상선유두암에서 질량분석법을 이용한 단백질 패턴 분석)

  • Yoon, Joon-Kee;Lee, Jun;An, Young-Sil;Park, Bok-Nam;Yoon, Seok-Nam
    • Nuclear Medicine and Molecular Imaging / v.41 no.4 / pp.299-308 / 2007
  • Purpose: The aim of this study was to develop bioinformatics software and to test it on serum samples of papillary thyroid cancer using mass spectrometry (SELDI-TOF-MS). Materials and Methods: The 'Protein analysis' software, which performs decision tree analysis, was developed by customizing C4.5. Sixty-one serum samples (27 papillary thyroid cancer, 17 autoimmune thyroiditis, 17 controls) were applied to 2 types of protein chips, CM10 (weak cation exchange) and IMAC3 (metal binding - Cu). Mass spectrometry was performed to reveal the protein expression profiles. Decision trees were generated using the 'Protein analysis' software, which automatically detected biomarker candidates. Validation analysis was performed for the CM10 chip by random sampling. Results: Decision tree software that can perform training and validation from profiling data was developed. For the CM10 and IMAC3 chips, 23 of 113 and 8 of 41 protein peaks, respectively, were significantly different among the 3 groups (p<0.05). The decision tree correctly classified the 3 groups with an error rate of 3.3% for CM10 and 2.0% for IMAC3, and 4 and 7 biomarker candidates were detected, respectively. In 2-group comparisons, all cancer samples were correctly discriminated from non-cancer samples (error rate = 0%), by a single node for CM10 and by multiple nodes for IMAC3. Validation results from 5 test sets showed that SELDI-TOF-MS with a decision tree correctly differentiated cancers from non-cancers (54/55, 98%), while predictability was moderate in the 3-group classification (36/55, 65%). Conclusion: Our in-house software successfully built decision trees and detected biomarker candidates, so it could be useful for biomarker discovery and clinical follow-up of papillary thyroid cancer.
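C4.5 chooses splits by gain ratio, so a single-node split on one peak intensity can be sketched as below. This is an illustrative sketch only: the intensities, labels, and thresholds are hypothetical, and it does not reproduce the authors' 'Protein analysis' software.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gain_ratio(values, labels, threshold):
    """C4.5-style gain ratio for the binary split `value <= threshold`."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0  # a split that separates nothing carries no information
    n = len(labels)
    remainder = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    gain = entropy(labels) - remainder
    split_info = entropy(["L"] * len(left) + ["R"] * len(right))
    return gain / split_info

# Hypothetical peak intensities for one protein peak (not data from the study).
intensity = [1.2, 1.5, 3.8, 4.1]
group = ["control", "control", "cancer", "cancer"]
clean_split = gain_ratio(intensity, group, threshold=2.5)
poor_split = gain_ratio(intensity, group, threshold=1.35)
```

A peak whose best threshold separates the groups perfectly, as `clean_split` does here, corresponds to the single-node discrimination described for the CM10 chip.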

Product Recommender Systems using Multi-Model Ensemble Techniques (다중모형조합기법을 이용한 상품추천시스템)

  • Lee, Yeonjeong;Kim, Kyoung-Jae
    • Journal of Intelligence and Information Systems / v.19 no.2 / pp.39-54 / 2013
  • The recent explosive growth of electronic commerce gives customers many advantageous purchase opportunities. In this situation, customers who lack sufficient knowledge about their purchases may accept product recommendations. Product recommender systems automatically reflect a user's preferences and provide a recommendation list. The product recommender system in an online shopping store is therefore known as one of the most popular tools for one-to-one marketing. However, recommender systems that do not properly reflect a user's preferences cause disappointment and waste time. In this study, we propose a novel recommender system that uses data mining and multi-model ensemble techniques to improve recommendation performance by reflecting user preferences precisely. The research data were collected from a real-world online shopping store that deals in products from famous art galleries and museums in Korea. The data initially contained 5,759 transactions, of which 3,167 remained after records with null values were deleted. We transformed the categorical variables into dummy variables and excluded outliers. The proposed model consists of two steps. The first step predicts customers with a high likelihood of purchasing products in the online shopping store. In this step, we use logistic regression, decision trees, and artificial neural networks to predict the customers likely to purchase in each product group, running these techniques in SAS E-Miner. We partition the data into modeling and validation sets for logistic regression and decision trees, and into training, test, and validation sets for the artificial neural network model. The validation set is the same in all experiments. 
We then combine the results of the predictors using multi-model ensemble techniques such as bagging and bumping. Bagging, short for "Bootstrap Aggregation," combines the outputs of several machine learning techniques to raise the performance and stability of prediction or classification; it is a special form of the averaging method. Bumping, short for "Bootstrap Umbrella of Model Parameter," considers only the model with the lowest error. The results show that bumping outperforms bagging and the other predictors except in the "Poster" product group, where the artificial neural network model performs better than the other models. In the second step, we use market basket analysis to extract association rules for co-purchased products. We extract thirty-one association rules according to the Lift, Support, and Confidence measures, setting the minimum transaction frequency to support an association to 5%, the maximum number of items in an association to 4, and the minimum confidence for rule generation to 10%. Rules with a lift below 1 are also excluded. After removing duplicate rules, fifteen association rules remain: eleven associate products within the "Office Supplies" product group, one associates "Office Supplies" with "Fashion," and three associate "Office Supplies" with "Home Decoration." Finally, the proposed recommender system provides a list of recommendations to the appropriate customers. We test the usability of the proposed system with a prototype and with real-world transaction and profile data. To this end, we built the prototype using ASP, JavaScript, and Microsoft Access. 
In addition, we surveyed user satisfaction with the product list recommended by the proposed system and with randomly selected product lists. The survey participants were 173 users of MSN Messenger, Daum Café, and P2P services. User satisfaction was measured on a five-point Likert scale, and a paired-sample t-test was performed on the survey results. The results show that the proposed model outperforms the random selection model at the 1% significance level, meaning that users were significantly more satisfied with the recommended product list. The results also suggest that the proposed system may be useful in real-world online shopping stores.
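The rule filtering in the second step (support, confidence, and lift thresholds) can be sketched as below. The thresholds come from the abstract; the transactions, item names, and helper functions are hypothetical, and the study itself reports using SAS E-Miner rather than code like this.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    a, c = set(antecedent), set(consequent)
    n = len(transactions)
    n_a = sum(1 for t in transactions if a <= t)         # baskets containing the antecedent
    n_c = sum(1 for t in transactions if c <= t)         # baskets containing the consequent
    n_ac = sum(1 for t in transactions if (a | c) <= t)  # baskets containing both
    support = n_ac / n
    confidence = n_ac / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift

def keep_rule(metrics, min_support=0.05, min_confidence=0.10, min_lift=1.0):
    """Apply the thresholds quoted in the abstract (5%, 10%, lift of at least 1)."""
    support, confidence, lift = metrics
    return support >= min_support and confidence >= min_confidence and lift >= min_lift

# Hypothetical baskets; item names are illustrative only.
baskets = [{"pen", "notebook"}, {"pen", "notebook"}, {"pen"}, {"scarf"}]
pen_to_notebook = rule_metrics(baskets, ["pen"], ["notebook"])
pen_to_scarf = rule_metrics(baskets, ["pen"], ["scarf"])
```

Here `pen_to_notebook` passes all three thresholds, while `pen_to_scarf` never co-occurs and is dropped.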

Vegetation Characteristics of Ridge in the Seonunsan Provincial Park (선운산도립공원의 능선부 식생 특성)

  • Kang, Hyun-Mi;Park, Seok-Gon;Kim, Ji-Suk;Lee, Sang-Cheol;Choi, Song-Hyun
    • Korean Journal of Environment and Ecology / v.33 no.1 / pp.75-85 / 2019
  • The purpose of this study is to understand the vegetation characteristics of the ridges (Gyeongsusan-Seonunsan-Gaeipalsan) in Seonunsan Provincial Park and to establish reference information for the future management of the park. We set up 62 plots with an area of $100\,\mathrm{m}^2$ and analyzed them to investigate the vegetation characteristics. Community classification based on TWINSPAN identified seven vegetation communities in the surveyed region: Quercus dentata-Deciduous broad-leaved Community, Quercus variabilis-Pinus thunbergii-Quercus serrata Community, Pinus densiflora Community, Deciduous broad-leaved Community-I, Carpinus tschonoskii-Castanea crenata-Quercus aliena Community, Deciduous broad-leaved Community-II, and Carpinus tschonoskii-Carpinus laxiflora Community. In the vegetation of Seonunsan Provincial Park, coniferous trees such as Pinus thunbergii and Pinus densiflora have been gradually declining as ecological succession proceeds toward deciduous broad-leaved trees such as Quercus spp., Carpinus tschonoskii, and Carpinus laxiflora. Moreover, Carpinus turczaninowii, Mallotus japonicus, and others were identified as vegetation reflecting the geographical characteristics of a region neighboring the west coast. The estimated tree age was 30-60 years, and the oldest tree, a Pinus densiflora, was 63 years old. The index of diversity ($100\,\mathrm{m}^2$) was, in ascending order, 0.7942 for Carpinus tschonoskii-Carpinus laxiflora Community, 0.8406 for Carpinus tschonoskii-Castanea crenata-Quercus aliena Community, 0.8543 for Quercus dentata-Deciduous broad-leaved Community, 0.9434 for Quercus variabilis-Pinus thunbergii-Quercus serrata Community, 0.9520 for Deciduous broad-leaved Community-I, 0.9633 for Pinus densiflora Community, and 1.0340 for Deciduous broad-leaved Community-II.

A prediction model for adolescents' skipping breakfast using the CART algorithm for decision trees: 7th (2016-2018) Korea National Health and Nutrition Examination Survey (의사결정나무 CART 알고리즘을 이용한 청소년 아침결식 예측 모형: 제7기 (2016-2018년) 국민건강영양조사 자료분석)

  • Sun A Choi;Sung Suk Chung;Jeong Ok Rho
    • Journal of Nutrition and Health / v.56 no.3 / pp.300-314 / 2023
  • Purpose: This study sought to predict the reasons for skipping breakfast by adolescents aged 13-18 years using the 7th Korea National Health and Nutrition Examination Survey (KNHANES). Methods: The participants included 1,024 adolescents. The data were analyzed using a complex-sample t-test, the Rao-Scott χ²-test, and the classification and regression tree (CART) algorithm for decision tree analysis with SPSS v. 27.0. The participants were divided into two groups, one regularly eating breakfast and the other skipping it. Results: A total of 579 and 445 study participants were found to be breakfast consumers and breakfast skippers, respectively. Breakfast consumers were significantly younger than those who skipped breakfast. In addition, breakfast consumers had a significantly higher frequency of eating dinner, had been taught about nutrition, and had a lower frequency of eating out. The breakfast skippers did so to lose weight. Adolescents who skipped breakfast consumed less energy, carbohydrates, proteins, fats, fiber, cholesterol, vitamin C, vitamin A, calcium, vitamin B1, vitamin B2, phosphorus, sodium, iron, potassium, and niacin than those who consumed breakfast. The best predictor of skipping breakfast was whether adolescents sought to control their weight by not eating meals. Other participants who had low and middle-low household incomes, ate dinner 3-4 times a week, were more than 14.5 years old, and ate out once a day showed a higher frequency of skipping breakfast. Conclusion: Based on these results, nutrition education targeted at losing weight correctly and emphasizing the importance of breakfast, especially for adolescents, is required. Moreover, nutrition educators should consider designing and implementing specific action plans to encourage adolescents to improve their breakfast-eating practices by also eating dinner regularly and reducing eating out.
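A cut-point such as "more than 14.5 years old" is the kind of threshold CART produces when it searches a numeric predictor for the split with the lowest weighted Gini impurity. The sketch below shows that search on hypothetical ages and labels (1 = skips breakfast); it is not the KNHANES data, it ignores the complex-sample weighting, and it does not reproduce the SPSS analysis.

```python
def gini(labels):
    """Gini impurity 1 - sum(p_k^2) of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((labels.count(k) / n) ** 2 for k in set(labels))

def best_threshold(values, labels):
    """Return (threshold, weighted_gini) of the best split `value <= threshold`."""
    distinct = sorted(set(values))
    best = None
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2  # candidate cut halfway between neighbouring values
        left = [lab for v, lab in zip(values, labels) if v <= t]
        right = [lab for v, lab in zip(values, labels) if v > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if best is None or score < best[1]:
            best = (t, score)
    return best

# Hypothetical ages and breakfast-skipping labels, for illustration only.
ages = [13, 14, 14, 15, 16, 17]
skips = [0, 0, 0, 1, 1, 1]
split = best_threshold(ages, skips)
```

In a full tree the same search is repeated over every predictor and then recursively within each resulting group.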

The guideline for choosing the right-size of tree for boosting algorithm (부스팅 트리에서 적정 트리사이즈의 선택에 관한 연구)

  • Kim, Ah-Hyoun;Kim, Ji-Hyun;Kim, Hyun-Joong
    • Journal of the Korean Data and Information Science Society / v.23 no.5 / pp.949-959 / 2012
  • This article seeks the size of decision tree that gives the best performance for the boosting algorithm. We first defined the tree size D as the depth of a decision tree and then compared, by experiment, the performance of boosting with different tree sizes. Although it is usual practice to keep the trees in boosting small, we found that the choice of D has a significant influence on the performance of boosting, and that D needs to be sufficiently large for some datasets. The experimental results show that an optimal D exists for each dataset and that choosing the right D is important for improving boosting performance. We also sought a model for estimating the right D for boosting from variables that describe the nature of a given dataset. The suggested model indicates that the optimal tree size D for a given dataset can be estimated from the error rate of a stump tree, the number of classes, the depth of a single tree, and the Gini impurity.
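The influence of tree depth D on boosting can be illustrated with a toy example. The sketch below is an AdaBoost-style binary booster over depth-limited trees, run on an XOR pattern where stumps (D = 1) cannot help but D = 2 fits exactly. The dataset, function names, and the choice of AdaBoost are assumptions made for illustration; the article's own boosting variant and datasets are not specified in the abstract.

```python
import math

def weighted_vote(ys, ws):
    """Label (+1 or -1) carrying the larger total weight; ties go to +1."""
    return 1 if sum(w * y for y, w in zip(ys, ws)) >= 0 else -1

def gini(ys, ws):
    """Weighted Gini impurity for labels in {-1, +1}."""
    total = sum(ws)
    if total == 0:
        return 0.0
    p = sum(w for y, w in zip(ys, ws) if y == 1) / total
    return 2 * p * (1 - p)

def build_tree(X, y, w, depth):
    """Greedy tree with at most `depth` levels of splits (weighted Gini)."""
    if depth == 0 or len(set(y)) == 1:
        return weighted_vote(y, w)
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [i for i, row in enumerate(X) if row[j] <= t]
            right = [i for i, row in enumerate(X) if row[j] > t]
            score = sum(
                sum(w[i] for i in side) * gini([y[i] for i in side], [w[i] for i in side])
                for side in (left, right)
            )
            if best is None or score < best[0]:
                best = (score, j, t, left, right)
    if best is None:  # no feature has two distinct values left
        return weighted_vote(y, w)
    _, j, t, left, right = best
    grow = lambda idx: build_tree(
        [X[i] for i in idx], [y[i] for i in idx], [w[i] for i in idx], depth - 1
    )
    return (j, t, grow(left), grow(right))

def predict_tree(tree, row):
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if row[j] <= t else right
    return tree

def adaboost(X, y, depth, rounds=10):
    """Return a list of (alpha, tree) pairs fitted with depth-`depth` trees."""
    w = [1.0 / len(X)] * len(X)
    ensemble = []
    for _ in range(rounds):
        tree = build_tree(X, y, w, depth)
        preds = [predict_tree(tree, row) for row in X]
        err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
        err = min(max(err, 1e-10), 1 - 1e-10)  # clip to avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, tree))
        w = [wi * math.exp(-alpha * yi * p) for wi, yi, p in zip(w, y, preds)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def training_error(X, y, depth, rounds=10):
    model = adaboost(X, y, depth, rounds)
    votes = [sum(a * predict_tree(t, row) for a, t in model) for row in X]
    return sum((1 if v >= 0 else -1) != yi for v, yi in zip(votes, y)) / len(y)

# Toy XOR data: label is +1 when the two features differ.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [-1, 1, 1, -1]
err_by_depth = {d: training_error(X, y, d) for d in (1, 2)}
```

On this toy problem no number of boosting rounds rescues the stumps, which is one simple way to see why a dataset with strong feature interactions may need a larger D.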

A Comparative Study on the Natural Monument Management Policies of South and North Korea (남.북한의 천연기념물 관리제도 비교)

  • Na, Moung-Ha;Hong, Youn-Soon;Kim, Hak-Beom
    • Journal of the Korean Institute of Landscape Architecture / v.35 no.2 s.121 / pp.71-80 / 2007
  • Korea began preserving and managing natural monuments in 1933 under Japanese colonial rule, but North Korea and South Korea were forced to establish separate natural monument management policies because of the division that followed Korean independence. The purpose of this study is to compare and analyze the natural monument management policies of South and North Korea between 1933 and 2005 in order to introduce new policies for Korean unification. The results are as follows. First, South Korea manages every type of cultural asset, including natural monuments, through the 'Cultural Heritage Protection Act,' whereas North Korea manages its cultural assets through the 'Cultural Relics Protection Act' and the 'Landmark/Natural Monument Protection Act.' Second, South Korea preserves and utilizes natural monuments for the purpose of promoting the cultural experience of the Korean people and contributing to the development of world culture, whereas North Korea uses its natural monuments to promote the superiority of socialism and protect its ruling power. Third, North and South Korea have similar classification systems for animals, plants, and geology, but North Korea also classifies geography as one of its natural monument categories. Unlike South Korea, North Korea also designates imported animals and plants, not only for the preservation and research of genetic resources but also for their value as economic resources. Fourth, North Korea authorizes the Cabinet to designate and cancel natural monuments, whereas in South Korea the Cultural Heritage Administration designates and cancels them after deliberation by a Cultural Heritage Committee. In both Koreas the central administration establishes policies and local governments carry them out, but their management systems are quite different. 
In conclusion, it is important to establish specific laws for the conservation of natural heritage and clear standards of designation in order to improve the preservation and management system and to sustain the diversity of natural preservation. Moreover, it is also necessary to discover resources in various fields, designate protection zones, and preserve imported trees. By doing so, we can improve South Korea's natural monument management policies and ultimately enhance national homogeneity in preparation for the future reunification of the Koreas.

A Study on the Applicability of Deep Learning Algorithm for Detection and Resolving of Occlusion Area (영상 폐색영역 검출 및 해결을 위한 딥러닝 알고리즘 적용 가능성 연구)

  • Bae, Kyoung-Ho;Park, Hong-Gi
    • Journal of the Korea Academia-Industrial cooperation Society / v.20 no.11 / pp.305-313 / 2019
  • Recently, spatial information is being constructed actively based on the images obtained by drones. Because occlusion areas occur due to buildings as well as many obstacles, such as trees, pedestrians, and banners in the urban areas, an efficient way to resolve the problem is necessary. Instead of the traditional way, which replaces the occlusion area with other images obtained at different positions, various models based on deep learning were examined and compared. A comparison of a type of feature descriptor, HOG, to the machine learning-based SVM, deep learning-based DNN, CNN, and RNN showed that the CNN is used broadly to detect and classify objects. Until now, many studies have focused on the development and application of models so that it is impossible to select an optimal model. On the other hand, the upgrade of a deep learning-based detection and classification technique is expected because many researchers have attempted to upgrade the accuracy of the model as well as reduce the computation time. In that case, the procedures for generating spatial information will be changed to detect the occlusion area and replace it with simulated images automatically, and the efficiency of time, cost, and workforce will also be improved.

Analysis on Characteristics of Sediment Produce by Landslide in a Basin 1. Simulation of Sediment Produce and its Verification (유역 내에서의 산사태에 의한 토사발생특성 분석 1. 토사발생모의 및 검증)

  • Yoo, Chul-Sang;Kim, Kee-Wook;Kim, Seong-Joon;Lee, Mi-Seon
    • Journal of the Korean Society of Hazard Mitigation / v.10 no.3 / pp.133-145 / 2010
  • This study analyzed the characteristics of sediment production by rainfall-triggered landslides. A one-dimensional unsaturated groundwater model and infinite slope stability analysis were used to estimate soil moisture behavior and slope stability under rainfall, respectively. The slope stability analysis took soil depth and tree characteristics into account. When the recovery of failed slopes was considered, large amounts of sediment were produced in 1963, 1970, and 2002. Verification of the simulation against Landsat 5 TM images showed differences in landslide location between the model results and the satellite images. These differences may be caused by uncertainty in the coarse model parameters. However, when the Obong-dam basin was divided into two subbasins, the Wangsan-chun and Doma-chun basins, the error for each subbasin was around 20%, and the error in landslide area for the entire Obong-dam basin was only 4%. These errors seem insignificant given the errors that can arise from the analyses in this study, such as the estimation of sediment production, soil cover classification, and the estimation of landslide area.
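For orientation, the textbook infinite slope factor of safety can be sketched as below. This is the standard general form, FS = [c + (γ z cos²β − u) tan φ] / (γ z sin β cos β), and is an assumption about the kind of analysis meant: the abstract does not give the authors' exact formulation or how soil depth and tree characteristics (for example root cohesion or surcharge) entered it. All parameter names and values are hypothetical.

```python
import math

def factor_of_safety(cohesion, unit_weight, depth, slope_deg, friction_deg,
                     pore_pressure=0.0):
    """Infinite slope factor of safety (textbook form, consistent units assumed).

    cohesion      : effective cohesion, e.g. kPa (root cohesion could be added here)
    unit_weight   : soil unit weight, e.g. kN/m^3
    depth         : vertical depth of the failure plane, e.g. m
    pore_pressure : pore water pressure on the failure plane, e.g. kPa
    """
    beta = math.radians(slope_deg)
    phi = math.radians(friction_deg)
    normal_stress = unit_weight * depth * math.cos(beta) ** 2
    shear_stress = unit_weight * depth * math.sin(beta) * math.cos(beta)
    resisting = cohesion + (normal_stress - pore_pressure) * math.tan(phi)
    return resisting / shear_stress

# Hypothetical dry, cohesionless slope: FS reduces to tan(phi) / tan(beta).
dry_fs = factor_of_safety(0.0, 18.0, 1.5, slope_deg=30, friction_deg=35)
# Same slope with hypothetical pore pressure after rainfall.
wet_fs = factor_of_safety(0.0, 18.0, 1.5, slope_deg=30, friction_deg=35,
                          pore_pressure=8.0)
```

Rising pore pressure lowers the factor of safety, which is the link between the soil moisture model and slope failure; a value below 1 indicates failure in this simple model.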

Plant Community Structure Characteristic of the Evergreen Forest, Cheonjangsan(Mt.) at GeoJae (거제도 천장산 일대 상록활엽수림의 식물군집구조 특성)

  • Lee, Gyounggyu;Lee, Soo-Dong;Kim, Ji-Suk;Cho, Bong-Gyo
    • Korean Journal of Environment and Ecology / v.33 no.6 / pp.708-721 / 2019
  • This study was conducted to understand the plant community structure of the warm-temperate forest on Geoje Island. Survey sites were set up on ridges, valleys, and slopes of Cheonjangsan (Mt.) where evergreen broad-leaved trees predominated or were distributed in the canopy, sub-canopy, or shrub layers. Thirty-one sites were placed across vegetation communities, ridges, valleys, and slopes to observe changes in vegetation structure with location. Community classification with TWINSPAN identified six groups: Neolitsea sericea-Platycarya strobilacea, N. sericea-Styrax japonicus, N. sericea-Euonymus oxyphyllus, Pinus thunbergii-N. sericea, N. sericea-Quercus serrata, and Q. variabilis-P. strobilacea. Given previous studies reporting that succession in warm-temperate forests progresses from deciduous to evergreen forest, the areas dominated by communities of P. thunbergii, Q. serrata, P. strobilacea, Zelkova serrata, and Q. variabilis are likely to change into evergreen forest dominated by N. sericea. The relationship between environmental factors and vegetation distribution showed that slope, Na⁺, K⁺, electrical conductivity, and, among physical properties, clay had direct or indirect effects on vegetation distribution.