• Title/Summary/Keyword: 분류나무 (classification tree)

An Empirical Study of Profiling Model for the SMEs with High Demand for Standards Using Data Mining (데이터마이닝을 이용한 표준정책 수요 중소기업의 프로파일링 연구: R&D 동기와 사업화 지원 정책을 중심으로)

  • Jun, Seung-pyo;Jung, JaeOong;Choi, San
    • Journal of Korea Technology Innovation Society / v.19 no.3 / pp.511-544 / 2016
  • Standards boost technological innovation by promoting information sharing, compatibility, stability and quality. Identifying the groups of companies that benefit most from these functions of standards in their technological innovation and commercialization helps to tailor the planning and implementation of standards-related policies to demand groups. For this purpose, this study profiles SMEs whose R&D objective is to respond to standards, as well as those that need to implement a standards system for technological commercialization, and then suggests a prediction model that can distinguish such companies from others. To this end, decision tree analysis is conducted through data mining to profile the characteristics of the subject SMEs. The subject SMEs include (1) those that engage in R&D to respond to standards (Group 1) or (2) those in need of product standard or technological certification policies for commercialization purposes (Group 2). The study then proposes a prediction model that distinguishes Groups 1 and 2 from other companies on the basis of several variables by adopting discriminant analysis. The practicality of the discriminant formula is statistically verified. The study suggests that Group 1 companies are distinguished by variables such as time spent on R&D planning, Korean Standard Industry Classification (KSIC) category, number of employees and novelty of technologies. The profiling results for Group 2 companies suggest that they are differentiated by variables such as KSIC category, major clients of the companies, time spent on R&D and ability to test and verify their technologies. The prediction model proposed herein is designed on the basis of the outcomes of the profiling and discriminant analysis. Its purpose is to serve the planning and implementation of standards-related policies by providing objective information on companies in need of relevant support, and thereby to enhance the overall success rate of standards-related projects.

Distribution, Habitat Characteristics and Assessment of the Conservation Status of a Rare Mistletoe Species, Loranthus tanakae (Loranthaceae) in Korea (희귀식물 꼬리겨우살이의 분포와 생태적 특성 및 보전지위 평가)

  • Lee, Su Gwang;Chung, Jae Min;Kim, Sung Sik;Woo, Su Young;Kang, Ho Duck
    • Journal of Korean Society of Forest Science / v.102 no.3 / pp.428-436 / 2013
  • To obtain basic biological data for establishing conservation strategies for a rare mistletoe species, Loranthus tanakae (Loranthaceae), in Korea, the distribution range, habitat characteristics and conservation status of natural populations of L. tanakae were investigated. The natural populations of L. tanakae were distributed along the Baekdudaegan from Mt. Chiri to Mt. Seorak on the Korean Peninsula, and 97.8% of the surveyed individuals were found in Gangwon Province. In the natural populations, 1,385 individuals of L. tanakae were parasitic on 480 host trees and were distributed on sunny mountain ridges at altitudes of 353 m to 1,250 m. The host trees of L. tanakae comprised 5 families, 6 genera, 9 species and 1 subspecies; of these, Quercus mongolica was preferred, at 81.5% (389 trees among 480 host trees). In an assessment of conservation status under the IUCN criteria, L. tanakae was evaluated as Vulnerable (VU). The Mt. Seorak, Mt. Taegi and Mt. Odae populations, the habitats with the highest density of natural populations of L. tanakae, should be designated as protected areas. Conservation strategies and related methods for the sustainable conservation of the natural populations of L. tanakae are also discussed.

Identification of host plant species of Balanophora fungosa var. indica from Phnom Bokor National Park of Cambodia using DNA barcoding technique (캄보디아 프놈보콜국립공원의 Balanophora fungosa var. indica의 숙주식물에 대한 DNA barcoding 기법을 통한 동정)

  • Kim, Joo Hwan;Won, Hyosig
    • Korean Journal of Plant Taxonomy / v.43 no.4 / pp.252-262 / 2013
  • During a floristic survey of Phnom Bokor National Park, Kampot, Cambodia, we encountered Balanophora fungosa var. indica, a tropical holoparasitic plant. To identify its host species, we collected host roots and nearby trees and tried to identify them using a DNA barcoding approach. We applied the plastid rbcL and matK gene regions as DNA barcode markers, and successfully amplified and sequenced the markers from 15 host roots and seven tree samples. The host root sequences were identified as Primulaceae, Celastraceae, Myrtaceae, and Oleaceae, while the nearby trees were Oleaceae, Myrtaceae, Sapindaceae, Rosaceae, Clusiaceae, Ericaceae, and Lauraceae. At the genus level, the hosts were identified as Myrsine, Euonymus, Syzygium, and Olea, but discrimination at the species level failed. Myrsine (Primulaceae) and Olea (Oleaceae) are reported here as hosts of B. fungosa var. indica for the first time. Further sampling, comparative work, and DNA barcoding will help to characterize the biodiversity of the area and the host species of Balanophora, together with their evolution.

Analysis of Important Indicators of TCB Using GBM (일반화가속모형을 이용한 기술신용평가 주요 지표 분석)

  • Jeon, Woo-Jeong(Michael);Seo, Young-Wook
    • The Journal of Society for e-Business Studies / v.22 no.4 / pp.159-173 / 2017
  • To provide technology finance support to technology-based small and medium-sized venture companies, the government implemented TCB evaluation, a kind of technology rating evaluation, carried out by Kibo and qualified private TCBs. In this paper, we briefly review the current state of TCB evaluation and the available technology evaluation indicators accumulated in the Korea Credit Information Services (TDB), and then use multiple regression to explore the indicators that have a significant effect on the technology rating score. The key indicators were then applied as independent features to the generalized boosting model (GBM), a representative machine learning classifier, and the relative importance of the indicators and the classification accuracy were calculated as the class influence and the fitness of each model. The analysis showed that the relative importance of the indicators did not differ significantly between the two models. However, the GBM model gave more weight to the InnoBiz certification, R&D department, patent registration and venture confirmation indicators than the regression model did.

Syntaxonomy and Syngeography of Korean Red Pine (Pinus densiflora) Forests in Korea (한국 소나무림의 군락분류와 군락지리)

  • Chun, Young-Moon;Lee, Ho-Joon;Hayashi, Ichiroku
    • Korean Journal of Environment and Ecology / v.21 no.3 / pp.257-277 / 2007
  • We carried out a phytosociological study of pine forests in Korea using the method of the Zurich-Montpellier School. We collected data from 252 relevés at 45 sites in pine forests throughout the Korean Peninsula and its attached islands. The vegetation of the pine forests was classified into one association, three communities and seven subcommunities as follows: A: Quercus mongolica-Pinus densiflora community, A-1: Typical subcommunity, A-2: Vaccinium koreanum subcommunity, A-3: Rhododendron micranthum subcommunity, B: Quercus serrata-Pinus densiflora community, B-1: Typical subcommunity, B-2: Juniperus rigida subcommunity, B-3: Styrax japonica subcommunity, B-4: Eurya japonica subcommunity, C: Saso-Pinetum densiflorae Yim et al. 1990, and D: Castanopsis cuspidata var. sieboldii-Pinus densiflora community. The first three were integrated into the Lindero-Quercion mongolicae Kim 1990 em. 1992. The Castanopsis cuspidata var. sieboldii-Pinus densiflora community remains to be studied further to determine its association. The Quercus mongolica-Pinus densiflora community was distributed throughout the montane zone in the central-northern part of the Korean Peninsula. The Quercus serrata-Pinus densiflora community occurred widely in the sub-montane and hilly areas of the central and southern Korean Peninsula. The association Saso-Pinetum densiflorae was found on Cheju Island. The Castanopsis cuspidata var. sieboldii-Pinus densiflora community was distributed in the warm-temperate zone, including islands off the south-west coast of the Peninsula.

Predicting Corporate Bankruptcy using Simulated Annealing-based Random Forest (시뮬레이티드 어니일링 기반의 랜덤 포레스트를 이용한 기업부도예측)

  • Park, Hoyeon;Kim, Kyoung-jae
    • Journal of Intelligence and Information Systems / v.24 no.4 / pp.155-170 / 2018
  • Predicting a company's financial bankruptcy is traditionally one of the most crucial forecasting problems in business analytics. In previous studies, prediction models have been proposed by applying or combining statistical and machine learning-based techniques. In this paper, we propose a novel intelligent prediction model based on simulated annealing, one of the well-known optimization techniques. Simulated annealing is known to have optimization performance comparable to that of genetic algorithms. Nevertheless, since there has been little research on prediction and classification in business decision-making problems using simulated annealing, it is meaningful to confirm the usefulness of the proposed model in business analytics. In this study, we use a combined model of simulated annealing and machine learning to select the input features of the bankruptcy prediction model. Typical ways of combining optimization and machine learning techniques are feature selection, feature weighting, and instance selection. This study proposes a combined model for feature selection, the most studied of the three. To confirm the superiority of the proposed model, we apply real-world financial data of Korean companies and analyze the results. The results show that the predictive accuracy of the proposed model is better than that of the naïve model. Notably, the performance is significantly improved compared with the traditional decision tree, random forests, artificial neural network, SVM, and logistic regression analysis.
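
The feature-selection idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name, the cooling schedule, the one-bit-flip neighbourhood and the toy scoring function are all assumptions made for illustration. In the setting described above, the score would presumably be the validation accuracy of a classifier trained on the selected features.

```python
import math
import random


def simulated_annealing_feature_selection(n_features, score, n_iter=500,
                                          t0=1.0, cooling=0.99, seed=0):
    """Search for a boolean feature mask that maximizes score(mask).

    All parameter names and defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    current = [rng.random() < 0.5 for _ in range(n_features)]
    current_score = score(current)
    best, best_score = current[:], current_score
    temp = t0
    for _ in range(n_iter):
        # Neighbour: flip one randomly chosen feature in or out.
        candidate = current[:]
        i = rng.randrange(n_features)
        candidate[i] = not candidate[i]
        cand_score = score(candidate)
        delta = cand_score - current_score
        # Always accept improvements; accept worse masks with a
        # probability that shrinks as the temperature falls.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_score = candidate, cand_score
            if current_score > best_score:
                best, best_score = current[:], current_score
        temp *= cooling
    return best, best_score


# Hypothetical stand-in for classifier accuracy: features 0, 2 and 5
# are "useful", every other selected feature costs a small penalty.
USEFUL = {0, 2, 5}


def toy_score(mask):
    hits = sum(1 for i, on in enumerate(mask) if on and i in USEFUL)
    noise = sum(1 for i, on in enumerate(mask) if on and i not in USEFUL)
    return hits - 0.5 * noise


best_mask, best_value = simulated_annealing_feature_selection(8, toy_score)
print(best_mask, best_value)
```

Accepting occasional worse masks early on is what lets the search escape local optima; as the temperature decays the procedure behaves more and more like greedy hill climbing.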

Restoration of endangered orchid species, Dendrobium moniliforme (L.) Sw. (Orchidaceae) in Korea (멸종위기 난과 식물 석곡의 복원)

  • Kim, Young-kee;Kang, Kyung-Won;Kim, Ki-Joong
    • Korean Journal of Plant Taxonomy / v.46 no.2 / pp.256-266 / 2016
  • A total of 13,000 individuals of Dendrobium moniliforme (L.) Sw. artificially propagated in laboratories and greenhouses were restored in their natural habitat of Bogildo Island, Wandogun, in the southern part of Korea in June of 2013. The growing conditions of the individuals were monitored for two years. The parental individuals for the restoration were obtained from a wild population in southern Korea, from which seeds were produced via artificial crossings. These seeds were germinated and cultivated in growing media and two-year-old plants were then grown in greenhouse beds. The genetic diversity among the propagated individuals was confirmed by examining DNA sequences of five regions of the chloroplast genome and the nuclear ITS region. The diversity values were as high as the average values of natural populations. All propagated individuals were transplanted into two different sites on Bogildo by research teams with local residents and national park rangers. After restoration, we counted and measured the surviving individuals, vegetatively propagated stems, and growth rates in June of both 2014 and 2015. There was no human interference, and 97% of the individuals survived. The number of propagules increased by 227% in two years. In contrast, the average length of the stems decreased during the period. In addition, different survival and propagation rates were recorded depending on the host plants and the restored sites. The shaded sides of rock cliffs and the bark of Quercus salicina showed the best propagation rates, followed by the bark of Camellia japonica. A few individuals of D. moniliforme successfully flowered, pollinated, and fruited after restoration. Overall, our monitoring data over two years indicate that the restored individuals were well adapted and vigorously propagated at the restored sites. In order to prevent human disturbance of the restored sites, a CCTV monitoring system powered by a solar panel was installed after the restoration. In addition, a human surveillance system is operated by national park rangers with local residents.

A Study on Researches of Resource-plants for Special Use or Purpose - Based on the Articles Published in the Journal of Korean Forestry - (특용자원식물(特用資源植物)의 연구(硏究) - 한국임학회지에 게재된 논문을 중심으로 -)

  • Yi, Jae-Seon;Kim, Chul-Woo;Song, Jae-Mo;Bae, Chan-Ho;Kang, Hyo-Jin;Hwang, Suk-In;Moon, Heung-Kyu
    • Journal of Forest and Environmental Science / v.19 no.1 / pp.85-98 / 2003
  • The articles published in the Journal of Korean Forestry from Number 1 (1962) to Number 6, Volume 91 (2002) were surveyed to analyze research trends concerning resource-plants for special use or purpose, i.e., edible plants, medicinal plants, feed resources, landscape plants, fiber plants, plants for industrial usage, and bee plants. Research whose purpose or subject matter was construction or furniture timber production, mushrooms and/or pulp and paper was not included in this study. The articles were further classified by research content into 14 categories: habitat environment, ecology, physiology, propagation, silviculture (tending and culture), genetics and breeding, identification, insect and disease control, animal-related research, component analysis, vegetation survey, biotechnology, management, and review. Of the total 1,434 articles published, 396 (27.6%) were related to plants for special use or purpose. Vegetation survey accounted for 60 articles (15.2%), physiology 56 (14.1%), genetics and breeding 56 (14.1%), propagation 53 (13.4%), and ecology 37 (9.3%). The silviculture research field included 11 articles (2.8%), which indicates that the management of resource-plants is still far from generating economic income, as also seen in the low number of articles in the management research field, i.e., only 6 reports (1.5%). Korean white pine was the most popular research subject, with 42 articles, followed by Robinia pseudoacacia with 23, Castanea crenata with 14, and ginkgo tree with 14. Research related to these species had focused mainly on propagation, physiology, genetics and breeding, ecology and pest control. Based on this survey and analysis, the following are suggested: 1. More research is required on forest herbaceous plants. 2. Cooperative research with other industrial and/or scientific areas is recommended for commercialization, including medicine, cosmetics, and food. 3. Research on resource-plant conservation, which includes biology, social education and policy, should be supported for the next generation. 4. More mutual correspondence and information exchange about research results between researchers and institutes is needed than at present.

Analysis of Vegetation-Environment Relationships of Main Wild Vegetables on Short-term Income Forest Products, in Korea (단기소득임산물 자생지 주요 산채류 식생과 환경의 상관관계 분석)

  • Kim, Hyoun-Sook;Lee, Sang-Myong;Lee, Joongku
    • Korean Journal of Environment and Ecology / v.33 no.4 / pp.447-452 / 2019
  • This study was conducted in 2016-2017 to provide the basic ecological data needed to establish environmental conditions for the cultivation of wild vegetables. It used TWINSPAN to classify the vegetation structure of natural habitats of wild vegetables nationwide and DCCA ordination to analyze the correlation between community structure and environmental factors. We performed TWINSPAN on 100 taxa with high importance values in 91 plots of major habitats of wild vegetables. The vegetation was classified into Cirsium setidens and Synurus deltoides group, Ligularia fischeri and Hemerocallis fulva group, Adenophora divaricata var. manshurica group, Platycodon grandiflorum and Aster scaber group, Aralia elata and Pteridium aquilinum group, and Pimpinella brachycarpa and Osmunda japonica group communities. We then performed DCCA ordination of 11 communities classified by TWINSPAN and 11 environmental factors. The results showed that altitude had the strongest correlation with the vegetation. The Cirsium setidens, Synurus deltoides, and Ligularia fischeri communities were distributed in areas with similar environmental factors such as high altitude, gentle slope, and nutrient conditions. The Aralia elata and Osmunda japonica communities were distributed in locations with low altitude, pH, O.M, T-N, $Ca^{2+}$, and C.E.C. The Hemerocallis fulva community was distributed in locations with moderate northeastern and northwestern slope, low altitude and pH, and high $P_2O_5$, whereas the Adenophora divaricata var. manshurica community was distributed in locations with gentle southeastern and southwestern slope, high altitude and pH, and low $P_2O_5$, the opposite tendency to that of the Hemerocallis fulva community. The Platycodon grandiflorum community was distributed in locations with gentle southwestern slope, low altitude, pH, O.M, T-N, $P_2O_5$, $Ca^{2+}$, and C.E.C., and high $Mg^{2+}$. The Pteridium aquilinum community was distributed in locations with southwestern slope, low altitude, O.M, T-N, C.E.C, $P_2O_5$, $Ca^{2+}$, and $K^+$. The Aster scaber and Pimpinella brachycarpa communities were widely distributed in many plots with various location environments.

A Hybrid SVM Classifier for Imbalanced Data Sets (불균형 데이터 집합의 분류를 위한 하이브리드 SVM 모델)

  • Lee, Jae Sik;Kwon, Jong Gu
    • Journal of Intelligence and Information Systems / v.19 no.2 / pp.125-140 / 2013
  • We call a data set in which the number of records belonging to a certain class far outnumbers the number of records belonging to the other class an 'imbalanced data set'. Most classification techniques perform poorly on imbalanced data sets. When we evaluate the performance of a classification technique, we need to measure not only 'accuracy' but also 'sensitivity' and 'specificity'. In a customer churn prediction problem, 'retention' records account for the majority class, and 'churn' records account for the minority class. Sensitivity measures the proportion of actual retentions that are correctly identified as such. Specificity measures the proportion of churns that are correctly identified as such. The poor performance of classification techniques on imbalanced data sets is due to the low value of specificity. Many previous studies on imbalanced data sets employed the 'oversampling' technique, in which members of the minority class are sampled more than those of the majority class in order to make a relatively balanced data set. When a classification model is constructed using this oversampled balanced data set, specificity can be improved but sensitivity will be decreased. In this research, we developed a hybrid model of support vector machine (SVM), artificial neural network (ANN) and decision tree that improves specificity while maintaining sensitivity. We named this hybrid model the 'hybrid SVM model.' The process of construction and prediction of our hybrid SVM model is as follows. By oversampling from the original imbalanced data set, a balanced data set is prepared. The SVM_I model and ANN_I model are constructed using the imbalanced data set, and the SVM_B model is constructed using the balanced data set. The SVM_I model is superior in sensitivity and the SVM_B model is superior in specificity. For a record on which both the SVM_I model and the SVM_B model make the same prediction, that prediction becomes the final solution. If they make different predictions, the final solution is determined by the discrimination rules obtained by ANN and decision tree. For the records on which the SVM_I model and SVM_B model make different predictions, a decision tree model is constructed using the ANN_I output value as input and actual retention or churn as target. We obtained the following two discrimination rules: 'IF ANN_I output value < 0.285, THEN Final Solution = Retention' and 'IF ANN_I output value $\geq 0.285$, THEN Final Solution = Churn.' The threshold 0.285 is the value optimized for the data used in this research. The result we present in this research is the structure or framework of our hybrid SVM model, not a specific threshold value such as 0.285. Therefore, the threshold value in the above discrimination rules can be changed to any value depending on the data. In order to evaluate the performance of our hybrid SVM model, we used the 'churn data set' in the UCI Machine Learning Repository, which consists of 85% retention customers and 15% churn customers. The accuracy of the hybrid SVM model is 91.08%, which is better than that of the SVM_I model or SVM_B model. The points worth noticing here are its sensitivity, 95.02%, and specificity, 69.24%. The sensitivity of the SVM_I model is 94.65%, and the specificity of the SVM_B model is 67.00%. Therefore the hybrid SVM model developed in this research improves on the specificity of the SVM_B model while maintaining the sensitivity of the SVM_I model.
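
The combination step and the two evaluation measures described in this abstract can be sketched as follows. This is an illustrative reading of the abstract, not the authors' code: the function names are hypothetical, the model outputs are assumed to be precomputed, and the five example records are invented purely to exercise the rule. Following the abstract, 'retention' is treated as the positive class, and 0.285 is the data-specific threshold the authors report.

```python
def hybrid_predict(svm_i_pred, svm_b_pred, ann_i_output, threshold=0.285):
    """Combine the two SVM predictions as described in the abstract.

    If SVM_I and SVM_B agree, their shared prediction is final;
    otherwise the ANN_I output is compared with the threshold.
    """
    if svm_i_pred == svm_b_pred:
        return svm_i_pred
    return "retention" if ann_i_output < threshold else "churn"


def sensitivity_specificity(actual, predicted, positive="retention"):
    """Sensitivity: share of actual retentions predicted as retention.
    Specificity: share of actual churns predicted as churn."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    tn = sum(1 for a, p in pairs if a != positive and p != positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    return tp / (tp + fn), tn / (tn + fp)


# Invented example records (not data from the paper).
actual = ["retention", "retention", "retention", "churn", "churn"]
svm_i = ["retention", "retention", "retention", "retention", "churn"]
svm_b = ["retention", "churn", "retention", "churn", "churn"]
ann_i = [0.10, 0.20, 0.05, 0.60, 0.90]

hybrid = [hybrid_predict(i, b, a) for i, b, a in zip(svm_i, svm_b, ann_i)]
print(hybrid)
print(sensitivity_specificity(actual, svm_i))
print(sensitivity_specificity(actual, svm_b))
print(sensitivity_specificity(actual, hybrid))
```

In this toy case SVM_I misses one churn and SVM_B misses one retention, while the hybrid rule resolves both disagreements correctly. As the abstract notes, the threshold should be re-derived for each data set rather than reused.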