• Title/Summary/Keyword: forest statistics

Site Index Equations and Estimation of Productive Areas for Major Pine Species by Climatic Zones Using Environmental Factors (기후대별 입지환경 인자에 의한 소나무류의 지위지수 추정식 및 적지 구명)

  • Shin, Man-Yong;Won, Hyung-Kyu;Lee, Seung-Woo;Lee, Yoon-Young
    • Korean Journal of Agricultural and Forest Meteorology / v.9 no.3 / pp.179-187 / 2007
  • This study developed site index equations for several pine species by climatic zone, based on the relationships between site index and environmental factors. The selected pine species were Pinus densiflora Sieb. et Zucc., Pinus densiflora for. erecta, and Pinus thunbergii. A total of 28 environmental factors were obtained from a digital forest site map, and their influence on site index was evaluated by multiple regression analysis. Four to eight environmental factors were selected for the final site index equation of each species and climatic zone. The site index equations were then verified with three evaluation statistics: the model's estimation bias, the model's precision, and a mean square error type of measure. We concluded that the site index equations for the pine species by climatic zone were capable of estimating forest site productivity. Based on these equations, the amount of productive area for each species by climatic zone was estimated by applying GIS techniques to digital forest maps.
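
The abstract names three evaluation statistics but gives no formulas. The sketch below shows one common way such statistics are computed; the definitions are assumptions, not taken from the paper: bias as the mean residual, precision as the sample standard deviation of the residuals, and the mean square error type of measure as bias squared plus precision squared. The function name `evaluation_statistics` is hypothetical.

```python
from statistics import mean, stdev


def evaluation_statistics(observed, predicted):
    """Return (bias, precision, mse_type) for paired observations.

    Assumed definitions (not confirmed by the abstract):
      bias      = mean of residuals (observed - predicted)
      precision = sample standard deviation of residuals
      mse_type  = bias**2 + precision**2
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equally long sequences with >= 2 values")
    residuals = [o - p for o, p in zip(observed, predicted)]
    bias = mean(residuals)
    precision = stdev(residuals)
    mse_type = bias ** 2 + precision ** 2
    return bias, precision, mse_type


# Illustrative site index values in metres (made-up numbers).
bias, precision, mse_type = evaluation_statistics([10, 12, 14], [9, 12, 15])
print(bias, precision, mse_type)
```

A bias near zero with small precision and mse_type values would indicate that an equation neither over- nor under-estimates site index systematically.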

Assessment of Carbon Stock and Uptake by Estimation of Stem Taper Equation for Pinus densiflora in Korea (우리나라 소나무의 수간곡선식 추정에 의한 탄소저장량 및 흡수량 산정)

  • Kang, Jin-Taek;Son, Yeong-Mo;Jeon, Ju-Hyeon;Lee, Sun-Jeoung
    • Journal of Climate Change Research / v.8 no.4 / pp.415-424 / 2017
  • This study estimated carbon stocks of Pinus densiflora by deriving tree volumes for each tree height and DBH class, applying a suitable stem taper equation and tree-specific carbon emission factors to growth data collected from all over the country. Information on distribution area, tree age, number of trees per hectare, tree volume, and volume stocks was obtained from the $5^{th}$ National Forest Inventory (2006~2010) and the Statistical yearbook of forest (2016), and the method provided in the IPCC GPG was applied to estimate carbon stock and uptake. The performance of Kozak's model, $d=a_1DBH^{a_2}a_3^{DBH}X^{b_1Z^2+b_2\ln(Z+0.001)+b_3\sqrt{Z}+b_4e^Z+b_5(\frac{DBH}{H})}$, a well-known equation in stem taper estimation, in predicting stem diameter at a specific point along the stem of Pinus densiflora was evaluated with three validation statistics: Fitness Index, Bias, and Standard Error of Bias. Kozak's model turned out to be suitable on all validation statistics. A stem volume table for P. densiflora was derived by applying Kozak's model, and carbon stock tables by tree height and DBH were developed with the country-specific carbon emission factors ($WD=0.445t/m^3$, BEF = 1.445, R = 0.255) of P. densiflora. Carbon uptake by province was highest in Gangwon-do ($9.4tCO_2/ha/yr$), followed by Gyeongsangnam-do and Gyeonggi-do ($8.7tCO_2/ha/yr$), Chungcheongnam-do ($7.9tCO_2/ha/yr$), and Gyeongsangbuk-do ($7.8tCO_2/ha/yr$); Jeju-do was the lowest with $6.8tC/ha/yr$. Total carbon stocks of P. densiflora were 127,677 thousand tC, which is 25.5% of the total for forests, and carbon stock and uptake per hectare were $84.5tC/ha/yr$ and $7.8tCO_2/ha/yr$, respectively.
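
The taper equation in the abstract can be evaluated directly once X and Z are defined. The abstract does not define them, so the sketch below assumes the usual definitions from Kozak's variable-exponent model: Z = h/H (relative height) and X = (1 - sqrt(h/H)) / (1 - sqrt(p)), where p is the relative height of the inflection point. The value p = 0.25 and all coefficients in the example are hypothetical placeholders, not the fitted values from the paper.

```python
import math


def kozak_diameter(h, dbh, height, a, b, p=0.25):
    """Stem diameter d at height h above ground, Kozak-type taper model.

    a = (a1, a2, a3) and b = (b1, b2, b3, b4, b5) are model coefficients.
    Assumed (not stated in the abstract): Z = h / H,
    X = (1 - sqrt(h / H)) / (1 - sqrt(p)), p = relative inflection height.
    """
    if not 0 <= h <= height:
        raise ValueError("h must lie between 0 and total tree height")
    a1, a2, a3 = a
    b1, b2, b3, b4, b5 = b
    z = h / height
    x = (1 - math.sqrt(z)) / (1 - math.sqrt(p))
    if x == 0:
        return 0.0  # tree tip: diameter is zero by construction
    exponent = (b1 * z ** 2 + b2 * math.log(z + 0.001) + b3 * math.sqrt(z)
                + b4 * math.exp(z) + b5 * (dbh / height))
    return a1 * dbh ** a2 * a3 ** dbh * x ** exponent


# Hypothetical coefficients, for illustration only.
a = (1.0, 1.0, 1.0)
b = (0.5, -0.1, 0.3, 0.05, 0.02)
for h in (1.2, 5.0, 10.0, 20.0):
    print(h, round(kozak_diameter(h, dbh=30.0, height=20.0, a=a, b=b), 2))
```

At h = p·H the base X equals 1, so the predicted diameter reduces to a1·DBH^a2·a3^DBH regardless of the b coefficients, which is a convenient sanity check for an implementation.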

Rice yield prediction in South Korea by using random forest (Random Forest를 이용한 남한지역 쌀 수량 예측 연구)

  • Kim, Junhwan;Lee, Juseok;Sang, Wangyu;Shin, Pyeong;Cho, Hyeounsuk;Seo, Myungchul
    • Korean Journal of Agricultural and Forest Meteorology / v.21 no.2 / pp.75-84 / 2019
  • In this study, the random forest approach was used to predict the national mean rice yield of South Korea from mean climatic factors at a national scale. A random forest model built on monthly climate variables and year ranked year as an important predictor of crop yield. Annual yield change is affected by technical improvements in crop management as well as by climate, and year as a predictor represents technical improvement. Thus, it is likely that the important variables identified by this random forest model could result in a large error when predicting rice yield in practice. It was also found that removing the trend from the yield data resulted in reasonable accuracy in yield prediction using the random forest model. For example, yield prediction for the training set (data obtained from 1991 to 2005) had relatively high agreement statistics. Although the agreement statistics for the test set (2006-2015) were not as good as those for the training set, the relative root mean square error (RRMSE) was less than 5%. In the variable importance plot, a significant difference was noted in the importance of climate factors between the training and test sets. This difference could be attributed to a shift in the transplanting date, which might have affected the growing season. This suggests that acceptable yield prediction could be achieved using random forest when the data set includes consistent planting or transplanting dates in the predicted area.
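
Two steps in this abstract lend themselves to a short illustration: removing the yield trend and computing RRMSE. The sketch below assumes a linear trend fitted by ordinary least squares on year and RRMSE defined as RMSE divided by the observed mean, in percent; the abstract does not state which detrending method or RRMSE definition the authors used, and the function names and numbers are hypothetical.

```python
import math
from statistics import mean


def fit_linear_trend(years, yields):
    """Least-squares slope and intercept of yield against year."""
    y_bar, v_bar = mean(years), mean(yields)
    sxx = sum((y - y_bar) ** 2 for y in years)
    sxy = sum((y - y_bar) * (v - v_bar) for y, v in zip(years, yields))
    slope = sxy / sxx
    return slope, v_bar - slope * y_bar


def detrend(years, yields, slope, intercept):
    """Yield anomalies after subtracting the fitted linear trend."""
    return [v - (slope * y + intercept) for y, v in zip(years, yields)]


def rrmse(observed, predicted):
    """Relative RMSE in percent: 100 * RMSE / mean(observed)."""
    mse = mean((o - p) ** 2 for o, p in zip(observed, predicted))
    return 100.0 * math.sqrt(mse) / mean(observed)


# Made-up yields (t/ha) to show the workflow, not data from the study.
years = [1991, 1992, 1993, 1994]
yields = [4.5, 4.7, 4.6, 4.9]
slope, intercept = fit_linear_trend(years, yields)
anomalies = detrend(years, yields, slope, intercept)
print(slope, anomalies)
print(rrmse([100.0, 100.0], [95.0, 105.0]))
```

In a workflow like the one described, the anomalies rather than the raw yields would be the target of the random forest, and the fitted trend would be added back before scoring predictions.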

Comparative study of prediction models for corporate bond rating (국내 회사채 신용 등급 예측 모형의 비교 연구)

  • Park, Hyeongkwon;Kang, Junyoung;Heo, Sungwook;Yu, Donghyeon
    • The Korean Journal of Applied Statistics / v.31 no.3 / pp.367-382 / 2018
  • Prediction models for corporate bond ratings in existing studies have been developed using various methods such as linear regression, ordered logit, and random forest. These models are built on financial characteristics that the rating agencies' own rating models are expected to contain. However, the number of rating classes in existing studies varies from 5 to 20, and the prediction models were developed with samples that differ in target companies and observation periods. Thus, a simple comparison of the prediction accuracies reported in each study cannot determine the best prediction model. To conduct a fair comparison, this study collected corporate bond ratings and financial characteristics from 2013 to 2017 and applied the prediction models to them. In addition, we applied the elastic-net penalty to the linear regression, the ordered logit, and the ordered probit. Our comparison shows that data-driven variable selection using the elastic-net improves prediction accuracy in each corresponding model, and that the random forest is the most appropriate model in terms of prediction accuracy, achieving on average 69.6% accuracy in exact rating prediction under 5-fold cross validation.
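
For readers unfamiliar with the elastic-net penalty, the sketch below computes it in one widely used parameterization, lam * (alpha * ||beta||_1 + (1 - alpha) / 2 * ||beta||_2^2), together with the exact-rating accuracy used as the comparison metric. The parameterization is an assumption; the abstract does not say which form or which tuning values the authors used, and the function names are hypothetical.

```python
def elastic_net_penalty(coefs, lam, alpha):
    """Elastic-net penalty: lam * (alpha * L1 + (1 - alpha) / 2 * L2^2).

    alpha = 1 gives the lasso penalty, alpha = 0 the ridge penalty.
    """
    l1 = sum(abs(c) for c in coefs)
    l2_squared = sum(c * c for c in coefs)
    return lam * (alpha * l1 + 0.5 * (1.0 - alpha) * l2_squared)


def exact_accuracy(true_ratings, predicted_ratings):
    """Share of cases where the predicted rating equals the true rating."""
    hits = sum(t == p for t, p in zip(true_ratings, predicted_ratings))
    return hits / len(true_ratings)


print(elastic_net_penalty([1.0, -2.0], lam=1.0, alpha=0.5))
print(exact_accuracy(["AA", "A", "BBB", "A"], ["AA", "A", "A", "A"]))
```

The L1 part of the penalty can shrink some coefficients exactly to zero, which is what makes the variable selection mentioned in the abstract data-driven.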

Matching prediction on Korean professional volleyball league (한국 프로배구 연맹의 경기 예측 및 영향요인 분석)

  • Heesook Kim;Nakyung Lee;Jiyoon Lee;Jongwoo Song
    • The Korean Journal of Applied Statistics / v.37 no.3 / pp.323-338 / 2024
  • This study analyzes the Korean professional volleyball league and predicts match outcomes using popular machine learning classification methods. Match data from the 2012/2013 to 2022/2023 seasons, including match details, were collected for both the male and female leagues. Two data structures were applied to the models: one separating match results by team, and one using performance differentials between the home and away teams. These two data structures were used to construct a total of four predictive models covering both the male and female leagues. Because the specific variable values used in the models are unavailable before a match ends, the results of the most recent 3 to 4 matches before the match being predicted were preprocessed and used as variables. Logistic Regression, Decision Tree, Bagging, Random Forest, XGBoost, AdaBoost, and LightGBM were employed for classification, and the model employing Random Forest showed the highest predictive performance. The results indicated that while the significant variables varied by gender and data structure, set success rate, blocking points scored, and the number of faults were consistently crucial. Notably, the distinctiveness of our win-loss prediction model lies in its ability to provide pre-match forecasts rather than post-event predictions.
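
The preprocessing idea in this abstract, using only statistics from a team's most recent matches as inputs, can be sketched as follows. The sketch assumes a simple mean over the previous `window` matches and a home-minus-away differential; the abstract does not specify the exact aggregation, and the function names and numbers are hypothetical.

```python
def pre_match_features(stats, window=3):
    """Mean of the previous `window` match values for each match.

    `stats` holds one team's per-match values in chronological order.
    Entries without a full window of history are None, so no
    information from the match being predicted leaks into its features.
    """
    features = []
    for i in range(len(stats)):
        past = stats[max(0, i - window):i]
        features.append(sum(past) / window if len(past) == window else None)
    return features


def differential(home_feature, away_feature):
    """Home-minus-away differential, or None if either side lacks history."""
    if home_feature is None or away_feature is None:
        return None
    return home_feature - away_feature


# Made-up blocking points per match for one team.
blocks = [1, 2, 3, 4, 5]
print(pre_match_features(blocks, window=3))
```

Slicing up to, but not including, index `i` is what makes the resulting forecasts pre-match rather than post-event.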

The Current Status of Aggregate Industry in Korea (우리나라 골재산업의 현황)

  • Oh, Jae-Hyun
    • Resources Recycling / v.25 no.4 / pp.80-86 / 2016
  • To investigate the current status of the aggregate industry in Korea, the law on aggregate gathering, the law on forest management, recent aggregate supply and demand statistics, and the market price of aggregate were reviewed. It is confirmed that the forest aggregate industry is developing year by year and leading the industry. In addition, to give a better understanding of the aggregate industry, the production system and process of the Whaseong forest aggregate quarry are introduced.

Spatial Patterns of Forest Fires between 1991 and 2007 (1991년부터 2007년까지 산불의 공간적 특성)

  • Lee, Byung-Doo;Lee, Myung-Bo
    • Fire Science and Engineering / v.23 no.1 / pp.15-20 / 2009
  • Effective forest fire management requires an understanding of regional forest fire patterns. In this paper, forest fire ignition and spread characteristics were analyzed based on forest fire statistics. Fire occurrences, burned area, rate of spread, and burned area per fire between 1991 and 2007 were parameterized for a cluster analysis, whose results were displayed using GIS to detect spatial patterns of forest fire. Administrative districts such as cities and counties were classified into 5 clusters by fire susceptibility. Metropolitan areas had infrequent fires with a slow rate of spread and a small burned area. However, 4 cities and counties in the eastern regions of the Taeback Mountain range, showing a fast rate of spread and a large burned area, were the most susceptible to forest fire. The next most vulnerable cities and counties were located in the West and South Coast areas.

Estimation of Carbon Stock by Development of Stem Taper Equation and Carbon Emission Factors for Quercus serrata (수간곡선식 개발과 국가탄소배출계수를 이용한 졸참나무의 탄소저장량 추정)

  • Kang, Jin-Taek;Son, Yeong-Mo;Jeon, Ju-Hyeon;Yoo, Byung-Oh
    • Journal of Climate Change Research / v.6 no.4 / pp.357-366 / 2015
  • This study estimated carbon stocks of Quercus serrata by deriving tree volumes for each tree height and DBH class, applying a suitable stem taper equation and tree-specific carbon emission factors to growth data collected from all over the country. Information on distribution area, number of trees per hectare, tree volume, and volume stocks was obtained from the $5^{th}$ National Forest Inventory (2006~2010), and the method provided in the IPCC GPG was applied to estimate carbon storage and removals. The performance of Kozak's model, $d=a_1DBH^{a_2}a_3^{DBH}X^{b_1Z^2+b_2\ln(Z+0.001)+b_3\sqrt{Z}+b_4e^Z+b_5(\frac{DBH}{H})}$, a well-known equation in stem taper estimation, in predicting stem diameter at a specific point along the stem of Quercus serrata was evaluated with three validation statistics: Fitness Index, Bias, and Standard Error of Bias. Kozak's model turned out to be suitable on all validation statistics. Stem volume tables for Quercus serrata were derived by applying Kozak's model, and carbon stock tables by tree height and DBH were developed with the country-specific carbon emission factors ($WD=0.65t/m^3$, BEF=1.55, R=0.43) of Quercus serrata. In the carbon stock analysis by age class, the carbon stocks of age class IV (11,358 ha, 36.5%) and age class V (10,432 ha, 33.5%), which take up the largest areas in the age class distribution, were 957,000 tC and 1,312,000 tC, respectively. Total carbon stocks of Quercus serrata were 3,191,000 tC, which is 3% of the total for broad-leaved forests, and carbon sequestration per hectare was 3.8 tC/ha/yr, or $13.9tCO_2/ha/yr$.
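
The emission factors quoted above enter a biomass-expansion calculation of the kind used in IPCC good practice guidance. The sketch below assumes the common form C = V × WD × BEF × (1 + R) × CF with a carbon fraction CF = 0.5; the carbon fraction is an assumption, since the abstract does not state the value used. It also shows the tC to tCO2 conversion by the molecular weight ratio 44/12, which reproduces the 3.8 tC/ha/yr and 13.9 tCO2/ha/yr pair in the abstract.

```python
def carbon_stock(volume_m3, wd, bef, r, cf=0.5):
    """Carbon stock in tC from stem volume.

    volume_m3: stem volume (m^3)
    wd:  basic wood density (t/m^3)
    bef: biomass expansion factor (stem to above-ground biomass)
    r:   root-to-shoot ratio
    cf:  carbon fraction of dry matter (0.5 is an assumed default)
    """
    return volume_m3 * wd * bef * (1.0 + r) * cf


def to_co2(tonnes_c):
    """Convert tonnes of carbon to tonnes of CO2 (ratio 44/12)."""
    return tonnes_c * 44.0 / 12.0


# Factors for Quercus serrata as quoted in the abstract; 100 m^3 is made up.
print(carbon_stock(100.0, wd=0.65, bef=1.55, r=0.43))
print(round(to_co2(3.8), 1))
```

The same functions apply to the Pinus densiflora factors listed earlier on this page by substituting WD = 0.445, BEF = 1.445, and R = 0.255.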

Predicting Gross Box Office Revenue for Domestic Films

  • Song, Jongwoo;Han, Suji
    • Communications for Statistical Applications and Methods / v.20 no.4 / pp.301-309 / 2013
  • This paper predicts gross box office revenue for domestic films using Korean film data from 2008 to 2011. We use three regression methods, Linear Regression, Random Forest, and Gradient Boosting, to predict the gross box office revenue. We only consider domestic films with revenue of at least KRW 500 million; relevant explanatory variables are chosen by data visualization and variable selection techniques. The key idea in analyzing this data is to construct meaningful explanatory variables from publicly available data sources. Some variables must be categorized for more effective analysis, and clustering methods are applied to achieve this. We choose the best model based on performance on the test set and discuss the important explanatory variables.

Comparison of tree-based ensemble models for regression

  • Park, Sangho;Kim, Chanmin
    • Communications for Statistical Applications and Methods / v.29 no.5 / pp.561-589 / 2022
  • When multiple classification and regression trees are combined, tree-based ensemble models such as random forest (RF) and Bayesian additive regression trees (BART) are produced. In this study, we compare the model structures and performances of various ensemble models in regression settings. RF learns from bootstrapped samples and selects a splitting variable from the predictors gathered at each node. The BART model is specified as a sum of trees and is fitted using the Bayesian backfitting algorithm. Through extensive simulation studies, the strengths and drawbacks of the two methods in the presence of missing data, high-dimensional data, or highly correlated data are investigated. In the presence of missing data, BART performs well in general, whereas RF provides adequate coverage. BART outperforms RF on high-dimensional, highly correlated data. However, in all of the scenarios considered, RF has a shorter computation time. The performance of the two methods is also compared using two real data sets that represent the aforementioned situations, and the same conclusion is reached.
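
The two structural ingredients described above can be sketched without a full tree learner. The code below assumes the standard random forest recipe, in which each tree sees a bootstrap sample of the rows and each node considers a random subset of `mtry` predictors, and it represents a sum-of-trees prediction by adding up the outputs of individual tree functions. The function names are hypothetical and the "trees" are stand-in callables, not fitted models; the Bayesian backfitting step of BART is not shown.

```python
import random


def bootstrap_indices(n_rows, rng):
    """Row indices sampled with replacement, one bootstrap sample per tree."""
    return [rng.randrange(n_rows) for _ in range(n_rows)]


def candidate_predictors(n_predictors, mtry, rng):
    """Random subset of predictor indices considered at one node split."""
    return rng.sample(range(n_predictors), mtry)


def average_prediction(trees, x):
    """RF-style regression prediction: average over the trees."""
    return sum(tree(x) for tree in trees) / len(trees)


def sum_of_trees_prediction(trees, x):
    """BART-style prediction: the trees' outputs are summed, not averaged."""
    return sum(tree(x) for tree in trees)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
print(bootstrap_indices(10, rng))
print(candidate_predictors(n_predictors=8, mtry=3, rng=rng))

# Stand-in "trees": simple step functions of a single input.
trees = [lambda x: 1.0 if x > 0 else 0.0, lambda x: 2.0 if x > 1 else 0.5]
print(average_prediction(trees, 2.0), sum_of_trees_prediction(trees, 2.0))
```

The contrast between averaging and summing reflects the structural difference the abstract points to: RF trees are each full predictors, while in a sum-of-trees model each tree contributes only part of the fit.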