• Title/Summary/Keyword: cross-validation

Forecasting the Precipitation of the Next Day Using Deep Learning (딥러닝 기법을 이용한 내일강수 예측)

  • Ha, Ji-Hun;Lee, Yong Hee;Kim, Yong-Hyuk
    • Journal of the Korean Institute of Intelligent Systems / v.26 no.2 / pp.93-98 / 2016
  • For accurate precipitation forecasts, the choice of weather factors and prediction method is very important. Machine learning has recently been widely used for forecasting precipitation, and the artificial neural network, one such technique, has shown good performance. In this paper, we propose a new method for forecasting precipitation using a deep belief network (DBN), a deep learning technique. A DBN has the advantage that its initial weights are set by unsupervised learning, which compensates for a weakness of conventional artificial neural networks. We used past precipitation, temperature, and the parameters of the sun's and moon's motion as features. The dataset consists of 40 years of observations from the automatic weather station (AWS) in Seoul. Experiments were based on 8-fold cross-validation. Because the model outputs a precipitation probability for each test sample, a threshold was applied to decide whether precipitation occurs. The critical success index (CSI) and Bias were used to assess the forecasts. Our experimental results showed that the DBN performed better than a multilayer perceptron (MLP).
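
The thresholding and scoring step described in this abstract can be illustrated with a minimal sketch. It assumes that "Bias" means the frequency bias commonly used in forecast verification; the abstract does not define it. The function name `csi_and_bias`, the threshold of 0.5, and the toy data are illustrative and do not come from the paper.

```python
import numpy as np

def csi_and_bias(prob, observed, threshold=0.5):
    """Return (CSI, frequency bias) for rain/no-rain decisions at a threshold.

    Assumes at least one observed or forecast rain event; otherwise the
    denominators are zero.
    """
    prob = np.asarray(prob, dtype=float)
    observed = np.asarray(observed, dtype=bool)
    forecast = prob >= threshold                  # probability -> yes/no forecast
    hits = np.sum(forecast & observed)            # rain forecast, rain observed
    misses = np.sum(~forecast & observed)         # no rain forecast, rain observed
    false_alarms = np.sum(forecast & ~observed)   # rain forecast, none observed
    csi = hits / (hits + misses + false_alarms)
    bias = (hits + false_alarms) / (hits + misses)
    return float(csi), float(bias)

# Toy probabilities and observations, made up for illustration only.
prob = [0.9, 0.7, 0.4, 0.2, 0.6, 0.1]
observed = [1, 0, 1, 0, 1, 0]
csi, bias = csi_and_bias(prob, observed, threshold=0.5)
print(csi, bias)
```

A frequency bias above 1 indicates that rain is forecast more often than it is observed, so in practice the threshold is often tuned on held-out folds to trade CSI against bias.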

Development of Forest Volume Estimation Model Using Airborne LiDAR Data - A Case Study of Mixed Forest in Aedang-ri, Chunyang-myeon, Bonghwa-gun - (항공 LiDAR 자료를 이용한 산림재적추정 모델 개발 - 봉화군 춘양면 애당리 혼효림을 대상으로 -)

  • CHO, Seung-Wan;KIM, Yong-Ku;PARK, Joo-Won
    • Journal of the Korean Association of Geographic Information Studies / v.20 no.3 / pp.181-194 / 2017
  • This study aims to develop a regression model for forest volume estimation using field-collected forest inventory information and airborne LiDAR data. The response variable of the model is forest stem volume, measured by random sampling within each of the 30 circular sample plots collected in Bonghwa-gun, Gyeongsangbuk-do, while the predictor variables are Height Percentiles (HP) and Height Bins (HB), metrics extracted from raw LiDAR data. To find the most appropriate model, candidate models were constructed by simple linear regression, quadratic polynomial regression, and multiple regression analysis, and cross-validation tests were conducted for verification. Among the estimated models, the multiple regression model of $HB_{5-10}$, $HB_{15-20}$, $HB_{20-25}$, and $HB_{>25}$ had the highest $R^2$ at 0.509, and the simple linear regression model of $HP_{25}$ had the lowest PRESS statistic at 122.352. Models based on $HB_{5-10}$, $HB_{15-20}$, $HB_{20-25}$, and $HB_{>25}$ are thus considered comparatively more appropriate for Korean forests with complicated vertical structures.
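
The PRESS statistic mentioned in this abstract is the sum of squared leave-one-out prediction errors. For ordinary least squares it can be computed from a single fit using the leverages, as in the sketch below. The function name `press_statistic` and the synthetic 30-plot data are assumptions for illustration; the abstract does not say how the authors computed PRESS.

```python
import numpy as np

def press_statistic(X, y):
    """PRESS for an ordinary least-squares fit with an intercept.

    Uses the identity e_loo_i = e_i / (1 - h_ii), where h_ii are the
    diagonal entries (leverages) of the hat matrix.
    """
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(y)), X])      # design matrix with intercept
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)   # least-squares coefficients
    residuals = y - A @ beta
    H = A @ np.linalg.pinv(A.T @ A) @ A.T          # hat matrix
    leverage = np.diag(H)
    return float(np.sum((residuals / (1.0 - leverage)) ** 2))

# Synthetic stand-in for 30 sample plots: one LiDAR metric against stem volume.
rng = np.random.default_rng(0)
height_metric = rng.uniform(5.0, 25.0, size=30)
volume = 2.0 + 0.8 * height_metric + rng.normal(scale=2.0, size=30)
print(press_statistic(height_metric, volume))
```

Because PRESS penalizes models that predict held-out plots poorly, the model with the lowest PRESS need not be the one with the highest $R^2$, which matches the pattern reported in the abstract.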

Estimation of Vegetation for Chinese Cabbage Using Hyperspectral Imagery (초분광 영상을 이용한 배추의 생육 추정)

  • Kim, Won Jun;Kang, Ye Seong;Kim, Seong Heon;Kang, Jeong Gyun;Jun, Sae Rom;sarkar, Tapash Kumar;Ryu, Chan Seok
    • Proceedings of the Korean Society for Agricultural Machinery Conference / 2017.04a / pp.40-40 / 2017
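  • This study was conducted to develop growth estimation models from the full spectrum (400~1000 nm) of the Chinese cabbage canopy. Images of Chinese cabbages with different transplanting dates were acquired at each growth stage with a hyperspectral camera (VNIR spectral camera PS, SPECIN, Finland), whose wide wavelength range allows more varied approaches and detection. In each acquired image ($348{\times}1040$), the crop was separated from the background by the NDVI vegetation index using ENVI (ver. 5.2, Exelis Visual Information Solutions, USA). The full spectrum of the cabbage canopy region was extracted, and light-corrected reflectance was then calculated using the full spectrum of the reflectance panel region. Using the statistical program R (ver. 3.3.3, Development Core Team, Vienna, Austria), the cabbage reflectance and the measured growth data were analyzed by partial least squares regression (PLSR); results were expressed as accuracy ($R^2$) and precision (RMSE [g, cm, count], RE [%]), and the models were validated by full cross-validation (FV). In the PLSR using growth data from all growth stages of cabbages with different transplanting dates, the model estimating leaf length showed an accuracy ($R^2$) of 84% or higher and a good precision (RMSE) of 3.2 cm or lower. The model estimating leaf width showed an $R^2$ of 73% or higher and an RMSE of 3.5 cm or lower, and the model estimating leaf count showed an $R^2$ of 93% or higher and an RMSE of 6.3 counts or lower. Estimating growth from the full canopy spectrum was therefore judged feasible, and these models also showed good accuracy and precision in validation. However, the model estimating fresh weight, an important growth factor of Chinese cabbage, had high accuracy ($R^2$ of 89% or higher) but low precision (RMSE of 571.1 g or lower), so fresh weight was difficult to estimate accurately. The error therefore needs to be reduced, either by analyzing the full spectrum and growth data with other statistical methods or by developing estimation models based on vegetation indices calculated from selected bands. Models that generalize across varied environmental and biological conditions should be developed by comparison with estimation models from future repeated experiments.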

Development of Nondestructive Detection Method for Adulterated Powder Products Using Raman Spectroscopy and Partial Least Squares Regression (라만 분광법과 부분최소자승법을 이용한 불량 분말식품 비파괴검사 기술 개발)

  • Lee, Sangdae;Lohumi, Santosh;Cho, Byoung-Kwan;Kim, Moon S.;Lee, Soo-Hee
    • Journal of the Korean Society for Nondestructive Testing / v.34 no.4 / pp.283-289 / 2014
  • This study was conducted to develop a non-destructive detection method for adulterated powder products using Raman spectroscopy and partial least squares regression (PLSR). Garlic and ginger powders, which are used as natural seasonings and in health supplement foods, were selected for this experiment. Samples were adulterated with corn starch at concentrations of 5-35%. PLSR models for the adulterated garlic and ginger powders were developed, and their performance was evaluated using cross-validation. The $R^2_c$ and SEC of the optimal PLSR model were 0.99 and 2.16 for the garlic powder samples, and 0.99 and 0.84 for the ginger samples, respectively. The variable importance in projection (VIP) score is a simple and useful tool for evaluating the importance of each variable in a PLSR model. After pre-selection by VIP scores, the Raman spectral data were reduced by one third. New PLSR models, based on the reduced number of wavelengths selected by the VIP score technique, gave good predictions for the adulterated garlic and ginger powder samples.
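
To make the VIP-based variable selection concrete, the sketch below fits a single-response PLS model with the NIPALS algorithm and computes one VIP score per variable. It is a generic textbook formulation, not the authors' implementation. The function name `pls1_vip`, the synthetic data, and the common "VIP > 1" selection rule are assumptions; the abstract does not state which cutoff was used.

```python
import numpy as np

def pls1_vip(X, y, n_components):
    """Fit PLS1 by NIPALS and return VIP scores, one per column of X."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xr = X - X.mean(axis=0)                 # centred predictors, deflated in the loop
    yr = y - y.mean()                       # centred response, deflated in the loop
    n_features = X.shape[1]
    W = np.zeros((n_features, n_components))
    ss = np.zeros(n_components)             # y sum of squares explained per component
    for a in range(n_components):
        w = Xr.T @ yr
        w /= np.linalg.norm(w)              # unit-norm weight vector
        t = Xr @ w                          # scores
        tt = t @ t
        p = Xr.T @ t / tt                   # X loadings
        q = (yr @ t) / tt                   # y loading
        Xr = Xr - np.outer(t, p)            # deflate X
        yr = yr - q * t                     # deflate y
        W[:, a] = w
        ss[a] = q ** 2 * tt
    # Weighted average of squared weights, scaled so that mean(VIP^2) == 1.
    return np.sqrt(n_features * (W ** 2 @ ss) / ss.sum())

# Synthetic "spectra": only the first of nine variables carries signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 9))
y = 3.0 * X[:, 0] + 0.1 * rng.normal(size=60)
vip = pls1_vip(X, y, n_components=2)
selected = np.flatnonzero(vip > 1.0)        # assumed cutoff, commonly used
print(vip.round(2), selected)
```

A reduced model would then be refit on `X[:, selected]` and evaluated again by cross-validation, since the selection itself can overfit if it is judged only on the calibration data.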

A Study for Estimation of High Resolution Temperature Using Satellite Imagery and Machine Learning Models during Heat Waves (위성영상과 머신러닝 모델을 이용한 폭염기간 고해상도 기온 추정 연구)

  • Lee, Dalgeun;Lee, Mi Hee;Kim, Boeun;Yu, Jeonghum;Oh, Yeongju;Park, Jinyi
    • Korean Journal of Remote Sensing / v.36 no.5_4 / pp.1179-1194 / 2020
  • This study investigates the feasibility of three algorithms, K-Nearest Neighbors (K-NN), Random Forest (RF), and Neural Network (NN), for estimating the air temperature of unobserved areas where no weather station is installed. The satellite images were obtained from Landsat-8 and MODIS Aqua/Terra in 2019, and the ground meteorological data came from the AWS/ASOS data of the Korea Meteorological Administration and the Korea Forest Service. To improve the estimation accuracy, a digital surface model, solar radiation, aspect, and slope were also used. The accuracy of the machine learning methods was assessed by calculating $R^2$ (coefficient of determination) and Root Mean Square Error (RMSE) through 10-fold cross-validation, and the estimated values were compared for each target area. The neural network showed the most stable result among the three algorithms, with $R^2$ = 0.805 and RMSE = 0.508. The neural network was then applied to each data set on a Landsat imagery scene. This made it possible to generate a mean air temperature map for June to September 2019 and confirmed that detailed air temperature information could be estimated. The result is expected to be used for national disaster safety management, such as heat wave response policies and heat island mitigation research.
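
The evaluation protocol in this abstract, 10-fold cross-validation scored by $R^2$ and RMSE, can be sketched as follows for a simple K-NN regressor. Everything here is illustrative: the helper names `knn_predict` and `kfold_r2_rmse`, the synthetic predictors, and the choice to pool out-of-fold predictions before scoring (rather than averaging per-fold scores) are assumptions, not details taken from the paper.

```python
import numpy as np

def knn_predict(X_train, y_train, X_test, k=5):
    """Predict each test point as the mean target of its k nearest training points."""
    dist = np.linalg.norm(X_test[:, None, :] - X_train[None, :, :], axis=2)
    nearest = np.argsort(dist, axis=1)[:, :k]
    return y_train[nearest].mean(axis=1)

def kfold_r2_rmse(X, y, n_folds=10, k=5, seed=0):
    """Return (R^2, RMSE) computed on pooled out-of-fold predictions."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    pred = np.empty(len(y))
    for test_idx in folds:
        train_mask = np.ones(len(y), dtype=bool)
        train_mask[test_idx] = False        # hold this fold out of training
        pred[test_idx] = knn_predict(X[train_mask], y[train_mask], X[test_idx], k)
    ss_res = np.sum((y - pred) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot), float(np.sqrt(ss_res / len(y)))

# Synthetic stand-in for station data: two predictors and an air temperature.
rng = np.random.default_rng(1)
X = rng.uniform(size=(200, 2))
y = 25.0 + 5.0 * X[:, 0] - 3.0 * X[:, 1] + 0.3 * rng.normal(size=200)
r2, rmse = kfold_r2_rmse(X, y, n_folds=10, k=5)
print(r2, rmse)
```

For spatial data such as weather stations, randomly assigned folds can be optimistic because nearby stations land in both training and test sets; grouping folds by region is a common alternative.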

Building a Model for Estimate the Soil Organic Carbon Using Decision Tree Algorithm (의사결정나무를 이용한 토양유기탄소 추정 모델 제작)

  • Yoo, Su-Hong;Heo, Joon;Jung, Jae-Hoon;Han, Su-Hee
    • Journal of Korean Society for Geospatial Information Science / v.18 no.3 / pp.29-35 / 2010
  • Soil organic carbon (SOC), which contributes to forest formation and to the control of carbon dioxide in the air, is an important factor influencing global warming. Excavating samples over an entire area is a very inefficient way to determine the distribution of SOC, so a suitable model for predicting the relative amount of SOC is needed. In the present study, a model based on a decision tree algorithm is introduced to estimate the amount of SOC while assessing influencing factors such as altitude, aspect, slope, and tree type. The model was applied to a real site and validated by 10-fold cross-validation using two software packages, See 5 and Weka. The See 5 results indicate that the amount of SOC in surface layers is highly related to tree type, while in middle-depth layers it is dominated by both tree type and altitude. The estimation accuracy was 70.8% in surface layers and 64.7% in middle-depth layers. Weka gave a similar result for surface layers, but in middle-depth layers aspect was also found to be a meaningful factor along with tree type and altitude. Its estimation accuracy was 68.87% in surface layers and 60.65% in middle-depth layers. These tests suggest that the introduced model is useful for estimating the amount of SOC and for producing SOC maps over wide areas.

Forecasting of Customer's Purchasing Intention Using Support Vector Machine (Support Vector Machine 기법을 이용한 고객의 구매의도 예측)

  • Kim, Jin-Hwa;Nam, Ki-Chan;Lee, Sang-Jong
    • Information Systems Review / v.10 no.2 / pp.137-158 / 2008
  • The rapid development of information technologies creates new opportunities in online and offline markets. In this changing market environment, customers have varied demands for new products and services, and their power and influence on the markets grow stronger each year. Companies have therefore paid great attention to customer relationship management (CRM). In particular, personalized product recommendation systems, which recommend products and services based on customers' private information or in-store purchasing behavior, are an important asset to most companies. CRM is one of the important business processes in which reliable information is mined from customer databases. Data mining techniques such as artificial intelligence are popular tools for extracting useful information and knowledge from these databases. In this research, we propose a recommendation system that predicts customers' purchase intention. A customer's intention to purchase a specific product is predicted by applying data mining techniques to a receipt data set. The performance of the suggested method is compared with that of other data mining technologies.

A Melon Fruit Grading Machine Using a Miniature VIS/NIR Spectrometer: 2. Design Factors for Optimal Interactance Measurement Setup

  • Suh, Sang-Ryong;Lee, Kyeong-Hwan;Yu, Seung-Hwa;Shin, Hwa-Sun;Yoo, Soo-Nam;Choi, Yong-Soo
    • Journal of Biosystems Engineering / v.37 no.3 / pp.177-183 / 2012
  • Purpose: In near infrared spectroscopy, an interactance configuration of a light source and a spectrometer probe can provide more information on fruit internal attributes than reflectance and transmittance configurations. However, there has been no thorough study of the parameters of the interactance measurement setup. The objective of this study was to investigate the effect of these parameters on the estimation of soluble solids content (SSC) and firmness of muskmelons. Methods: Melon samples were taken from greenhouses in three different harvesting seasons. Prediction models were developed for three distances (2, 5, and 8 cm) between the light source and the spectrometer probe, three numbers of measurement points (2, 3, and 6) evenly distributed on each sample, and different numbers of fruit samples used for the calibration models. The performance of the models was compared. Results: Among the three distances, the best results were found at 5 cm. The coefficient of determination ($R_{cv}{^2}$) values of the cross-validation were 0.717 (standard error of prediction, SEP=$1.16^{\circ}Brix$) and 0.504 (SEP=4.31 N) for the estimation of SSC and firmness, respectively. The minimum number of measurement points required to fully represent the spectral characteristics of each fruit sample was 3. The highest $R_{cv}{^2}$ values were 0.736 (SEP=$0.87^{\circ}Brix$) and 0.644 (SEP=4.16 N) for the estimation of SSC and firmness, respectively. The performance of the models began to saturate when 60 fruit samples were used for developing the calibration models. The highest $R_{cv}{^2}$ values of 0.713 (SEP=$0.88^{\circ}Brix$) and 0.750 (SEP=3.30 N) for the estimation of SSC and firmness, respectively, were achieved. Conclusions: The performance of the prediction models differed considerably with the conditions of the interactance measurement setup. In designing a fruit grading machine with an interactance configuration, the parameters of the interactance measurement setup should be chosen carefully.
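
The standard error of prediction (SEP) quoted in this abstract is conventionally defined in NIR chemometrics as the bias-corrected standard deviation of the prediction errors. The sketch below assumes that definition, which the abstract itself does not spell out. The function name `sep_bias_rmsep` and the toy SSC values are illustrative.

```python
import numpy as np

def sep_bias_rmsep(measured, predicted):
    """Return (SEP, bias, RMSEP) for paired reference and predicted values."""
    measured = np.asarray(measured, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    error = predicted - measured
    bias = error.mean()                                        # systematic offset
    sep = np.sqrt(np.sum((error - bias) ** 2) / (len(error) - 1))
    rmsep = np.sqrt(np.mean(error ** 2))                       # includes the offset
    return float(sep), float(bias), float(rmsep)

# Toy SSC values in degrees Brix: every prediction is exactly 0.5 too high.
measured = [10.0, 11.0, 12.0, 13.0]
predicted = [10.5, 11.5, 12.5, 13.5]
sep, bias, rmsep = sep_bias_rmsep(measured, predicted)
print(sep, bias, rmsep)
```

In this toy case the SEP is zero while the RMSEP is 0.5, which shows why SEP is usually reported together with the bias: a constant offset does not affect SEP at all.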

An Analysis on Information Seeking Behavior and Needs of Hearing Impaired College Students (청각장애 대학생의 도서관 이용행태와 정보요구에 대한 연구)

  • Jang, Bo Seong
    • Journal of the Korean Society for Information Management / v.32 no.1 / pp.297-316 / 2015
  • This study examines how hearing-impaired college students use libraries and what their information needs are, in order to prepare basic materials for developing library service programs suited to these students. To achieve this goal, the study gathered data from a total of 155 hearing-impaired college students through a survey and interviews, and analyzed the data with a frequency analysis, a cross validation, a t-test, and a one-way ANOVA. The study confirmed that the students' gender, years, degrees of disability, schools, specialties, and prosthetic appliances make significant differences in how they use libraries. The study also examined differences in the students' information needs by type of prosthetic appliance, school, and degree of disability, and found that the type of prosthetic appliance significantly affects every category of information needs. Both school and degree of disability make significant differences in a few categories of information needs: the former influences education and promotion targeting users and the arrangement of sign language interpreters, while the latter affects education and promotion targeting users and improvements in browsing environments.

Development of Bioelectrical Impedance Analyzer for Korean in Telemedicine (원격의료계측을 위한 한국형 생체 전기 임피던스 분석 시스템의 개발)

  • 문재국;서광석;임택균;신태민;윤형로
    • Journal of Biomedical Engineering Research / v.23 no.5 / pp.413-418 / 2002
  • The purpose of this study was to design a single-frequency bioelectrical impedance analyzer (BIA) that can measure body impedance while the patient is sitting on the toilet, and to develop a prediction equation for the designed BIA. We acquired body impedances with the designed BIA from 181 healthy Korean subjects by attaching electrodes at positions suitable for toilet measurement (wrist and thigh). We computed an appropriate fat-free mass (FFM) for Koreans by applying a modified Siri equation to the same subjects, instead of the Siri equation, which may cause accuracy problems in hydrodensitometry when applied to Koreans. We used this FFM as the reference value and developed a Korean FFM prediction equation based on body impedance index, body weight, and sex. The correlation coefficient between the predicted and reference values of FFM was extremely high (r = 0.977), and the standard error of estimation (SEE) was low at 2.47 kg (p<0.05). To compare the existing electrode-attaching method with our method for toilet measurement, we acquired body impedance with the designed BIA from the same subjects with electrodes attached at the existing positions (wrist and ankle) and derived an FFM prediction equation for the BIA. The correlation coefficient between the predicted and reference values was 0.978, and the SEE was 2.43 kg (p<0.05). This means that the developed system does not differ significantly from the existing method. In conclusion, the bioelectrical impedance analyzer and the FFM prediction equation developed in this paper are judged adequate for computing the FFM of Koreans.