• Title/Summary/Keyword: Kernel Regression

Predicting Daily Nutrient Water Consumption by Strawberry Plants in a Greenhouse Environment

  • Sathishkumar, VE;Lee, Myeong-Bae;Lim, Jong-Hyun;Shin, Chang-Sun;Park, Chang-Woo;Cho, Yong Yun
    • Proceedings of the Korea Information Processing Society Conference / 2019.10a / pp.581-584 / 2019
  • Food consumption is growing worldwide every year owing to a growing population, so sufficient, good-quality food must be produced. Strawberry is one of the world's most famous fruits. To obtain the highest strawberry output, we grew three strawberry varieties supplied with three kinds of nutrient water in a greenhouse and, from the production outcomes, identified the highest-yielding variety. This study uses the nutrient water consumed each day by that variety. The atmospheric temperature, humidity, and CO2 levels within the greenhouse are measured and used for the prediction, since water consumption by any plant depends primarily on weather conditions. Machine learning techniques have shown successful outcomes on a multitude of problems, including time series and regression problems. In this study, we propose predicting the daily nutrient water consumption of strawberry plants using machine learning algorithms. Four machine learning algorithms are used: Linear Regression (LR), K-Nearest Neighbour (KNN), Support Vector Machine with Radial Kernel (SVM), and Gradient Boosting Machine (GBM). The Gradient Boosting Machine produces the best results.

Analysis of Dimensionality Reduction Methods Through Epileptic EEG Feature Selection for Machine Learning in BCI (BCI에서 기계 학습을 위한 간질 뇌파 특징 선택을 통한 차원 감소 방법 분석)

  • Tong, Yang;Aliyu, Ibrahim;Lim, Chang-Gyoon
    • The Journal of the Korea institute of electronic communication sciences / v.13 no.6 / pp.1333-1342 / 2018
  • Until now, electroencephalography (EEG) has been the most important and convenient method for the diagnosis and treatment of epilepsy. However, it is difficult to identify the wave characteristics of epileptic EEG signals because they are very weak, non-stationary, and have strong background noise. In this paper, we analyse the effect of dimensionality reduction methods on epileptic EEG feature selection and classification. Three dimensionality reduction methods were investigated: Principal Component Analysis (PCA), Kernel Principal Component Analysis (KPCA), and Linear Discriminant Analysis (LDA). The performance of each method was evaluated using Support Vector Machine (SVM), Logistic Regression (LR), K-Nearest Neighbor (K-NN), Decision Tree (DR), and Random Forest (RF) classifiers. In the experiments, PCA reached its highest accuracy of 75% with SVM, LR, and K-NN. KPCA reached its best accuracy of 85% with SVM and K-NN, while LDA achieved 100% accuracy with K-NN. Thus, LDA dimensionality reduction is found to provide the best classification result for epileptic EEG signals.

VALIDATION OF ON-LINE MONITORING TECHNIQUES TO NUCLEAR PLANT DATA

  • Garvey, Jamie;Garvey, Dustin;Seibert, Rebecca;Hines, J. Wesley
    • Nuclear Engineering and Technology / v.39 no.2 / pp.133-142 / 2007
  • The Electric Power Research Institute (EPRI) demonstrated a method for monitoring the performance of instrument channels in Topical Report (TR) 104965, 'On-Line Monitoring of Instrument Channel Performance.' This paper presents the results of several models originally developed by EPRI to monitor three nuclear plant sensor sets: Pressurizer Level, Reactor Protection System (RPS) Loop A, and Reactor Coolant System (RCS) Loop A Steam Generator (SG) Level. The sensor sets investigated include one redundant sensor model and two non-redundant sensor models. Each model employs an Auto-Associative Kernel Regression (AAKR) model architecture to predict correct sensor behavior. Performance of each of the developed models is evaluated using four metrics: accuracy, auto-sensitivity, cross-sensitivity, and the newly developed Error Uncertainty Limit Monitoring (EULM) detectability. The uncertainty estimate for each model is also calculated through two methods: analytic formulas and Monte Carlo estimation. The uncertainty estimates are verified by calculating confidence interval coverages to assure that 95% of the measured data fall within the confidence intervals. The model performance evaluation identified the Pressurizer Level model as acceptable for on-line monitoring (OLM) implementation. The other two models, RPS Loop A and RCS Loop A SG Level, highlight two common problems that occur in model development and evaluation, namely faulty data and poor signal selection.
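
As a rough illustration of the auto-associative kernel regression (AAKR) architecture named in this abstract, the sketch below reconstructs a query sensor vector as a Gaussian-kernel-weighted average of fault-free memory vectors. The function name `aakr_predict`, the bandwidth `h`, the Euclidean distance, and the toy sensor values are all assumptions made for illustration; none of them come from the paper.

```python
import math

def aakr_predict(memory, query, h=1.0):
    """Reconstruct `query` as a kernel-weighted average of memory vectors."""
    # Euclidean distance from the query to every stored (fault-free) vector
    dists = [math.sqrt(sum((m - q) ** 2 for m, q in zip(row, query)))
             for row in memory]
    # Gaussian kernel weights: closer memory vectors count more
    weights = [math.exp(-d ** 2 / (2 * h ** 2)) for d in dists]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("query is too far from all memory vectors for bandwidth h")
    # Weighted average of the memory vectors, one value per signal
    return [sum(w * row[j] for w, row in zip(weights, memory)) / total
            for j in range(len(query))]

# Hypothetical memory matrix: three redundant sensors that normally agree
memory = [[50.0, 50.1, 49.9],
          [55.0, 55.2, 54.9],
          [60.0, 59.9, 60.1]]

# Query in which the third sensor has drifted upward
query = [55.1, 55.0, 57.0]
prediction = aakr_predict(memory, query, h=1.0)

# Residuals (measured minus predicted) are what a monitoring scheme would inspect
residuals = [q - p for q, p in zip(query, prediction)]
```

In this toy case the prediction stays close to the nearest fault-free memory vector, so the drift shows up as a large residual on the third signal only.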

Hybrid Learning Architectures for Advanced Data Mining: An Application to Binary Classification for Fraud Management (개선된 데이터마이닝을 위한 혼합 학습구조의 제시)

  • Kim, Steven H.;Shin, Sung-Woo
    • Journal of Information Technology Application / v.1 / pp.173-211 / 1999
  • The task of classification permeates all walks of life, from business and economics to science and public policy. In this context, nonlinear techniques from artificial intelligence have often proven to be more effective than the methods of classical statistics. The objective of knowledge discovery and data mining is to support decision making through the effective use of information. The automated approach to knowledge discovery is especially useful when dealing with large data sets or complex relationships. For many applications, automated software may find subtle patterns which escape the notice of manual analysis, or whose complexity exceeds the cognitive capabilities of humans. This paper explores the utility of a collaborative learning approach involving integrated models in the preprocessing and postprocessing stages. For instance, a genetic algorithm effects feature-weight optimization in a preprocessing module. Moreover, an inductive tree, artificial neural network (ANN), and k-nearest neighbor (kNN) techniques serve as postprocessing modules. More specifically, the postprocessors act as second-order classifiers which determine the best first-order classifier on a case-by-case basis. In addition to the second-order models, a voting scheme is investigated as a simple, but efficient, postprocessing model. The first-order models consist of statistical and machine learning models such as logistic regression (logit), multivariate discriminant analysis (MDA), ANN, and kNN. The genetic algorithm, inductive decision tree, and voting scheme act as kernel modules for collaborative learning. These ideas are explored against the background of a practical application relating to financial fraud management which exemplifies a binary classification problem.
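
The voting scheme mentioned in this abstract can be pictured with a minimal sketch, assuming plain majority voting over the labels produced by the first-order classifiers. The abstract does not specify the voting rule, so the rule, the function name `majority_vote`, and the example labels are hypothetical.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the label chosen by the most first-order classifiers.

    On a tie, Counter.most_common keeps first-seen order, so the label
    that appears first in `predictions` wins.
    """
    label, _count = Counter(predictions).most_common(1)[0]
    return label

# Hypothetical outputs of four first-order models (logit, MDA, ANN, kNN)
# for one case: 1 = fraud, 0 = not fraud
case_predictions = [1, 0, 1, 1]
decision = majority_vote(case_predictions)
```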

Variation in Energy and Nutrient Composition of Oilseed Meals from Different Countries (수입 박류사료내 에너지 및 영양소 함량의 변이)

  • Son, Ah Reum
    • Korean Journal of Poultry Science / v.47 no.2 / pp.107-114 / 2020
  • This study was conducted to investigate the variation in nutrient composition of oilseed meals and to develop prediction equations for amino acid concentrations. Energy and nutrient contents were determined in a total of 1,380 feed ingredient samples including copra byproducts, corn distillers dried grains with solubles, palm kernel byproducts, and soybean meal. The ingredient samples were imported to the Republic of Korea between 2006 and 2015. Data were analyzed using the MIXED procedure of SAS. The regression procedure of SAS was used to generate the prediction equation for the lysine concentration using the crude protein (CP) concentration as an independent variable. The concentrations of moisture, gross energy, CP, ether extract, crude fiber, ash, calcium, phosphorus, lysine, methionine, cysteine, and threonine in the tested oilseed meals differed (P<0.05) depending on the producing country. The prediction equations for amino acid concentrations (% as-is basis) in the oilseed meals are: lysine = -1.08 + 0.080 × CP (root mean square error = 0.244, R² = 0.924, and P<0.001); threonine = -0.297 + 0.044 × CP (root mean square error = 0.099, R² = 0.958, and P<0.001). In conclusion, energy and nutrient compositions of the oilseed meals vary depending on the producing country. Moreover, the crude protein concentration can be used as a suitable independent variable for estimating lysine and threonine concentrations in the oilseed meals.
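
The two prediction equations reported in this abstract can be applied directly. The short sketch below evaluates them for a single sample; the coefficients are taken from the abstract, while the function names and the example CP value of 45% are assumptions chosen only for illustration.

```python
def predict_lysine(cp):
    """Lysine (% as-is) from crude protein (% as-is): -1.08 + 0.080 * CP."""
    return -1.08 + 0.080 * cp

def predict_threonine(cp):
    """Threonine (% as-is) from crude protein (% as-is): -0.297 + 0.044 * CP."""
    return -0.297 + 0.044 * cp

# Hypothetical oilseed meal sample with 45% crude protein (as-is basis)
cp = 45.0
lysine = predict_lysine(cp)        # about 2.52 % as-is
threonine = predict_threonine(cp)  # about 1.68 % as-is
```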

Nonparametric estimation of conditional quantile with censored data (조건부 분위수의 중도절단을 고려한 비모수적 추정)

  • Kim, Eun-Young;Choi, Hyemi
    • Journal of the Korean Data and Information Science Society / v.24 no.2 / pp.211-222 / 2013
  • We consider the problem of nonparametrically estimating the conditional quantile function from censored data and propose new estimators. They are based on the local logistic regression technique of Lee et al. (2006) and the "double-kernel" technique of Yu and Jones (1998), respectively, each modified to handle random censoring. We compare them with two existing estimators based on local linear fits using the check function approach. The comparison is done by a simulation study.
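
For readers unfamiliar with the check function approach mentioned in this abstract, here is a minimal sketch. It uses the standard check loss ρ_τ(u) = u(τ − 1[u < 0]) and a kernel-weighted local-constant fit. This is a simplification: the estimators in the abstract use local linear fits and account for censoring, which this sketch does not. The Gaussian kernel, the bandwidth `h`, the function names, and the toy data are assumptions.

```python
import math

def check_loss(u, tau):
    """Check function rho_tau(u) = u * (tau - 1[u < 0])."""
    return u * (tau - (1.0 if u < 0 else 0.0))

def local_quantile(xs, ys, x0, tau, h):
    """Local-constant conditional tau-quantile at x0 (censoring not handled)."""
    # Gaussian kernel weights centred at x0
    weights = [math.exp(-0.5 * ((x - x0) / h) ** 2) for x in xs]

    def objective(q):
        # Kernel-weighted sum of check losses for candidate quantile q
        return sum(w * check_loss(y - q, tau) for w, y in zip(weights, ys))

    # The weighted check loss attains its minimum at an observed response,
    # so searching over the distinct responses is sufficient
    return min(sorted(set(ys)), key=objective)

# Hypothetical uncensored toy data with an upward trend
xs = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
ys = [1.0, 1.3, 0.9, 1.6, 1.4, 2.1, 1.8, 2.4, 2.2, 2.9, 2.6]

median_at_half = local_quantile(xs, ys, x0=0.5, tau=0.5, h=0.2)
upper_at_half = local_quantile(xs, ys, x0=0.5, tau=0.9, h=0.2)
```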

Testing of a discontinuity point in the log-variance function based on likelihood (가능도함수를 이용한 로그분산함수의 불연속점 검정)

  • Huh, Jib
    • Journal of the Korean Data and Information Science Society / v.20 no.1 / pp.1-9 / 2009
  • Consider a regression model whose variance function has a discontinuity (change) point at an unknown location. Yu and Jones (2004) proposed a local polynomial fit to the log-variance function, which avoids violating the positivity of the variance. Using the local polynomial fit, Huh (2008) estimated the discontinuity point of the log-variance function. We propose a test for the existence of a discontinuity point in the log-variance function using the estimated jump size in Huh (2008). The proposed method is based on the asymptotic distribution of the estimated jump size. Numerical studies demonstrate the performance of the method.

Estimation of nonlinear GARCH-M model (비선형 평균 일반화 이분산 자기회귀모형의 추정)

  • Shim, Joo-Yong;Lee, Jang-Taek
    • Journal of the Korean Data and Information Science Society / v.21 no.5 / pp.831-839 / 2010
  • The least squares support vector machine (LS-SVM) is a kernel-based method that has gained considerable popularity in regression and classification problems. We use LS-SVM to propose an iterative algorithm for a nonlinear generalized autoregressive conditional heteroscedasticity in mean (GARCH-M) model to estimate the mean and the conditional volatility of stock market returns. The proposed method combines a weighted LS-SVM for the mean and an unweighted LS-SVM for the conditional volatility. In this paper, we show through estimation on real data that nonlinear GARCH-M models perform better than the linear GARCH model and the linear GARCH-M model.

Analysis of Certification Effects on Wage and Labor Mobility : Evidence from Craft II Class Certification (자격증이 임금, 노동이동에 미치는 효과: 기능사 2급 자격증을 중심으로)

  • Lee, Sangjun
    • Journal of Labour Economics / v.29 no.2 / pp.145-169 / 2006
  • This study analyzes the effect of certification on wages and labor mobility, using the Craft II Class certification among the national skill certifications. We use both parametric and nonparametric methods. In the former, to address the selection problem, we use as an instrumental variable (IV) the fraction of certification holders by occupation and firm scale. In the latter, we use a matching method and kernel regression. The paper shows that the certification effect on wages is about 5.1~9.9%. The analysis of certification and labor mobility indicates that certification has a stronger effect on long-term tenure at the same firm than on wages gained through labor mobility. We also find that employees without certification have relatively more difficulty becoming established in the same workplace.

Forecast of Geo-effective Halo CMEs Using SVM (SVM을 이용한 지구에 영향을 미치는 Halo CME 예보)

  • Choe, Seong-Hwan;Mun, Yong-Jae;Park, Yeong-Deuk
    • The Bulletin of The Korean Astronomical Society / v.38 no.1 / pp.61.1-61.1 / 2013
  • In this study we apply the Support Vector Machine (SVM) to the prediction of geo-effective halo coronal mass ejections (CMEs). The SVM is a machine learning algorithm used for classification and regression analysis. We use halo and partial halo CMEs from January 1996 to April 2010 in the SOHO/LASCO CME Catalog for training and prediction. We also use their associated X-ray flare classes to identify front-side halo CMEs (stronger than B1 class), and the Dst index to determine geo-effective halo CMEs (stronger than -50 nT). Combinations of the speed and angular width of the CMEs and their associated X-ray classes are used as input features of the SVM. We search for the best model using cross-validation while varying the kernel functions of the SVM and their parameters. As a result, we obtain the following statistical parameters for the best model, which uses the CME speed and its associated X-ray flare class as input features: Accuracy=0.66, PODy=0.76, PODn=0.49, FAR=0.72, Bias=1.06, CSI=0.59, TSS=0.25. The statistical parameters obtained with the SVM are much better than those from simple classifications based on constant classifiers.
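
The verification statistics listed in this abstract are commonly computed from a 2×2 contingency table of yes/no forecasts. The sketch below uses the usual textbook definitions; the abstract does not state its formulas, so treating these as the paper's definitions is an assumption, and the function name and the counts in the example are hypothetical.

```python
def forecast_scores(hits, false_alarms, misses, correct_nulls):
    """Common yes/no forecast verification scores from a 2x2 contingency table."""
    a, b, c, d = hits, false_alarms, misses, correct_nulls
    pod_yes = a / (a + c)          # fraction of observed events that were forecast
    pod_no = d / (b + d)           # fraction of observed non-events that were forecast
    return {
        "Accuracy": (a + d) / (a + b + c + d),
        "PODy": pod_yes,
        "PODn": pod_no,
        "FAR": b / (a + b),        # false alarm ratio among "yes" forecasts
        "Bias": (a + b) / (a + c), # forecast "yes" count over observed "yes" count
        "CSI": a / (a + b + c),    # critical success index
        "TSS": pod_yes + pod_no - 1.0,  # true skill statistic
    }

# Hypothetical counts, not taken from the study
scores = forecast_scores(hits=30, false_alarms=20, misses=10, correct_nulls=40)
```

Under these definitions TSS equals PODy + PODn − 1, which agrees with the values reported in the abstract (0.76 + 0.49 − 1 = 0.25).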