• Title/Summary/Keyword: regression

Search Results: 34,861

A numerical study on group quantile regression models

  • Kim, Doyoen; Jung, Yoonsuh
    • Communications for Statistical Applications and Methods, v.26 no.4, pp.359-370, 2019
  • Grouping structures in covariates are often ignored in regression models. Recent statistical developments that account for grouping structure show clear advantages; however, reflecting grouping structure in quantile regression models has been relatively rare in the literature. Grouping structure is usually handled by employing a group penalty. In this work, we apply the idea of a group penalty to quantile regression models. The grouping structure is assumed to be known, which is true in many common cases; for example, the dummy variables derived from one categorical variable can be regarded as one group of covariates. We examine group quantile regression models via two real data analyses and simulation studies, which reveal their beneficial performance over non-group methods when grouping structures exist among the variables.
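The penalized objective described in the abstract (a quantile check loss plus a group penalty) can be sketched as follows. This is a generic group-lasso-penalized quantile regression fit by subgradient descent, not the authors' exact estimator; the function name, step size, and toy data are illustrative:

```python
import numpy as np

def group_quantile_regression(X, y, groups, tau=0.5, lam=0.1, lr=0.02, iters=3000):
    """Minimize mean pinball loss + lam * sum_g ||beta_g||_2 by subgradient descent."""
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(iters):
        r = y - X @ beta
        # subgradient of the pinball (check) loss rho_tau(r)
        g = -X.T @ np.where(r > 0, tau, tau - 1.0) / n
        # subgradient of the group-lasso penalty, one term per covariate group
        for idx in groups:
            nrm = np.linalg.norm(beta[idx])
            if nrm > 0:
                g[idx] += lam * beta[idx] / nrm
        beta -= lr * g
    return beta

# toy example: the second group of covariates is irrelevant and gets shrunk
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = X[:, 0] + X[:, 1] + 0.1 * rng.normal(size=200)
beta = group_quantile_regression(X, y, groups=[[0, 1], [2, 3]])
```

Because the penalty acts on the Euclidean norm of each group rather than on single coefficients, the irrelevant group is shrunk toward zero as a block.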

Fused inverse regression with multi-dimensional responses

  • Cho, Youyoung; Han, Hyoseon; Yoo, Jae Keun
    • Communications for Statistical Applications and Methods, v.28 no.3, pp.267-279, 2021
  • Regression with multi-dimensional responses is quite common in the so-called big data era. In such regression, to relieve the curse of dimensionality caused by high-dimensional responses, dimension reduction of the predictors is essential. Sufficient dimension reduction provides effective tools for this reduction, but few sufficient dimension reduction methodologies exist for multivariate regression. To fill this gap, we propose two fused slice-based inverse regression methods. The proposed approaches are robust to the number of clusters or slices and improve on existing methods by fusing many kernel matrices. Numerical studies are presented and compared with existing methods. Real data analysis confirms the practical usefulness of the proposed methods.
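A minimal sketch of the fusing idea, under simplifying assumptions: sliced inverse regression (SIR) kernel matrices are computed for several slice counts (and here, for each response coordinate separately) and summed before eigendecomposition. The paper's actual fused estimators may differ in how slicing handles the multivariate response:

```python
import numpy as np

def sir_kernel(Z, y, H):
    """SIR kernel: weighted outer products of within-slice means of standardized X."""
    n, p = Z.shape
    M = np.zeros((p, p))
    for chunk in np.array_split(np.argsort(y), H):
        m = Z[chunk].mean(axis=0)
        M += (len(chunk) / n) * np.outer(m, m)
    return M

def fused_sir(X, Y, slice_counts=(2, 4, 8), d=1):
    """Fuse SIR kernels over several slice counts and all response coordinates."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    M = np.zeros((X.shape[1], X.shape[1]))
    for j in range(Y.shape[1]):
        for H in slice_counts:
            M += sir_kernel(Z, Y[:, j], H)
    # leading eigenvectors span the estimated dimension-reduction subspace
    _, vecs = np.linalg.eigh(M)
    return vecs[:, -d:]

# toy example: both responses depend on X only through the direction (1, 1, 0)
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))
u = (X[:, 0] + X[:, 1]) / np.sqrt(2)
Y = np.column_stack([u + 0.2 * rng.normal(size=500), u ** 3])
direction = fused_sir(X, Y)[:, 0]
```

Summing kernels across slice counts is what makes the estimate insensitive to any single choice of the number of slices.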

An Approach to Applying Multiple Linear Regression Models by Interlacing Data in Classifying Similar Software

  • Lim, Hyun-il
    • Journal of Information Processing Systems, v.18 no.2, pp.268-281, 2022
  • The development of information technology is bringing many changes to everyday life, and machine learning can be used to solve a wide range of real-world problems. Analysis and utilization of data are essential when applying machine learning to real-world problems. As a method of processing data in machine learning, we propose applying multiple linear regression models trained on interlaced data to the task of classifying similar software. Linear regression is widely used in estimation problems to model the relationship between input and output data. In our approach, multiple linear regression models are generated by training on interlaced feature data, and a combination of these models is then used as the prediction model for classifying similar software. Experiments comparing the proposed approach with conventional linear regression show that it classifies similar software more accurately than the conventional model. We anticipate that the proposed approach can be applied to various classification problems to improve on the accuracy of conventional linear regression.
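The abstract does not spell out the exact interlacing scheme, so the sketch below is an illustration only: k linear models are trained on interlaced sample subsets (every k-th row), their predictions are averaged, and a threshold yields a class label:

```python
import numpy as np

def fit_linear(X, y):
    """Ordinary least squares fit with an intercept column."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def interlaced_ensemble(X, y, k=3):
    """Train k linear regression models on interlaced subsets of the data."""
    return [fit_linear(X[i::k], y[i::k]) for i in range(k)]

def predict(models, X):
    A = np.column_stack([np.ones(len(X)), X])
    return np.mean([A @ c for c in models], axis=0)  # combine the k models

# toy similarity-classification example with 0/1 labels
rng = np.random.default_rng(2)
X = rng.normal(size=(300, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)
models = interlaced_ensemble(X, y, k=3)
labels = (predict(models, X) >= 0.5).astype(int)
accuracy = (labels == y).mean()
```

Averaging the k interlaced models is one plausible way to "combine" them; the paper may weight or select models differently.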

Selecting Machine Learning Model Based on Natural Language Processing for Shanghanlun Diagnostic System Classification (자연어 처리 기반 『상한론(傷寒論)』 변병진단체계(辨病診斷體系) 분류를 위한 기계학습 모델 선정)

  • Young-Nam Kim
    • 대한상한금궤의학회지, v.14 no.1, pp.41-50, 2022
  • Objective: The purpose of this study is to explore the machine learning algorithm best suited to Shanghanlun diagnostic system classification using natural language processing (NLP). Methods: A total of 201 data items were collected from 『Shanghanlun』 and 『Clinical Shanghanlun』; 'Taeyangbyeong-gyeolhyung' and 'Eumyangyeokchahunobokbyeong' were excluded to prevent oversampling or undersampling. Data were preprocessed using the Twitter Korean tokenizer and trained with logistic regression, ridge regression, lasso regression, naive Bayes classifier, decision tree, and random forest algorithms, and the accuracy of the models was compared. Results: Ridge regression and the naive Bayes classifier showed an accuracy of 0.843, logistic regression and random forest showed an accuracy of 0.804, decision tree showed an accuracy of 0.745, and lasso regression showed an accuracy of 0.608. Conclusions: Ridge regression and the naive Bayes classifier are suitable NLP machine learning models for Shanghanlun diagnostic system classification.
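One of the two best-performing models, the naive Bayes classifier, can be sketched in a few lines. This is a generic multinomial naive Bayes with Laplace smoothing over token lists, run on an invented toy corpus rather than the study's Shanghanlun data:

```python
import math
from collections import Counter

class NaiveBayes:
    """Multinomial naive Bayes with Laplace smoothing over tokenized documents."""
    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        n = len(labels)
        self.log_prior = {c: math.log(labels.count(c) / n) for c in self.classes}
        self.counts = {c: Counter() for c in self.classes}
        for toks, c in zip(docs, labels):
            self.counts[c].update(toks)
        self.vocab = {t for toks in docs for t in toks}
        return self

    def predict(self, toks):
        def log_post(c):
            total = sum(self.counts[c].values())
            return self.log_prior[c] + sum(
                math.log((self.counts[c][t] + 1) / (total + len(self.vocab)))
                for t in toks)
        return max(self.classes, key=log_post)

# invented toy corpus (token lists standing in for tokenizer output)
docs = [["fever", "chills"], ["fever", "sweat"], ["cough", "phlegm"], ["cough", "wheeze"]]
labels = ["A", "A", "B", "B"]
nb = NaiveBayes().fit(docs, labels)
```

In the study's pipeline, the token lists would come from the Twitter Korean tokenizer rather than being hand-written.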

Comparative Study on Imputation Procedures in Exponential Regression Model with Missing Values

  • Park, Young-Sool; Kim, Soon-Kwi
    • Journal of the Korean Data and Information Science Society, v.14 no.2, pp.143-152, 2003
  • A data set with missing observations is often completed using imputed values. In this paper, the performance and accuracy of five imputation procedures are evaluated when missing values occur only in the response variable of the exponential regression model. Our simulation results show that the adjusted exponential regression imputation procedure compensates for missing data well, particularly in comparison with the other imputation procedures. An illustrative example using real data is provided.
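A sketch of the simplest regression-imputation idea in this setting, under stated assumptions: the mean function exp(b0 + b1 x) is fit on the complete cases (here crudely, via OLS on log y) and the missing responses are replaced by their fitted means. The five procedures compared in the paper, including the "adjusted" one, are not reproduced here:

```python
import numpy as np

def impute_exponential(x, y):
    """Fit E[y|x] = exp(b0 + b1 x) on complete cases and impute missing responses."""
    obs = ~np.isnan(y)
    A = np.column_stack([np.ones(obs.sum()), x[obs]])
    b, *_ = np.linalg.lstsq(A, np.log(y[obs]), rcond=None)  # log-scale OLS
    y_imp = y.copy()
    y_imp[~obs] = np.exp(b[0] + b[1] * x[~obs])             # fitted means
    return y_imp, b

# toy example: noiseless exponential relationship with every 5th response missing
x = np.linspace(0.0, 1.0, 50)
y = np.exp(1.0 + 2.0 * x)
y[::5] = np.nan
y_imp, b = impute_exponential(x, y)
```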

Prediction Intervals for LS-SVM Regression using the Bootstrap

  • Shim, Joo-Yong; Hwang, Chang-Ha
    • Journal of the Korean Data and Information Science Society, v.14 no.2, pp.337-343, 2003
  • In this paper we present a prediction interval estimation method using the bootstrap for least squares support vector machine (LS-SVM) regression, which performs even nonlinear regression by constructing a linear regression function in a high-dimensional feature space. The bootstrap is applied to generate samples for estimating the covariance of the regression parameters, which consist of the optimal bias and the Lagrange multipliers. Experimental results are presented that indicate the performance of this algorithm.
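The two ingredients can be sketched as follows: the LS-SVM regression solves one linear system in the bias and Lagrange multipliers, and a residual bootstrap then refits on resampled data to give percentile prediction intervals. The RBF kernel, hyperparameters, and the residual-resampling scheme are illustrative simplifications of the paper's procedure:

```python
import numpy as np

def rbf(A, B, s=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * s * s))

def lssvm_fit(X, y, gamma=10.0):
    """Solve the LS-SVM linear system for the bias b and multipliers alpha."""
    n = len(y)
    M = np.zeros((n + 1, n + 1))
    M[0, 1:] = M[1:, 0] = 1.0
    M[1:, 1:] = rbf(X, X) + np.eye(n) / gamma
    sol = np.linalg.solve(M, np.concatenate([[0.0], y]))
    return sol[0], sol[1:]

def lssvm_predict(Xnew, X, b, alpha):
    return rbf(Xnew, X) @ alpha + b

def bootstrap_interval(X, y, Xnew, B=200, level=0.90, seed=0):
    """Percentile prediction interval from a residual bootstrap of the LS-SVM fit."""
    rng = np.random.default_rng(seed)
    b, a = lssvm_fit(X, y)
    fit = lssvm_predict(X, X, b, a)
    resid = y - fit
    preds = []
    for _ in range(B):
        yb = fit + rng.choice(resid, size=len(y), replace=True)
        bb, ab = lssvm_fit(X, yb)
        preds.append(lssvm_predict(Xnew, X, bb, ab))
    q = [(1 - level) / 2 * 100, (1 + level) / 2 * 100]
    lo, hi = np.percentile(preds, q, axis=0)
    return lo, hi

# toy nonlinear example
rng = np.random.default_rng(3)
X = np.sort(rng.uniform(-3, 3, size=40))[:, None]
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=40)
Xnew = np.array([[-1.0], [1.0]])
lo, hi = bootstrap_interval(X, y, Xnew)
```

Each bootstrap refit solves the same (n+1)-dimensional system, so the method scales with the cost of one LS-SVM fit times the number of resamples B.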

Iterative Support Vector Quantile Regression for Censored Data

  • Shim, Joo-Yong; Hong, Dug-Hun; Kim, Dal-Ho; Hwang, Chang-Ha
    • Communications for Statistical Applications and Methods, v.14 no.1, pp.195-203, 2007
  • In this paper we propose support vector quantile regression (SVQR) for randomly right-censored data. The proposed procedure utilizes an iterative method based on the empirical distribution functions of the censored times and the sample quantiles of the observed variables, and applies support vector regression to estimate the quantile function. Experimental results are then presented to indicate the performance of the proposed procedure.
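The abstract's iterative scheme is not fully specified, so the sketch below only illustrates the same ingredients under stated assumptions: a Kaplan-Meier estimate of the censoring distribution supplies inverse-probability-of-censoring weights, and a linear quantile regression is then fit by subgradient descent in place of the paper's support-vector machinery:

```python
import numpy as np

def km_censoring_survival(t, delta):
    """Kaplan-Meier estimate of the censoring survival function G(t-),
    treating censorings (delta == 0) as the 'events'."""
    G = np.ones(len(t))
    surv, at_risk = 1.0, len(t)
    for i in np.argsort(t):
        G[i] = surv                      # left limit G(t_i-)
        if delta[i] == 0:
            surv *= 1.0 - 1.0 / at_risk
        at_risk -= 1
    return np.maximum(G, 1e-6)

def censored_quantile_regression(x, t, delta, tau=0.5, lr=0.02, iters=4000):
    """IPCW-weighted pinball-loss fit: uncensored cases weighted by 1/G(t-)."""
    w = delta / km_censoring_survival(t, delta)
    A = np.column_stack([np.ones(len(t)), x])
    beta = np.zeros(2)
    for _ in range(iters):
        r = t - A @ beta
        beta += lr * A.T @ (w * np.where(r > 0, tau, tau - 1.0)) / len(t)
    return beta

# toy example with light right censoring: true median line is 1 + 2x
rng = np.random.default_rng(4)
x = rng.normal(size=300)
y = 1.0 + 2.0 * x + 0.3 * rng.normal(size=300)
c = rng.uniform(5.0, 10.0, size=300)          # censoring times
t = np.minimum(y, c)
delta = (y <= c).astype(float)
beta = censored_quantile_regression(x, t, delta)
```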

Bayesian Estimation for Multiple Regression with Censored Data: Multivariate Normal Error Terms

  • Yoon, Yong-Hwa
    • Journal of the Korean Data and Information Science Society, v.9 no.2, pp.165-172, 1998
  • This paper considers a linear regression model with censored data in which each error term follows a multivariate normal distribution. We adopt a diffuse prior distribution for the parameters of the linear regression model. With censored data, we derive the full conditional densities for the parameters of a multiple regression model in order to obtain the marginal posterior densities of the relevant parameters through the Gibbs sampler, which was proposed by Geman and Geman (1984) and given a statistical treatment by Gelfand and Smith (1990).
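The Gibbs scheme alternates between drawing latent responses for the censored cases from truncated normals and drawing the regression coefficients from their full conditional. A minimal sketch, simplified to a univariate normal error with known variance and a flat prior (the paper treats multivariate normal errors):

```python
import numpy as np

def gibbs_censored_regression(x, y, cens, sigma=1.0, draws=500, burn=200, seed=0):
    """Gibbs sampler for right-censored linear regression with a flat prior on beta
    and known error variance (a simplification of the paper's multivariate setup)."""
    rng = np.random.default_rng(seed)
    A = np.column_stack([np.ones(len(y)), x])
    V = np.linalg.inv(A.T @ A)
    beta = V @ A.T @ y                     # OLS starting values
    z = y.astype(float).copy()
    keep = []
    for it in range(burn + draws):
        mu = A @ beta
        # data augmentation: redraw latent responses above the censoring points
        for i in np.flatnonzero(cens):
            for _ in range(1000):          # rejection sampling from N(mu, sigma^2)
                d = rng.normal(mu[i], sigma)
                if d > y[i]:
                    break
            else:
                d = y[i] + rng.exponential(sigma)  # crude deep-tail fallback
            z[i] = d
        # full conditional of beta given the completed responses
        beta = rng.multivariate_normal(V @ A.T @ z, sigma ** 2 * V)
        if it >= burn:
            keep.append(beta)
    return np.mean(keep, axis=0)

# toy example: y = 1 + 2x + N(0,1), right-censored at 4
rng = np.random.default_rng(5)
x = rng.normal(size=200)
y_full = 1.0 + 2.0 * x + rng.normal(size=200)
cens = y_full > 4.0
y = np.where(cens, 4.0, y_full)
beta_hat = gibbs_censored_regression(x, y, cens)
```

The posterior mean recovers the regression line despite the censoring, which is exactly what naive OLS on the censored responses cannot do.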

Change-Points with Jump in Nonparametric Regression Functions

  • Kim, Jong-Tae
    • 한국데이터정보과학회:학술대회논문집, 2005.04a, pp.193-199, 2005
  • A simple method is proposed to detect the number of change points with jump discontinuities in nonparametric regression functions. The proposed estimators are based on a local linear regression fit and the comparison of left and right one-sided kernel smoothers. The proposed methodology also yields a test statistic for detecting change points and the direction of the jump discontinuities.
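The comparison of one-sided smoothers can be sketched as follows, using uniform-kernel one-sided means in place of the paper's local linear fits; a jump is flagged where the right- and left-side estimates differ most, and the sign of the difference gives the jump direction:

```python
import numpy as np

def one_sided_difference(x, y, h):
    """Right-minus-left one-sided local means within bandwidth h at each point."""
    diff = np.zeros(len(x))
    for j, x0 in enumerate(x):
        left = (x >= x0 - h) & (x < x0)
        right = (x > x0) & (x <= x0 + h)
        if left.any() and right.any():
            diff[j] = y[right].mean() - y[left].mean()
    return diff

def detect_jump(x, y, h=0.05):
    """Location and signed size of the largest estimated jump."""
    d = one_sided_difference(x, y, h)
    j = np.argmax(np.abs(d))
    return x[j], d[j]

# toy example: an upward jump of size 2 at x = 0.5
rng = np.random.default_rng(6)
x = np.linspace(0.0, 1.0, 200)
y = 2.0 * (x > 0.5) + 0.1 * rng.normal(size=200)
loc, size = detect_jump(x, y)
```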

DETECTION OF OUTLIERS IN WEIGHTED LEAST SQUARES REGRESSION

  • Shon, Bang-Yong; Kim, Guk-Boh
    • Journal of applied mathematics & informatics, v.4 no.2, pp.501-512, 1997
  • In the multiple linear regression model we presuppose assumptions (independence, normality, variance homogeneity, and so on) on the error term. When case weights are given because of variance heterogeneity, we can efficiently estimate the regression parameters using the weighted least squares estimator. Unfortunately, this estimator is sensitive to outliers, like the ordinary least squares estimator. Thus, in this paper we propose some statistics for the detection of outliers in weighted least squares regression.
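One standard diagnostic consistent with this abstract is the internally studentized residual from the weighted fit; the sketch below computes it via the weighted hat matrix and flags cases above a conventional cutoff of about 2.5. The paper's specific statistics may differ:

```python
import numpy as np

def wls_outlier_statistics(x, y, w):
    """WLS fit and internally studentized residuals; |r_i| > 2.5 flags an outlier."""
    A = np.column_stack([np.ones(len(y)), x])
    W = np.diag(w)
    V = np.linalg.inv(A.T @ W @ A)
    beta = V @ A.T @ W @ y                      # weighted least squares estimate
    h = np.diag(A @ V @ A.T @ W)                # leverages from the weighted hat matrix
    e = y - A @ beta
    s2 = (w * e ** 2).sum() / (len(y) - A.shape[1])
    r = np.sqrt(w) * e / np.sqrt(s2 * (1 - h))  # studentized residuals
    return beta, r

# toy heteroscedastic example with one planted outlier at index 5
rng = np.random.default_rng(7)
x = rng.normal(size=40)
sd = 0.5 * (1 + np.abs(x))                      # known heteroscedastic scale
y = 2.0 + 3.0 * x + sd * rng.normal(size=40)
y[5] += 10.0                                    # the outlier
beta, r = wls_outlier_statistics(x, y, w=1.0 / sd ** 2)
```

Because the residuals are scaled by their weights and leverages, a case that looks moderate on the raw scale can still stand out clearly once studentized.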