• Title/Abstract/Keywords: Linear Regression Fit

A note on standardization in penalized regressions

  • Lee, Sangin
    • Journal of the Korean Data and Information Science Society / Vol. 26, No. 2 / pp. 505-516 / 2015
  • We consider sparse high-dimensional linear regression models. Penalized regression is an effective method for variable selection and estimation in such models. It is common practice to standardize the variables, fit the penalized model to the standardized variables, and then transform the estimated coefficients back to the scale of the original variables. However, this procedure produces a slightly different solution from that of the corresponding original penalized problem. In this paper, we investigate issues in the standardization of variables in penalized regression and formulate a definition of the standardized penalized estimator. In addition, we compare the original penalized estimator with the standardized penalized estimator through simulation studies and real data analysis.
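
The gap between the two estimators can be illustrated with a small lasso sketch. The abstract does not name a specific penalty or algorithm, so this is only an illustration under assumptions: the lasso objective (1/(2n))·||y - Xb||² + λ·Σ|b_j|, a plain coordinate-descent solver, centered data, and a hypothetical two-predictor toy data set. The function names `lasso_cd` and `standardized_lasso` are made up for this example.

```python
import math

def soft_threshold(z, lam):
    """Soft-thresholding operator used by the lasso coordinate update."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_cd(columns, y, lam, n_iter=200):
    """Coordinate descent for (1/(2n))*||y - Xb||^2 + lam*sum|b_j|.
    `columns` is a list of centered predictor columns; y is centered."""
    n, p = len(y), len(columns)
    b = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual that leaves predictor j out.
            r = [y[i] - sum(columns[k][i] * b[k] for k in range(p) if k != j)
                 for i in range(n)]
            rho = sum(columns[j][i] * r[i] for i in range(n)) / n
            norm = sum(v * v for v in columns[j]) / n
            b[j] = soft_threshold(rho, lam) / norm
    return b

def standardized_lasso(columns, y, lam):
    """Scale each column to unit mean square, fit, then rescale coefficients."""
    scales = [math.sqrt(sum(v * v for v in col) / len(col)) for col in columns]
    z = [[v / s for v in col] for col, s in zip(columns, scales)]
    gamma = lasso_cd(z, y, lam)
    return [g / s for g, s in zip(gamma, scales)]

# Hypothetical toy data: two centered, orthogonal predictors on different scales.
x1 = [1.0, -1.0, 1.0, -1.0]
x2 = [10.0, 10.0, -10.0, -10.0]
y = [3.0, -1.0, 1.0, -3.0]          # equals 2*x1 + 0.1*x2

original = lasso_cd([x1, x2], y, lam=1.0)
standardized = standardized_lasso([x1, x2], y, lam=1.0)
print(original, standardized)
```

In this toy case the original problem keeps a small coefficient on the large-scale predictor, while the standardize-fit-rescale route sets it to zero. Rescaling is equivalent to penalizing each coefficient in proportion to its predictor's scale, so the two routes solve different problems whenever the scales differ.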

A Study on the Determination of Point Probability Rainfall-Depth in Korea by the Linear Least Squares Method (Seoul, Daegu and Mokpo)

  • 이원환;김재한
    • 물과 미래 / Vol. 9, No. 1 / pp. 81-85 / 1976
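  • This study derives and presents a simple way to obtain the probable rainfall depth at the Seoul, Daegu, and Mokpo stations from a regression line. Linear equations were derived relating the return period to the short-duration probable rainfall depth for each duration from 10 to 120 minutes, and an analytical method for reading the probable rainfall depth directly from these lines was examined. The results show a substantial relationship between the two variables, and with a suitable variable transformation the approach is expected to be applicable to stations other than these three.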
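
A regression line of this kind can be sketched as an ordinary least-squares fit. The abstract does not state which variable transformation was used, so the base-10 logarithm of the return period below is an assumption, and the rainfall depths are hypothetical placeholder values, not data from the paper.

```python
import math

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical 60-minute rainfall depths (mm) for several return periods (years).
return_periods = [2, 5, 10, 20, 50, 100]
depths_mm = [18.0, 23.5, 27.0, 30.8, 35.4, 39.1]

# Assumed transformation: regress depth on log10(return period).
slope, intercept = fit_line([math.log10(t) for t in return_periods], depths_mm)

def probable_depth(return_period):
    """Read the probable rainfall depth straight off the fitted line."""
    return intercept + slope * math.log10(return_period)

print(round(probable_depth(30), 1))
```

One such line would be fitted per duration (10 to 120 minutes) and per station.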

Estimation of Leak Frequency Function by Application of Non-linear Regression Analysis to Generic Data

  • 윤익근;단승규;정호진;홍성경
    • 한국안전학회지 / Vol. 35, No. 5 / pp. 15-21 / 2020
  • Quantitative risk assessment (QRA) is used as a legal or voluntary safety management tool in the hazardous materials industry, and its use is gradually increasing. A leak frequency analysis based on reliable generic data is therefore a critical element in the evolution of QRA and safety technologies. The aim of this paper is to derive a leak frequency function that can be applied more flexibly in QRA, based on the OGP report, which is highly reliable and used globally. To this end, we first reviewed the data on the 16 equipment types included in the OGP report and selected the predictors. We then found equations that fit the OGP data well using non-linear regression analysis. Various expectation functions were applied to search for suitable parameters that can serve as a meaningful reference in the future. The results show that the best fit is obtained with the DNV function and the connection function in natural-logarithm form. The average percentage error between the fitted and original values is small, about 3%, so the derived prediction function is applicable to quantitative frequency analysis. This study contributes to expanding the applicability of QRA and advancing safety engineering by providing generic equations for practical leak frequency analysis.

A Climate Prediction Method Based on EMD and Ensemble Prediction Technique

  • Bi, Shuoben;Bi, Shengjie;Chen, Xuan;Ji, Han;Lu, Ying
    • Asia-Pacific Journal of Atmospheric Sciences / Vol. 54, No. 4 / pp. 611-622 / 2018
  • Observed climate data are processed under the assumption that their time series are stationary, as in multi-step temperature and precipitation prediction, which usually leads to low prediction accuracy. If a climate system model is based on a single prediction model, the prediction results contain significant uncertainty. In order to overcome this drawback, this study uses a method that integrates ensemble prediction and a stepwise regression model based on a mean-valued generation function. In addition, it utilizes empirical mode decomposition (EMD), which is a new method of handling time series. First, a non-stationary time series is decomposed into a series of intrinsic mode functions (IMFs), which are stationary and multi-scale. Then, a different prediction model is constructed for each component of the IMF using numerical ensemble prediction combined with stepwise regression analysis. Finally, the results are fit to a linear regression model, and a short-term climate prediction system is established using the Visual Studio development platform. The model is validated using temperature data from February 1957 to 2005 from 88 weather stations in Guangxi, China. The results show that compared to single-model prediction methods, the EMD and ensemble prediction model is more effective for forecasting climate change and abrupt climate shifts when using historical data for multi-step prediction.

Comparison study on kernel type estimators of discontinuous log-variance

  • 허집
    • Journal of the Korean Data and Information Science Society / Vol. 25, No. 1 / pp. 87-95 / 2014
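  • When the variance function is discontinuous, Kang and Huh (2006) estimated it with the Nadaraya-Watson estimator based on squared residuals. Taking as the target the log-variance function, which can also take negative real values, Huh (2013) studied estimation of a discontinuous log-variance function by local linear fitting under the assumption that the squared errors follow a $\chi^2$ distribution. Chen et al. (2009) estimated a continuous log-variance function by local linear fitting of the log squared residuals. This study proposes an estimator of a discontinuous log-variance function based on the method of Chen et al. and compares it with the existing estimators of the discontinuous log-variance function through simulation. In addition, for the case where the log-variance function is continuous but its derivative is discontinuous, estimators of the discontinuous derivative of the log-variance function are proposed using the slopes of the local linear fits from the method of Huh (2013) and from the proposed method. These estimators are also compared through simulation.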
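
The simplest of the estimators mentioned, a Nadaraya-Watson smoother of squared residuals, can be sketched as follows. The Gaussian kernel, the bandwidth, and the toy residuals with a jump at 0.5 are assumptions for illustration. The sketch ignores the one-sided kernels usually needed near a discontinuity and is not the estimator proposed in the paper.

```python
import math

def nw_variance(x0, xs, residuals, h):
    """Nadaraya-Watson estimate of the variance function at x0:
    a kernel-weighted average of squared residuals (Gaussian kernel, bandwidth h)."""
    weights = [math.exp(-0.5 * ((x - x0) / h) ** 2) for x in xs]
    weighted_sq = sum(w * r * r for w, r in zip(weights, residuals))
    return weighted_sq / sum(weights)

# Hypothetical residuals: variance 1 for x < 0.5 and variance 9 for x >= 0.5.
xs = [i / 20 for i in range(20)]
residuals = [(1.0 if x < 0.5 else 3.0) * (1 if i % 2 == 0 else -1)
             for i, x in enumerate(xs)]

left = nw_variance(0.1, xs, residuals, h=0.05)
right = nw_variance(0.9, xs, residuals, h=0.05)
print(left, right, math.log(left), math.log(right))
```

Taking the logarithm of such an estimate gives one crude route to the log-variance. The local linear fits of log squared residuals described in the abstract model the log-variance directly instead.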

Validity for Use of Non-HDL Cholesterol Rather than LDL Cholesterol

  • Kwon, Se-Young;Na, Young-Ak
    • 대한임상검사과학회지 / Vol. 45, No. 2 / pp. 54-59 / 2013
  • NonHDL cholesterol has been suggested as a risk marker for cardiovascular disease. It is calculated with a very simple formula [nonHDL cholesterol = serum total cholesterol - HDL cholesterol], which is useful as a screening tool for identifying dyslipoproteinemias, for risk assessment, and for assessing the results of hypolipidemic therapy. Data from the 2009 Korean National Health and Nutrition Examination Survey were used, and 1,992 subjects with lipid panel results (cholesterol, HDL, direct LDL, and triglycerides) were analyzed. We studied the relationship between nonHDL cholesterol and LDL cholesterol by plotting nonHDL cholesterol against the direct and calculated LDL values. The linear regression equation for nonHDL cholesterol and direct LDL cholesterol was $\text{nonHDLchol} = 23.60 + 1.03 \times \text{LDLdirect}$ (p<0.0001, $r^2=0.80$) in all subjects. The subjects were then grouped by triglyceride level. When triglycerides are below 400 mg/dL, the linear fit to direct LDL is $\text{nonHDLchol} = 17.34 + 1.07 \times \text{LDLdirect}$ (p<0.0001, $r^2=0.88$) and the fit to the Friedewald LDL calculation is $\text{nonHDLchol} = 23.10 + 1.02 \times \text{LDLcalc}$ (p<0.0001, $r^2=0.82$). For triglycerides above 400 mg/dL, the linear fit to direct LDL is $\text{nonHDLchol} = 87.57 + 0.92 \times \text{LDLdirect}$ (p<0.0001, $r^2=0.50$) and the fit to calculated LDL is $\text{nonHDLchol} = 142.70 + 0.50 \times \text{LDLcalc}$ (p<0.0001, $r^2=0.32$). This study provides examples of the utility of nonHDL cholesterol concentrations in clinical medicine.
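
The arithmetic in this abstract can be written out as a short sketch. The nonHDL formula and the regression coefficients are taken from the abstract. The Friedewald formula (LDL = total cholesterol - HDL - triglycerides/5, in mg/dL) is the standard one and is not spelled out in the abstract. The function names and the handling of triglycerides at exactly 400 mg/dL are assumptions.

```python
def non_hdl(total_chol, hdl):
    """nonHDL cholesterol = serum total cholesterol - HDL cholesterol (mg/dL)."""
    return total_chol - hdl

def friedewald_ldl(total_chol, hdl, triglycerides):
    """Standard Friedewald estimate of LDL cholesterol (all values in mg/dL)."""
    return total_chol - hdl - triglycerides / 5.0

def non_hdl_from_direct_ldl(ldl_direct, triglycerides):
    """Predict nonHDL from direct LDL with the abstract's fitted lines.
    Assumption: a triglyceride value of exactly 400 mg/dL uses the upper-group line."""
    if triglycerides < 400:
        return 17.34 + 1.07 * ldl_direct
    return 87.57 + 0.92 * ldl_direct

# Hypothetical patient values, for illustration only.
print(non_hdl(200, 50))                   # 200 - 50
print(friedewald_ldl(200, 50, 150))       # 200 - 50 - 150/5
print(non_hdl_from_direct_ldl(100, 150))  # 17.34 + 1.07 * 100
```

The lower $r^2$ values reported for the high-triglyceride group suggest the second line is a much rougher predictor than the first.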

Predicting standardized ileal digestibility of lysine in full-fat soybeans using chemical composition and physical characteristics

  • Chanwit Kaewtapee;Rainer Mosenthin
    • Animal Bioscience / Vol. 37, No. 6 / pp. 1077-1084 / 2024
  • Objective: The present work was conducted to evaluate suitable variables and develop prediction equations using chemical composition and physical characteristics for estimating standardized ileal digestibility (SID) of lysine (Lys) in full-fat soybeans (FFSB). Methods: The chemical composition and physical characteristics were determined including trypsin inhibitor activity (TIA), urease activity (UA), protein solubility in 0.2% potassium hydroxide (KOH), protein dispersibility index (PDI), lysine to crude protein ratio (Lys:CP), reactive Lys:CP ratio, neutral detergent fiber, neutral detergent insoluble nitrogen (NDIN), acid detergent insoluble nitrogen (ADIN), acid detergent fiber, L* (lightness), and a* (redness). Pearson's correlation (r) was computed, and the relationship between variables was determined by linear or quadratic regression. Stepwise multiple regression was performed to develop prediction equations for SID of Lys. Results: Negative correlations (p<0.01) between SID of Lys and protein quality indicators were observed for TIA (r = -0.80), PDI (r = -0.80), and UA (r = -0.76). The SID of Lys also showed a quadratic response (p<0.01) to UA, NDIN, TIA, L*, KOH, a* and Lys:CP. The best-fit model for predicting SID of Lys in FFSB included TIA, UA, NDIN, and ADIN, resulting in the highest coefficient of determination ($R^2$ = 0.94). Conclusion: Quadratic regression with one variable indicated high accuracy for UA, NDIN, TIA, and PDI. The multiple linear regression including TIA, UA, NDIN, and ADIN is an alternative model that predicts SID of Lys in FFSB with improved accuracy. Therefore, multiple indicators are warranted to assess either insufficient or excessive heat treatment accurately, which can be employed by the feed industry as measures for quality control purposes to predict SID of Lys in FFSB.

Determining Input Values for Dragging Anchor Assessments Using Regression Analysis

  • 강병선;정창현
    • 해양환경안전학회지 / Vol. 27, No. 6 / pp. 822-831 / 2021
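  • Programs for assessing the risk of a ship dragging its anchor have been developed, but they require the user to look up and enter many input parameters from the ship's particulars, so it is impractical for a VTS operator to confirm all of these inputs for every ship lying at anchor and then use the program. In this study, gross tonnage (GT), which a VTS operator can easily obtain from a ship, was set as the independent variable and the program inputs as dependent variables, and linear and nonlinear regression analyses were performed. Comparing the goodness of fit of the polynomial (linear) model and the power (nonlinear) model showed that the power model was the better fit for every input for container ships and bulk carriers. For tankers, however, the power model was the better fit for length between perpendiculars, breadth, and draft, while the polynomial model was the better fit for frontal wind pressure area, anchor weight, equipment number, and height from the hawsepipe to the bottom of the hull. In addition, all dependent variables except the tankers' frontal wind pressure area had coefficients of determination of 0.7 or higher, indicating a good fit. Therefore, all inputs of the dragging anchor risk assessment program other than the external force components, seabed type, water depth, and length of anchor chain paid out can be filled in automatically from the regression equations once the ship's gross tonnage is entered, which should make the risk assessment feasible.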
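
The model comparison described here can be sketched by fitting both forms to the same data and comparing coefficients of determination. This is only an illustration under assumptions: the polynomial is taken as first degree, the power model y = a·x^b is fitted by least squares on log-transformed data (a linearization, not the nonlinear regression used in the study), and the tonnage and length values are hypothetical.

```python
import math

def ols(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(y, y_hat):
    """Coefficient of determination on the original scale of y."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot

def fit_power(x, y):
    """Fit y = a * x**b via a straight line in log-log space; returns (a, b)."""
    slope, intercept = ols([math.log(v) for v in x], [math.log(v) for v in y])
    return math.exp(intercept), slope

# Hypothetical gross tonnage (GT) and length between perpendiculars (m).
gt = [5000.0, 10000.0, 20000.0, 40000.0, 80000.0]
lbp = [105.0, 132.0, 166.0, 210.0, 262.0]

slope, intercept = ols(gt, lbp)
a, b = fit_power(gt, lbp)
r2_linear = r_squared(lbp, [intercept + slope * g for g in gt])
r2_power = r_squared(lbp, [a * g ** b for g in gt])
print(round(r2_linear, 3), round(r2_power, 3))
```

Whichever form gives the higher coefficient of determination for a given input and ship type would then supply the equation used to fill in that input from GT.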

An empirical bracketed duration relation for stable continental regions of North America

  • Lee, Jongwon;Green, Russell A.
    • Earthquakes and Structures / Vol. 3, No. 1 / pp. 1-15 / 2012
  • An empirical predictive relationship correlating bracketed duration to earthquake magnitude, site-to-source distance, and local site conditions (i.e. rock vs. stiff soil) for stable continental regions of North America is presented herein. The correlation was developed from 620 horizontal motions for central and eastern North America (CENA), consisting of 28 recorded motions and 592 scaled motions. The bracketed duration data comprised nonzero and zero durations. The non-linear mixed-effects (NLME) regression technique was used to fit a predictive model to the nonzero duration data. To account for the zero duration data, logistic regression was conducted to model the probability of zero duration occurrences. Then, the probability models were applied as weighting functions to the NLME regression results. Comparing the bracketed durations for CENA motions with those from active shallow crustal regions (e.g. western North America: WNA), the motions in CENA have longer bracketed durations than those in WNA. In particular, for larger magnitudes at far distances, the bracketed durations in CENA tend to be significantly longer than those in WNA.

Semi-rigid connection modeling for steel frameworks

  • Liu, Yuxin
    • Structural Engineering and Mechanics / Vol. 35, No. 4 / pp. 431-457 / 2010
  • This article provides a discussion of the mathematical modeling of connections for designing and qualifying structures, systems, and components subject to monotonic or cyclic loading. To characterize the force-deformation behavior of connections under monotonic loading, a review of the Ramberg-Osgood, Richard-Abbott, and Menegotto-Pinto models is conducted, and it is shown that these nonlinear functions can be mathematically derived by scaling up or down a linear force-deformation function. A generalized four-parameter model for simulating connection behavior is investigated to facilitate nonlinear regression analysis. In order to perform seismic analysis of frameworks, a hysteretic model accounting for loading, unloading, and reloading is described using the established monotonic model. For preliminary analysis, a method is provided to quickly determine model parameters that approximately fit the observed data. To obtain more accurate parameter values, methods of nonlinear regression analysis are investigated, and the modified Levenberg-Marquardt and separable nonlinear least-squares algorithms are applied in determining the model parameters. Example case studies illustrate the computational procedure using experimental/analytical data taken from the literature. Transformation of connection curves from the three-parameter model to the four-parameter model for structural analysis is conducted based on the modeling of connections subject to fire.
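
As a concrete instance of a four-parameter connection curve, the Richard-Abbott model is commonly written as M(θ) = (K_i - K_p)·θ / [1 + |(K_i - K_p)·θ / M_0|^n]^(1/n) + K_p·θ. The sketch below evaluates that form and a sum-of-squares misfit. Whether the article's generalized four-parameter model is exactly this one is an assumption, the parameter and function names are chosen for this example, and the test points are hypothetical.

```python
def richard_abbott(theta, k_i, k_p, m_0, n):
    """Moment at rotation theta for the Richard-Abbott curve.
    k_i: initial stiffness, k_p: plastic (hardening) stiffness,
    m_0: reference moment, n: shape parameter."""
    k = k_i - k_p
    return k * theta / (1.0 + abs(k * theta / m_0) ** n) ** (1.0 / n) + k_p * theta

def sum_squared_error(params, thetas, moments):
    """Misfit that a nonlinear least-squares routine would minimize."""
    k_i, k_p, m_0, n = params
    return sum((m - richard_abbott(t, k_i, k_p, m_0, n)) ** 2
               for t, m in zip(thetas, moments))

# Hypothetical moment-rotation points generated from the model itself.
true_params = (20000.0, 500.0, 120.0, 1.5)
thetas = [0.002, 0.005, 0.01, 0.02, 0.04]
moments = [richard_abbott(t, *true_params) for t in thetas]

print(sum_squared_error(true_params, thetas, moments))
print(sum_squared_error((15000.0, 500.0, 120.0, 1.5), thetas, moments))
```

A routine such as Levenberg-Marquardt, which the abstract names, would adjust the four parameters to drive this misfit down. That step is not implemented here.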