• Title/Summary/Keyword: Stepwise Multiple Regression model


A study on Estimation of NO2 concentration by Statistical model (통계모형을 이용한 NO2 농도 예측에 관한 연구)

  • Jang Nan-Sim
    • Journal of Environmental Science International / v.14 no.11 / pp.1049-1056 / 2005
  • The $NO_2$ concentration characteristics of Busan metropolitan city were analysed by statistical methods using hourly $NO_2$ concentration data (1998-2000) collected from the city's air quality monitoring sites. Four representative regions were selected among the air quality monitoring sites of the Ministry of Environment. Concentration data for $NO_2$ and 5 air pollutants, together with data collected at AWS, were used. Both a stepwise multiple regression model and an ARIMA model were adopted to predict $NO_2$ concentrations, and their results were compared with observed concentrations. While the ARIMA model was useful for predicting the daily variation of the concentration, it was not satisfactory for predicting either rapid variation or seasonal variation. The multiple regression model predicted $NO_2$ concentration better than the ARIMA model.

A Multivariate Analysis of Korean Professional Players Salary (한국 프로스포츠 선수들의 연봉에 대한 다변량적 분석)

  • Song, Jong-Woo
    • The Korean Journal of Applied Statistics / v.21 no.3 / pp.441-453 / 2008
  • We analyzed the salaries of Korean professional basketball and baseball players under the assumption that salary depends on personal records and contribution to the team in the previous year. We made extensive use of data visualization tools to check the relationships among the variables, to find outliers, and to carry out model diagnostics. We fitted multiple linear regression and regression tree models and used cross-validation to find an optimal model. We checked the relationships between variables carefully and chose a set of variables for the stepwise regression instead of using all variables. We found that points per game, number of assists, number of free throw successes, and career are important variables for basketball players. For baseball pitchers, career, number of strike-outs per 9 innings, ERA, and number of home runs are important variables. For baseball hitters, career, number of hits, and FA are important variables.

Identifying Factors for Corn Yield Prediction Models and Evaluating Model Selection Methods

  • Chang Jiyul;Clay David E.
    • KOREAN JOURNAL OF CROP SCIENCE / v.50 no.4 / pp.268-275 / 2005
  • Early predictions of crop yields can provide information that lets producers take advantage of opportunities in the marketplace, supports assessment of national food security, and gives early warning of food shortages. The objectives of this study were to identify the most useful parameters for estimating yields and to compare two model selection methods for finding the 'best' model developed by multiple linear regression. This research was conducted in two 65 ha corn/soybean rotation fields located in east central South Dakota. Data used to develop the models were small temporal variability information (STVI: elevation, apparent electrical conductivity $(EC_a)$, slope), large temporal variability information (LTVI: inorganic N, Olsen P, soil moisture), and remote sensing information (green, red, and NIR bands, normalized difference vegetation index (NDVI), and green normalized difference vegetation index (GDVI)). Second-order Akaike's Information Criterion (AICc) and stepwise multiple regression were used to develop the best-fitting equations in each system (information group). Models with $\Delta_i \leq 2$ were selected, giving 22 and 37 models at Moody and Brookings, respectively. The most useful variables for estimating corn yield differed between fields, although elevation and $EC_a$ were consistently the most useful variables in both fields and in most of the systems. Model selection also differed between fields, and different numbers of variables were selected in different fields. These results might be attributed to the different landscapes and management histories of the study fields. The most common variables selected by AICc and by stepwise regression were different. In validation, stepwise regression was slightly better than AICc at Moody, and AICc was slightly better than stepwise regression at Brookings. The results suggest that the AICc approach can be used to identify the most useful information and select the 'best' yield models for production fields.
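The $\Delta_i \leq 2$ rule mentioned in this abstract can be illustrated with a minimal sketch. It assumes the common least-squares form $AICc = n \ln(RSS/n) + 2k + 2k(k+1)/(n-k-1)$, where $k$ counts the regression coefficients plus the error variance; the abstract does not state which form the authors used. The function names and the synthetic data are hypothetical and are not taken from the paper.

```python
import itertools
import numpy as np

def aicc_least_squares(y, design):
    """AICc of an ordinary least-squares fit; `design` already holds the intercept column."""
    n, p = design.shape
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ beta) ** 2))
    k = p + 1  # regression coefficients plus the error variance
    aic = n * np.log(rss / n) + 2 * k
    return aic + 2 * k * (k + 1) / (n - k - 1)

def delta_aicc_table(y, X, names):
    """Score every non-empty predictor subset; return (subset, AICc, delta) rows sorted by delta."""
    n = len(y)
    rows = []
    for r in range(1, len(names) + 1):
        for subset in itertools.combinations(range(len(names)), r):
            design = np.column_stack([np.ones(n), X[:, list(subset)]])
            rows.append((tuple(names[j] for j in subset), aicc_least_squares(y, design)))
    best = min(score for _, score in rows)
    return sorted(((s, a, a - best) for s, a in rows), key=lambda row: row[2])

# Synthetic data for illustration only (not the Moody or Brookings measurements).
rng = np.random.default_rng(0)
n = 60
X = rng.normal(size=(n, 3))
y = 2.0 + 1.5 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.5, size=n)
names = ["elevation", "eca", "slope"]  # hypothetical predictor labels

table = delta_aicc_table(y, X, names)
supported = [row for row in table if row[2] <= 2.0]  # models with delta_i <= 2
for subset, score, delta in supported:
    print(subset, round(score, 2), round(delta, 2))
```

Exhaustive enumeration is only practical for a handful of candidate predictors; with many variables, a restricted candidate set would be needed.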

Study on the Critical Storm Duration Decision of the Rivers Basin (중소하천유역의 임계지속시간 결정에 관한 연구)

  • Ahn, Seung-Seop;Lee, Hyeo-Jung;Jung, Do-June
    • Journal of Environmental Science International / v.16 no.11 / pp.1301-1312 / 2007
  • The objective of this study is to propose a model for forecasting the critical storm duration of storm runoff in small river basins. Critical storm duration data for 582 sub-basins, taken from disaster impact assessment reports of the National Emergency Management Agency for the period from 2004 to 2007, were collected and analyzed. The stepwise multiple regression method was used to establish critical storm duration forecasting models (linear and exponential types). The multiple regression analysis favored the linear type over the exponential type. The multiple linear regression analysis between the critical storm duration and 5 basin characteristic parameters (basin area, main stream length, average slope of main stream, shape factor, and CN) yielded a multiple correlation coefficient of more than 0.75.

Evaluation of Sigumjang Aroma by Stepwise Multiple Regression Analysis of Gas Chromatographic Profiles

  • Choi, Ung-Kyu;Kwon, O-Jun;Lee, Eun-Jeong;Son, Dong-Hwa;Cho, Young-Je;Im, Moo-Hyeog;Chung, Yung-Gun
    • Journal of Microbiology and Biotechnology / v.10 no.4 / pp.476-481 / 2000
  • Stepwise multiple regression analysis revealed a linear correlation between the sensory test of Sigumjang aroma and the logarithmically transformed gas chromatographic (GC) data. GC data provide the most objective method to evaluate Sigumjang aroma. A multiple correlation coefficient and a determination coefficient of more than 0.9 were obtained at the 9th and 13th steps, respectively. At step 31, a coefficient of determination of 0.95 was attained. The accuracy of the estimation became higher as the number of variables entered into the regression model increased. Over 90% of the Sigumjang aroma was explained by 13 compounds identified on GC. The contributing proportion of peak 26 was the highest, followed by peaks 57 (9.27%), 29 (7.51%), 54 (6.01%), 8 (5.99%), 49 (4.97%), and 13 (4.11%).


Relationship between Aiming Patterns and Scores in Archery Shooting

  • Quan, ChengHao;Lee, Sangmin
    • Korean Journal of Applied Biomechanics / v.26 no.4 / pp.353-360 / 2016
  • Objective: The aim of this study was to investigate the relationship between aiming patterns and scores in archery shooting. Method: Four (N = 4) elementary-level archers from middle school participated in this study. The aiming pattern was defined by averaged acceleration data measured by accelerometers attached to the body during the aiming phase of archery shooting. Stepwise multiple regression analysis was used to test whether a model incorporating aiming patterns from all nine accelerometers could predict the scores. To extract period of interest (POI) data from the raw data, a Dynamic Time Warping (DTW)-based extraction method was presented. Results: Regression models were constructed for all four subjects, with different significance levels and variables. The significance levels of the regression models were 0.12%, 1.61%, 0.55%, and 0.4%, respectively; the $R^2$ values of the regression models were 64.04%, 27.93%, 72.02%, and 45.62%, respectively; and the maximum significance levels of the parameters in the regression models were 1.26%, 4.58%, 5.1%, and 4.98%, respectively. Conclusion: Our results indicated that the relationship between aiming patterns and scores was described by a regression model. Analysis of the significance levels, variables, and parameters of the regression model showed that our approach, regression analysis with DTW, is an effective way to raise scores in archery shooting.

Evaluating Variable Selection Techniques for Multivariate Linear Regression (다중선형회귀모형에서의 변수선택기법 평가)

  • Ryu, Nahyeon;Kim, Hyungseok;Kang, Pilsung
    • Journal of Korean Institute of Industrial Engineers / v.42 no.5 / pp.314-326 / 2016
  • The purpose of variable selection techniques is to select a subset of relevant variables for a particular learning algorithm in order to improve both the accuracy and the efficiency of the prediction model. We conduct an empirical analysis to evaluate and compare seven well-known variable selection techniques for the multiple linear regression model, which is one of the most commonly used regression models in practice. The variable selection techniques we apply are forward selection, backward elimination, stepwise selection, genetic algorithm (GA), ridge regression, lasso (Least Absolute Shrinkage and Selection Operator), and elastic net. Based on experiments with 49 regression data sets, GA resulted in the lowest error rates, while lasso reduced the number of variables most significantly. In terms of computational efficiency, forward selection, backward elimination, and lasso require less time than the other techniques.
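As an illustration of the stepwise selection technique named in this abstract, the following is a minimal sketch that alternates a forward (entry) step and a backward (removal) step using partial F statistics. The thresholds `f_enter` and `f_remove`, the function names, and the synthetic data are assumptions made for this example; the abstract does not say which entry and removal criteria the authors applied.

```python
import numpy as np

def rss(y, X, cols):
    """Residual sum of squares of an OLS fit on an intercept plus the given columns."""
    design = np.column_stack([np.ones(len(y))] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    return float(resid @ resid)

def stepwise_select(y, X, f_enter=4.0, f_remove=3.9, max_iter=100):
    """Stepwise selection with partial F tests (threshold values are assumptions)."""
    n, p = X.shape
    selected = []
    for _ in range(max_iter):  # guard against cycling
        changed = False
        # Forward step: add the candidate with the largest partial F, if above f_enter.
        current = rss(y, X, selected)
        best_f, best_j = 0.0, None
        for j in range(p):
            if j in selected:
                continue
            new = rss(y, X, selected + [j])
            f = (current - new) / (new / (n - len(selected) - 2))
            if f > best_f:
                best_f, best_j = f, j
        if best_j is not None and best_f > f_enter:
            selected.append(best_j)
            changed = True
        # Backward step: drop the variable with the smallest partial F, if below f_remove.
        if selected:
            full = rss(y, X, selected)
            df = n - len(selected) - 1
            worst_f, worst_j = float("inf"), None
            for j in selected:
                reduced = rss(y, X, [c for c in selected if c != j])
                f = (reduced - full) / (full / df)
                if f < worst_f:
                    worst_f, worst_j = f, j
            if worst_f < f_remove:
                selected.remove(worst_j)
                changed = True
        if not changed:
            break
    return selected

# Synthetic data for illustration only: columns 0 and 2 carry the signal.
rng = np.random.default_rng(1)
X = rng.normal(size=(80, 5))
y = 1.0 + 2.0 * X[:, 0] - 1.5 * X[:, 2] + rng.normal(scale=0.5, size=80)
chosen = stepwise_select(y, X)
print(sorted(chosen))
```

Skipping the backward step gives plain forward selection; starting from the full variable set and running only the backward step gives backward elimination.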

The Longitudinal Study of Diet and Sexual Maturity as a Determinant of Obesity for Adolescents

  • Young-Ok Kim;Yoon-Sun Choi
    • Korean Journal of Community Nutrition / v.3 no.5 / pp.679-684 / 1998
  • This study was conducted to investigate the determinants of obesity during adolescence. A total of 726 adolescents living in rural areas in Korea were observed for four years from 1992 to 1996 regarding their diet, sexual maturity, blood profile, and physical growth. Stepwise multiple regression analysis was used to rank the importance of the factors influencing obesity. The average nutrient intake over the three year period was higher than the Korean Recommended Dietary Allowances. The prevalence of obesity among the subjects, based on BMI, was 9.5%. Results of the stepwise multiple regression analysis showed that blood components and sexual maturity were more significant factors in determining obesity than dietary factors. The results may suggest that, to understand obesity in children, it is necessary to develop an analytical model for children rather than using the existing analytical models developed mostly for adult obesity patients. The model should include a wide range of variables such as diet, sexual maturity, and changes in blood.


Parameter Calibration of Storage Function Model and Flood Forecasting (2) Comparative Study on the Flood Forecasting Methods (저류함수모형의 매개변수 보정과 홍수예측 (2) 홍수예측방법의 비교 연구)

  • Kim, Bum Jun;Song, Jae Hyun;Kim, Hung Soo;Hong, Il Pyo
    • KSCE Journal of Civil and Environmental Engineering Research / v.26 no.1B / pp.39-50 / 2006
  • The flood control offices of the main rivers in Korea have used a storage function model to forecast flood stage, and flood forecasting is still being actively studied. In this paper, the storage function model used in the flood control offices, regression models, and an artificial neural network model are applied to flood forecasting for the study watershed. The results obtained by each method are analyzed for a comparative study. For the storage function model, this paper uses both the representative parameters of the flood control offices and optimized parameters. Regression coefficients are obtained by regression analysis, and the neural network is trained by the backpropagation algorithm after selecting four events between 1995 and 2001. The study shows that the optimized parameters are superior to the representative parameters for flood forecasting. The results obtained by the regression methods (multiple, robust, and stepwise regression analysis) show very good forecasts. Although the artificial neural network model shows less exact results than the regression models, it can be an efficient way to produce good forecasts.

A Study on Developing the Performance Evaluation Indicators of Defense R&D Test Development Projects (국방연구개발 시험개발사업 성과평가지표 개발에 관한 연구)

  • Lee, Hyung-Jun;Kim, Woo-Je;Kim, Chan-Soo
    • IE interfaces / v.23 no.1 / pp.78-88 / 2010
  • In this paper we develop a model for the performance evaluation of defense R&D test development projects based on the analytic hierarchy process. First, evaluation indicators are collected through a survey of the related literature and a Delphi inquiry method. Second, stepwise multiple linear regression is used to develop a hierarchical structure for the analytic hierarchy process in the evaluation model, which makes the selected evaluation indicators of the hierarchical structure independent. We also verify the effectiveness of the proposed performance evaluation indicators by comparing them with the existing evaluation indicators. The developed performance evaluation indicators are more reasonable and practical than the previous indicators for defense R&D test development projects.