• Title/Summary/Keyword: regression analysis.


Application of Regression Analysis Model to TOC Concentration Estimation - Osu Stream Watershed - (회귀분석에 의한 TOC 농도 추정 - 오수천 유역을 대상으로 -)

  • Park, Jinhwan;Moon, Myungjin;Han, Sungwook;Lee, Hyungjin;Jung, Soojung;Hwang, Kyungsup;Kim, Kapsoon
    • Journal of Environmental Impact Assessment / v.23 no.3 / pp.187-196 / 2014
  • The objective of this study is to evaluate and analyze the water environment system of the Osu stream watershed. Data collected from January 2009 to December 2011 comprised water temperature, pH, DO, EC, BOD, COD, TOC, SS, T-N, T-P and discharge, and were used for principal component analysis and factor analysis. The primary factors obtained from both the principal component analysis and the factor analysis were BOD, COD, TOC, SS and T-P. The results of these analyses were then applied to both a simple regression model and a multiple regression model. Two regression models were developed: case 1 using concentrations of water quality parameters and case 2 using delivery loads. The coefficient of determination for case 1 fell between 0.629 and 0.866, lower than that for case 2, which fell between 0.946 and 0.998; the case 2 model is therefore the more reliable choice. The coefficient of determination between the values estimated by applying the regression model to 2012 data and the actual measurements was over 0.6 overall, indicating a high correlation between the two. The same model can be applied to estimate TOC concentrations in the future.
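
The comparison of case 1 (concentrations) and case 2 (delivery loads) rests on the coefficient of determination. The following is a minimal sketch of that calculation, not the authors' procedure: the arrays are synthetic placeholders rather than the Osu stream measurements, the choice of COD as the single predictor is hypothetical, and load is assumed to be concentration multiplied by discharge.

```python
import numpy as np

def fit_simple_regression(x, y):
    """Fit y = a + b*x by least squares and return (a, b)."""
    b, a = np.polyfit(x, y, deg=1)  # polyfit returns the highest degree first
    return a, b

def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic placeholder values, NOT data from the study.
cod = np.array([3.1, 4.0, 5.2, 6.1, 7.3, 8.0])        # hypothetical COD, mg/L
toc = np.array([2.0, 2.7, 3.1, 4.0, 4.4, 5.1])        # hypothetical TOC, mg/L
discharge = np.array([1.2, 0.8, 2.5, 3.0, 1.9, 4.2])  # hypothetical flow, m^3/s

# Case 1: regress TOC concentration on COD concentration.
a1, b1 = fit_simple_regression(cod, toc)
r2_conc = r_squared(toc, a1 + b1 * cod)

# Case 2: regress TOC load on COD load (assumed load = concentration * discharge).
cod_load, toc_load = cod * discharge, toc * discharge
a2, b2 = fit_simple_regression(cod_load, toc_load)
r2_load = r_squared(toc_load, a2 + b2 * cod_load)

print(f"R^2 (concentration): {r2_conc:.3f}")
print(f"R^2 (load):          {r2_load:.3f}")
```

Because both load series are scaled by the same discharge values, the two R² figures are not measured on the same scale, which is worth keeping in mind when comparing them.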

Development of MS Excel Macros to estimate regression models and test hypotheses of relationships between variables (Application to regression analysis of subway electric charges data) (MS Excel 함수들을 이용한 회귀 분석 모형 추정 및 관계 분석 검정을 위한 매크로 개발 (지하철 전기요금 자료 회귀분석에 응용))

  • Kim, Sook-Young
    • Journal of the Korea Computer Industry Society / v.10 no.5 / pp.213-220 / 2009
  • Regression analysis, which estimates fitted models and tests hypotheses, is a basic statistical tool for survey data as well as experimental data. Data are collected as pairs of independent and dependent variables, and statistics are computed using matrix calculation. Estimating the best-fitting model is key to maximizing the reliability of regression analysis. To fit a regression model, the data are plotted on XY axes and the best-fitting model is selected. Researchers can estimate the best model and test hypotheses with MS Excel's graph menu and matrix computation functions. In this study, I develop macros to estimate the fitted regression model and test hypotheses about the relationships between variables. Subway electric charges data with one dependent variable and three independent variables are tested using the developed macros, and the results are compared with those from Excel's built-in regression analysis.
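
The matrix calculation mentioned in the abstract can be illustrated outside Excel. The sketch below is an assumption-laden stand-in, not the paper's macros: it uses NumPy, synthetic data in place of the subway electric charges data, and a hypothetical helper name `ols_matrix`. It estimates the coefficients from the normal equations, derives t statistics for testing each coefficient against zero, and cross-checks the estimates against a built-in solver, mirroring the comparison the abstract describes.

```python
import numpy as np

def ols_matrix(X, y):
    """Estimate y = X1 @ beta + e via the normal equations.

    Returns the coefficients (intercept first), their standard errors,
    and the t statistics for H0: beta_j = 0.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = X.shape[0]
    X1 = np.column_stack([np.ones(n), X])   # prepend the intercept column
    xtx_inv = np.linalg.inv(X1.T @ X1)      # (X'X)^-1
    beta = xtx_inv @ X1.T @ y               # (X'X)^-1 X'y
    resid = y - X1 @ beta
    dof = n - X1.shape[1]                   # residual degrees of freedom
    sigma2 = resid @ resid / dof            # unbiased error variance estimate
    se = np.sqrt(sigma2 * np.diag(xtx_inv))
    return beta, se, beta / se

# Synthetic data: one dependent variable, three independent variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))
y = 2.0 + X @ np.array([1.5, -0.7, 0.3]) + rng.normal(scale=0.1, size=30)

beta, se, t_stats = ols_matrix(X, y)

# Cross-check against NumPy's built-in least-squares solver.
X1 = np.column_stack([np.ones(30), X])
beta_ref, *_ = np.linalg.lstsq(X1, y, rcond=None)
print(np.allclose(beta, beta_ref))
```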


Comparison of Linear and Nonlinear Regressions and Elements Analysis for Wind Speed Prediction (풍속 예측을 위한 선형회귀분석과 비선형회귀분석 기법의 비교 및 인자분석)

  • Kim, Dongyeon;Seo, Kisung
    • Journal of the Korean Institute of Intelligent Systems / v.25 no.5 / pp.477-482 / 2015
  • Linear regression and evolutionary nonlinear regression based compensation techniques for the short-range prediction of wind speed are investigated. An efficient MOS (Model Output Statistics) is necessary to correct systematic errors of the model, but a linear regression based MOS has difficulty handling the irregular nature of weather prediction. To solve this problem, a nonlinear symbolic regression method using GP (Genetic Programming) is proposed for developing a MOS for wind speed prediction. The proposed method is compared with various linear regression methods for wind speed prediction. In addition, a statistical analysis of the distribution of UM elements for each method is carried out. Experiments are performed on KLAPS (Korea Local Analysis and Prediction System) re-analysis data from 2007 to 2013 for the Jeju Island and Busan areas in South Korea.

Regionalized Regression Model for Monthly Streamflow in Korean Watersheds (韓國河川의 月 流出量 推定을 위한 地域化 回歸模型)

  • Kim, Tai-Cheol;Park, Sung-Woo
    • Magazine of the Korean Society of Agricultural Engineers / v.26 no.2 / pp.106-124 / 1984
  • Monthly streamflow of watersheds is one of the most important elements for the planning, design, and management of water resources development projects, e.g., determining the storage requirement of reservoirs and controlling release water in low-flow rivers. Modeling of long-term runoff is theoretically based on water-balance analysis for a certain time interval. The effect of the causal factors of rainfall, evaporation, and soil-moisture storage on streamflow can be explained by multiple regression analysis. Using the basic concepts of water balance and regression analysis, it was possible to develop a generalized model called the Regionalized Regression Model for Monthly Streamflow in Korean Watersheds. Based on model verification, the model can be reliably applied to any proposed station in Korean watersheds to estimate monthly streamflow for the planning, design, and management of water resources development projects, especially those involving irrigation. The modeling processes and properties are summarized as follows: 1. From a simplified water-balance equation for a watershed, a regression model for monthly streamflow was formulated using the variables of rainfall, pan evaporation, and previous-month streamflow. 2. The hydrologic response of a watershed was represented in a lumped, qualitative, and deductive manner using the regression coefficients of the water-balance regression model. 3. Regionalization was carried out to classify 33 watersheds on the basis of similarity through cluster analysis, resulting in 4 regional groups. 4. Prediction equations for the regional coefficients were derived from stepwise regression analysis of watershed characteristics. It was also possible to explain geographic influences on streamflow through those prediction equations. 5. A model requiring only the simple input of rainfall, pan evaporation, and geographic factors was developed to estimate monthly streamflow at ungaged stations. The results of evaluating the performance of the model were generally satisfactory.
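
The water-balance regression in item 1 can be sketched as a regression with a one-month lag term. The abstract does not give the functional form, so the linear form Q[t] = b0 + b1*P[t] + b2*E[t] + b3*Q[t-1] below is an assumption, as are the function names and the synthetic, noise-free series used to check that the fit recovers known coefficients.

```python
import numpy as np

def fit_monthly_streamflow(rain, evap, flow):
    """Fit Q[t] = b0 + b1*P[t] + b2*E[t] + b3*Q[t-1] by least squares."""
    rain, evap, flow = (np.asarray(v, dtype=float) for v in (rain, evap, flow))
    # The first month is dropped because it has no previous-month flow.
    X = np.column_stack([np.ones(len(flow) - 1), rain[1:], evap[1:], flow[:-1]])
    coef, *_ = np.linalg.lstsq(X, flow[1:], rcond=None)
    return coef

def predict_next(coef, rain_t, evap_t, flow_prev):
    """One-month-ahead streamflow estimate from the fitted coefficients."""
    return coef @ np.array([1.0, rain_t, evap_t, flow_prev])

# Synthetic series generated from known (hypothetical) coefficients.
rng = np.random.default_rng(1)
months = 48
rain = rng.uniform(20, 300, size=months)   # hypothetical monthly rainfall, mm
evap = rng.uniform(30, 150, size=months)   # hypothetical pan evaporation, mm
true_coef = np.array([10.0, 0.6, -0.1, 0.3])
flow = np.empty(months)
flow[0] = 50.0
for t in range(1, months):
    flow[t] = true_coef @ np.array([1.0, rain[t], evap[t], flow[t - 1]])

coef = fit_monthly_streamflow(rain, evap, flow)
print(np.round(coef, 3))
```

For an ungaged station, the abstract's approach replaces the fitted coefficients with values predicted from watershed characteristics; that regionalization step is not shown here.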


Fused inverse regression with multi-dimensional responses

  • Cho, Youyoung;Han, Hyoseon;Yoo, Jae Keun
    • Communications for Statistical Applications and Methods / v.28 no.3 / pp.267-279 / 2021
  • Regression with multi-dimensional responses is quite common in the so-called big data era. In such regression, dimension reduction of the predictors is essential to relieve the curse of dimensionality caused by high-dimensional responses. Sufficient dimension reduction provides effective tools for this reduction, but few sufficient dimension reduction methodologies exist for multivariate regression. To fill this gap, we propose two fused slice-based inverse regression methods. The proposed approaches are robust to the number of clusters or slices and improve estimation results over existing methods by fusing many kernel matrices. Numerical studies are presented and compared with existing methods. Real data analysis confirms the practical usefulness of the proposed methods.

An Approach to Applying Multiple Linear Regression Models by Interlacing Data in Classifying Similar Software

  • Lim, Hyun-il
    • Journal of Information Processing Systems / v.18 no.2 / pp.268-281 / 2022
  • The development of information technology is bringing many changes to everyday life, and machine learning can be used as a technique to solve a wide range of real-world problems. Analysis and utilization of data are essential processes in applying machine learning to real-world problems. As a method of processing data in machine learning, we propose an approach based on applying multiple linear regression models by interlacing data to the task of classifying similar software. Linear regression is widely used in estimation problems to model the relationship between input and output data. In our approach, multiple linear regression models are generated by training on interlaced feature data. A combination of these multiple models is then used as the prediction model for classifying similar software. Experiments are performed to evaluate the proposed approach against conventional linear regression, and the experimental results show that the proposed method classifies similar software more accurately than the conventional model. We anticipate that the proposed approach can be applied to various kinds of classification problems to improve on the accuracy of conventional linear regression.

Comparing Risk-adjusted In-hospital Mortality for Craniotomies : Logistic Regression versus Multilevel Analysis (로지스틱 회귀분석과 다수준 분석을 이용한 Craniotomy 환자의 사망률 평가결과의 일치도 분석)

  • Kim, Sun-Hee;Lee, Kwang-Soo
    • The Korean Journal of Health Service Management / v.9 no.2 / pp.81-88 / 2015
  • The purpose of this study was to compare the risk-adjusted in-hospital mortality for craniotomies obtained from logistic regression and from multilevel analysis. Using patient sample data from the Health Insurance Review & Assessment Service, in-patients who underwent a craniotomy were selected as the survey target. The sample comprised a total of 2,335 patients from 90 hospitals and was analyzed with SAS 9.3. Comparing the results of the conventional logistic regression analysis and the multilevel analysis, the multilevel analysis yielded the better model. The intra-class correlation (ICC) was 18.0%, showing that risk-adjusted in-hospital mortality for craniotomies may vary across hospitals. Agreement between the two methods, measured by the kappa coefficient, was good for the risk-adjusted in-hospital mortality for craniotomies, but the factors influencing the outcome differed between the two methods.

DD-Plot for ANCOVA Models (ANCOVA 모형을 위한 DD-plot)

  • Jang, Dae-Heung
    • The Korean Journal of Applied Statistics / v.27 no.2 / pp.227-237 / 2014
  • When some predictor variables in a regression analysis are qualitative, we use a regression model with indicator variables. We use the ANCOVA (Analysis of Covariance) model to compare the response variable among groups while statistically controlling for variation in the response caused by variation in the covariate. The DD-plot can be used as a graphical exploratory data analysis tool before confirmatory data analysis. With the DD-plot, differences among groups in the regression model with indicator variables or in the ANCOVA model can be seen at a glance. Constructing a DD-plot does not require statistical model assumptions about the error terms of the regression model. Several examples show the usefulness of DD-plots as a graphical exploratory data analysis tool for regression analysis.
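
The regression model with indicator variables described in the abstract can be sketched as follows. This illustrates only the ANCOVA-style design matrix, not the DD-plot itself; the helper name `ancova_design`, the two group labels, and the noise-free synthetic data are all assumptions made for the example.

```python
import numpy as np

def ancova_design(covariate, group_labels):
    """Build the design matrix [1, x, indicators], first group as reference."""
    x = np.asarray(covariate, dtype=float)
    labels = np.asarray(group_labels)
    levels = np.unique(labels)  # sorted unique group names
    # One 0/1 indicator column per non-reference group.
    dummies = [(labels == lv).astype(float) for lv in levels[1:]]
    return np.column_stack([np.ones(len(x)), x, *dummies]), levels

# Synthetic data: both groups share slope 2, group B is shifted up by 3.
x = np.array([1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0])
groups = np.array(["A"] * 4 + ["B"] * 4)
y = 1.0 + 2.0 * x + 3.0 * (groups == "B")

X, levels = ancova_design(x, groups)
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

# coef[0]: intercept of reference group A; coef[1]: common covariate slope;
# coef[2]: covariate-adjusted difference between group B and group A.
print(np.round(coef, 3))
```

The coefficient on the indicator column is the group difference after controlling for the covariate, which is the quantity an ANCOVA comparison targets.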

FACTORS AFFECTING PATIENTS' DECISION-MAKING FOR DENTAL PROSTHETIC TREATMENT

  • Jung, Hyo-Kyung;Kim, Han-Gon
    • The Journal of Korean Academy of Prosthodontics / v.46 no.6 / pp.610-619 / 2008
  • STATEMENT OF PROBLEM: Factors affecting patients' decision-making for dental prosthetic treatment should be examined in order to understand and improve patients' oral health. PURPOSE: The main purpose of this dissertation was to investigate patients' dental prosthetic treatment and the factors affecting patients' decision-making for dental prosthesis treatment in the Deagu and Gyungbook areas. MATERIAL AND METHODS: This study was based on a preliminary survey of dental patients conducted from July 1 to August 31, 2006. A total of 700 questionnaires were distributed and 640 were collected, of which 629 were used for the statistical analysis. Descriptive and inferential statistics were used, including frequencies, cross tabulation analysis, correlation analysis, logistic regression analysis, and multiple regression analysis. In the multiple regression analysis and logistic regression analysis, twenty-two independent variables were employed to explore the factors that affect decision-making and satisfaction. RESULTS: The results of this dissertation are as follows. Logistic regression analysis showed that monthly income, age, degree of expectation, marital status, and an employer-insured policy of national insurance significantly increased the odds of deciding on dental prosthesis treatment, whereas educational attainment decreased those odds. The remaining independent variables did not have statistically significant effects on the decision-making for dental prosthesis treatment. CONCLUSION: Among the independent variables, marital status had the most significant influence on the decision-making for dental prosthesis treatment. Finally, suggestions for future study and policy implications for improving patients' satisfaction with dental prosthetic treatment were discussed.