• Title/Summary/Keyword: Variance Inflation Factors

Tests for homogeneity of proportions in clustered binomial data

  • Jeong, Kwang Mo
    • Communications for Statistical Applications and Methods / v.23 no.5 / pp.433-444 / 2016
  • When we observe binary responses in a cluster (such as rat lab-subjects), they are usually correlated with each other. In clustered binomial counts, the independence assumption is violated and we encounter extra variation. In the presence of extra variation, the ordinary statistical analyses of binomial data are inappropriate. In testing the homogeneity of proportions between several treatment groups, the classical Pearson chi-squared test has a severe flaw in the control of Type I error rates. We focus on modifying the chi-squared statistic by incorporating variance inflation factors. We suggest a method to adjust data in terms of a dispersion estimate based on a quasi-likelihood model. We explain the testing procedure via an illustrative example and compare the performance of the modified chi-squared test with competing statistics through a Monte Carlo study.
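As a rough illustration of how a variance inflation factor can enter a chi-squared test for clustered binomial data, the sketch below follows the well-known design-effect idea of dividing each group's counts by its estimated inflation factor before computing the Pearson statistic. It is not necessarily the quasi-likelihood dispersion estimate proposed in the paper; the function names and the data are hypothetical.

```python
def group_summary(clusters):
    """clusters: list of (successes, trials) pairs, one pair per cluster."""
    m = len(clusters)
    y = sum(s for s, _ in clusters)
    n = sum(t for _, t in clusters)
    p = y / n
    # Between-cluster (ratio-estimator) variance of the pooled proportion.
    ss = sum((s - t * p) ** 2 for s, t in clusters)
    var_clustered = m / (m - 1) * ss / n ** 2
    # Variance inflation factor: clustered variance over binomial variance.
    d = var_clustered / (p * (1 - p) / n)
    return y, n, d


def pearson_chi_squared(groups):
    """groups: list of (successes, trials) totals, one pair per treatment group."""
    p_pool = sum(y for y, _ in groups) / sum(n for _, n in groups)
    return sum((y - n * p_pool) ** 2 / (n * p_pool * (1 - p_pool))
               for y, n in groups)


def adjusted_chi_squared(clustered_groups):
    """Pearson statistic on counts deflated by each group's inflation factor."""
    adjusted = []
    for clusters in clustered_groups:
        y, n, d = group_summary(clusters)
        adjusted.append((y / d, n / d))
    return pearson_chi_squared(adjusted)


# Hypothetical data: two treatment groups, four clusters (litters) each.
group_a = [(1, 10), (8, 10), (2, 10), (9, 10)]
group_b = [(0, 10), (6, 10), (1, 10), (5, 10)]

naive = pearson_chi_squared([(20, 40), (12, 40)])  # ignores clustering
adjusted = adjusted_chi_squared([group_a, group_b])
print(naive, adjusted)
```

With overdispersed clusters such as these, the adjusted statistic is smaller than the naive one, which is the mechanism by which such corrections rein in inflated Type I error rates.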

A Study on the Selection of Pricing Factors for Used Bulk Carriers (중고 벌크선의 가격결정요인 선정에 관한 연구)

  • Yang, Yun-Ok
    • Journal of Navigation and Port Research / v.41 no.4 / pp.181-188 / 2017
  • In the existing ship sales market, prices are determined based on the prices of similar ship types that were recently traded. Since the 2008 financial crisis, ship prices have fluctuated, and ship price criteria have become ever more necessary for assessing the imminent value of a ship. Therefore, this research used the hedonic price model to estimate the imminent values of ships. In this study, the influence on ship prices was analyzed by the value of each characteristic, and an estimated functional formula was derived. Out of the four models suggested by the hedonic price model, an optimal model was selected using variance inflation factors and stepwise selection. For this, the influence of the determinants of ship prices was analyzed based on actually traded ships and their characteristic data. The selected model is the Log-Line model; as a result of regression analysis, eight variables, including DWT, Age, Market Value, Short-Term Charter, Long-Term Charter, Enbloc, Special Survey Due and Builder, were found to affect the ship price model. This model is expected to be useful for objective and balanced ship price evaluation.

Prediction of Food Franchise Success and Failure Based on Machine Learning (머신러닝 기반 외식업 프랜차이즈 가맹점 성패 예측)

  • Ahn, Yelyn;Ryu, Sungmin;Lee, Hyunhee;Park, Minseo
    • The Journal of the Convergence on Culture Technology / v.8 no.4 / pp.347-353 / 2022
  • In the restaurant industry, start-ups are active due to high demand from consumers and low entry barriers. However, the restaurant industry has a high closure rate, and in the case of franchises, there is a large deviation in sales within the same brand. Thus, research is needed to prevent the closure of food franchises. Therefore, this study examines the factors affecting franchise sales and uses machine learning techniques to predict the success and failure of franchises. Various factors that affect franchise sales are extracted using Point of Sale (PoS) data of food franchises and public data for Gangnam-gu, Seoul. For more valid variable selection, multicollinearity is removed using the Variance Inflation Factor (VIF). Finally, classification models are used to predict the success and failure of food franchise stores. Through this method, we propose a success and failure prediction model for food franchise stores with an accuracy of 0.92.
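The VIF-based screening mentioned above can be sketched as follows. The VIF of predictor j equals 1/(1 - R_j^2), where R_j^2 comes from regressing predictor j on the others; equivalently, it is the j-th diagonal entry of the inverse correlation matrix, which is what this sketch computes. The predictor names, the data, the threshold of 10 (a common rule of thumb) and the drop-the-worst loop are all assumptions for illustration, not the procedure reported in the paper.

```python
def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equal-length numeric columns."""
    n = len(columns[0])
    standardized = []
    for col in columns:
        mu = sum(col) / n
        sd = (sum((v - mu) ** 2 for v in col) / n) ** 0.5
        standardized.append([(v - mu) / sd for v in col])
    return [[sum(a * b for a, b in zip(zi, zj)) / n for zj in standardized]
            for zi in standardized]


def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting (intended for small matrices)."""
    k = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(k)]
           for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def vifs(columns):
    """VIF of each predictor: diagonal of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]


def drop_high_vif(named_columns, threshold=10.0):
    """Repeatedly drop the predictor with the largest VIF above the threshold."""
    kept = dict(named_columns)
    while len(kept) > 1:
        names = list(kept)
        scores = vifs([kept[name] for name in names])
        worst = max(range(len(names)), key=lambda i: scores[i])
        if scores[worst] <= threshold:
            break
        del kept[names[worst]]
    return kept


# Hypothetical store-level predictors; floor_area and seats are nearly collinear.
predictors = {
    "floor_area": [30, 45, 50, 62, 80, 95],
    "seats": [12, 18, 21, 25, 33, 38],
    "foot_traffic": [5, 9, 4, 8, 3, 7],
}
print(list(drop_high_vif(predictors)))
```

Dropping one variable at a time and recomputing matters because removing a single member of a collinear pair usually brings the other's VIF back down.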

A Study on Work-Related Musculoskeletal Disorders Related to Sonographer's (진단 초음파 검사자의 작업 관련 근골격계질환 연구)

  • An, Hyun
    • Journal of radiological science and technology / v.45 no.4 / pp.355-363 / 2022
  • This study investigated the prevalence rate of musculoskeletal disorders among sonographers in relation to general characteristic factors, living environment factors, and work environment factors. For the response questions, the guidelines for musculoskeletal burden work were used. For statistical analysis, SPSS version 26.0 was used. For the common body parts of the sonographers who responded, the prevalence was investigated by dividing respondents into a group with high pain or discomfort and a group with low pain or discomfort according to the degree to which they experienced symptoms during the past 12 months. Multiple logistic regression analysis was used to determine the variance inflation factor (VIF), odds ratio (OR) and corresponding 95% confidence interval (CI). A p-value of <0.05 was considered statistically significant. As a result, housework hours, examination history, regular physical activity, number of patient examinations per day, and sitting posture were identified as variables associated with the rate of musculoskeletal disorders. The sonographer occupational group was found to have a high prevalence rate of musculoskeletal disorders, like various other occupational groups. Based on the results of this study, it is judged that musculoskeletal disorders can be reduced by recognizing musculoskeletal disorders and improving work environment factors.

Using Ridge Regression to Improve the Accuracy and Interpretation of the Hedonic Pricing Model : Focusing on apartments in Guro-gu, Seoul (능형회귀분석을 활용한 부동산 헤도닉 가격모형의 정확성 및 해석력 향상에 관한 연구 - 서울시 구로구 아파트를 대상으로 -)

  • Koo, Bonsang;Shin, Byungjin
    • Korean Journal of Construction Engineering and Management / v.16 no.5 / pp.77-85 / 2015
  • The Hedonic Pricing model is the predominant approach used today to model the effect of relevant factors on real estate prices. These factors include intrinsic elements of a property such as floor areas, number of rooms, and parking spaces. The model also accounts for the impact of amenities or undesirable facilities on a property's value. In the latter case, Euclidean distances are typically used as the parameter to represent proximity and its impact on prices. However, in situations where multiple facilities exist, multicollinearity may exist between these parameters, which can result in multiple regression models with erroneous coefficients. This research uses Variance Inflation Factors (VIF) and Ridge Regression to identify these errors and thus create more accurate and stable models. The techniques were applied to apartments in Guro-gu, Seoul, whose prices are affected by subway stations as well as a public prison, a railway terminal and a digital complex. The VIF identified collinearity between the variables representing the terminal and the digital complex as well as the latitudinal coordinates. The ridge regression showed the need to remove two of these variables. The case study demonstrated that the application of these techniques was critical in developing accurate and robust Hedonic Pricing models.

A Study on the Factors Affecting the Arson (방화 발생에 영향을 미치는 요인에 관한 연구)

  • Kim, Young-Chul;Bak, Woo-Sung;Lee, Su-Kyung
    • Fire Science and Engineering / v.28 no.2 / pp.69-75 / 2014
  • This study derives the factors that affect the occurrence of arson from statistical data (population, economic, and social factors) by multiple regression analysis. Multiple regression analysis is applied to four functional forms: linear, semi-log, inverse log, and dual log functions. Each function is analyzed using a stepwise procedure that considers the selection and deletion of independent variables at each step. To address two problems of multiple regression analysis, multicollinearity and autocorrelation, the Variance Inflation Factor (VIF) and the Durbin-Watson coefficient, respectively, were considered. The optimal model was determined by adjusted R-squared (the adjusted coefficient of determination). The adjusted R-squared of the linear function was 0.935 (93.5%), the highest of the four functional forms, so the linear function is the optimal model in this study. The optimal model was then interpreted. The analysis identified the following factors affecting arson, in order: the incidence of crime (0.829), the general divorce rate (0.151), the financial autonomy rate (0.149), and the consumer price index (0.099).
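For readers unfamiliar with the Durbin-Watson coefficient mentioned above, here is a minimal sketch of its standard definition: the sum of squared successive differences of the residuals divided by the sum of squared residuals. Values near 2 suggest no first-order autocorrelation, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation. The residual sequences are made up for illustration and are not from the study.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for regression residuals in time order."""
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical residuals: slowly drifting errors versus sign-flipping errors.
positively_correlated = [1.0, 1.0, -1.0, -1.0]
negatively_correlated = [1.0, -1.0, 1.0, -1.0]

print(durbin_watson(positively_correlated))
print(durbin_watson(negatively_correlated))
```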

An Analysis of Factors Relating to Agricultural Machinery Farm-Work Accidents Using Logistic Regression

  • Kim, Byounggap;Yum, Sunghyun;Kim, Yu-Yong;Yun, Namkyu;Shin, Seung-Yeoub;You, Seokcheol
    • Journal of Biosystems Engineering / v.39 no.3 / pp.151-157 / 2014
  • Purpose: In order to develop strategies to prevent farm-work accidents relating to agricultural machinery, influential factors were examined in this paper. The effects of these factors were quantified using logistic regression. Methods: Based on the results of a survey on farm-work accidents conducted by the National Academy of Agricultural Science, 21 tentative independent variables were selected. To apply these variables to regression, the presence of multicollinearity was examined by comparing correlation coefficients, checking the statistical significance of the coefficients in a simple linear regression model, and calculating the variance inflation factor. A logistic regression model and a method for determining its goodness of fit were defined. Results: Among the 21 independent variables, 13 variables were not collinear with each other. The results of a logistic regression analysis using these variables showed that the model was significant and acceptable, with a deviance of 714.053. Parameter estimation results showed that four variables (age, power tiller ownership, cognizance of the government's safety policy, and consciousness of safety) were significant. The logistic regression model predicted that the former two increased accident odds by 1.027 and 8.506 times, respectively, while the latter two decreased the odds by 0.243 and 0.545 times, respectively. Conclusions: Prevention strategies against factors causing an accident, such as the age of farmers and the use of a power tiller, are necessary. In addition, more efficient training to raise farmers' consciousness about safety must be provided.
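The odds multipliers quoted above relate to logistic regression coefficients through OR = exp(beta), so beta = ln(OR). The sketch below recovers the implied coefficients from the reported odds ratios and shows how a per-unit odds ratio compounds over several units. The assumption that age is coded in years is mine; the abstract does not state the coding, and the helper names are hypothetical.

```python
import math

# Odds ratios as reported in the abstract.
odds_ratios = {
    "age": 1.027,
    "power tiller ownership": 8.506,
    "cognizance of the government's safety policy": 0.243,
    "consciousness of safety": 0.545,
}


def coefficient(odds_ratio):
    """Logistic regression coefficient implied by an odds ratio."""
    return math.log(odds_ratio)


def odds_ratio_for_change(odds_ratio_per_unit, units):
    """Odds multiplier for a change of `units` in the predictor."""
    return odds_ratio_per_unit ** units


for name, ratio in odds_ratios.items():
    print(f"{name}: beta = {coefficient(ratio):+.3f}")

# Assuming age is measured in years, the odds multiplier for a 10-year gap:
print(odds_ratio_for_change(odds_ratios["age"], 10))
```

An odds ratio below 1 corresponds to a negative coefficient, which is why the two safety-awareness variables reduce the odds of an accident.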

Development of Ridge Regression Model of Pollutant Load Using Runoff Weighted Value Based on Distributed Curve-Number (분포형 CN 기반 토지피복별 유출가중치를 이용한 오염부하량 능형회귀모형 개발)

  • Song, Chul Min;Kim, Jin Soo
    • Journal of The Korean Society of Agricultural Engineers / v.60 no.1 / pp.111-120 / 2018
  • The purpose of this study was to develop a ridge regression (RR) model to estimate BOD and TP loads using runoff weighted values. The concept of a runoff weighted value, based on the distributed curve-number (CN), was introduced to reflect the impact of land covers on runoff. The runoff depths estimated by the distributed CN were closer to the observed values than those estimated by the area-weighted mean CN. RR is a technique used when the data suffer from multicollinearity. The RR model was developed for five flow duration intervals, with the daily runoff discharge of seven land covers as independent variables and the daily pollutant load as the dependent variable. The RR model was applied to the Heuk River watershed, a subwatershed of the Han River watershed. The variance inflation factors of the RR model decreased to values below 10. The RR model showed good performance, with Nash-Sutcliffe efficiency (NSE) of 0.73 and 0.87 and Pearson correlation coefficients of 0.88 and 0.93 for BOD and TP, respectively. The results suggest that the methods used in the study can be applied to estimate the pollutant load of watersheds with different land covers using limited data.
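To illustrate why VIFs fall under ridge regression, the sketch below uses a common definition of the ridge VIF: the diagonal of (R + kI)^-1 R (R + kI)^-1, where R is the predictor correlation matrix and k is the ridge parameter. It is restricted to two standardized predictors with correlation r, for which the diagonal has a closed form. The abstract does not say which VIF definition or ridge parameter the study used, so the formula choice and the values of r and k here are assumptions.

```python
def ridge_vif_two_predictors(r, k):
    """Ridge VIF for two standardized predictors with correlation r.

    Diagonal entry of (R + kI)^-1 R (R + kI)^-1 for R = [[1, r], [r, 1]].
    At k = 0 this reduces to the ordinary VIF, 1 / (1 - r**2).
    """
    a = 1.0 + k
    return (a * a - 2.0 * a * r * r + r * r) / (a * a - r * r) ** 2


r = 0.99  # hypothetical, strongly collinear pair of predictors
for k in (0.0, 0.01, 0.05, 0.1):
    print(k, ridge_vif_two_predictors(r, k))
```

Even a small k shrinks the VIF sharply when r is close to 1, at the cost of some bias in the coefficient estimates.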

An Influential Relationship between Urban Culture and Community Spirit (도시문화와 공동체 의식의 영향 관계)

  • Kim, Dong-Yoon
    • Journal of The Korean Digital Architecture Interior Association / v.13 no.4 / pp.51-60 / 2013
  • This study aims at an essential understanding of, and a systematic approach to, urban culture. The "2012 Seoul Survey" report was used to find causality among the related variables. First, 'satisfaction of cultural condition' was operationally selected as the dependent variable for regression. To control for third factors and isolate ceteris paribus effects, correlation analysis was first carried out between the dependent variable and each of the other variables, which resulted in two groups of variables: group (1), 2 variables with very significant correlations (p-value<0.01), and group (2), the other 6 variables with significant correlations (p-value<0.05). Hierarchical regression was then applied to these 2 groups to analyse the statistical significance of the independent variables and multicollinearity (VIF; variance inflation factor). In addition to OLS, robust and bootstrapping regressions were run to confirm the validity of this model specification. Finally, a regression model with group (1) as independent variables (namely 'community spirit caring for women, the disabled, the poor and the old' and 'satisfaction of bicycle riding condition') shows that the variables have a statistically significant and substantially strong effect on 'satisfaction of cultural condition.' This finding implies the following: (1) urban festivals are currently regarded as the main part of urban culture, which results from the low level of today's culture; (2) culture is telling and hearing stories, but the influential relationship between urban culture and community spirit toward the weak is negative, which says that the cultural perception among citizens is somewhat selfish and far from an essential understanding of urban culture. Despite its limited external validity, this finding can be used as a direction for promoting culture and a basis for related policy choices in cities.

A Causality between Cultural Satisfaction and Social Trust in Cities (도시인의 문화환경 만족과 사회적 신뢰의 인과성)

  • Kim, Dong-Yoon
    • Journal of The Korean Digital Architecture Interior Association / v.12 no.4 / pp.49-57 / 2012
  • This study aims at an essential understanding of, and a systematic approach to, culture in cities. The "2011 Seoul Survey" report was used to find causality among the related variables. First, 'satisfaction of cultural condition' was operationally selected as the dependent variable for regression. To control for confounding factors and isolate ceteris paribus effects, correlation analysis was carried out between the dependent variable and each of the other variables, which resulted in two groups of variables: group (1), 6 variables with very significant correlations (p-value<0.01), and group (2), the other 6 variables with significant correlations (p-value<0.05). Hierarchical regression was then applied to these 2 groups to analyse the $R^2$ increment, the statistical significance of the independent variables, and multicollinearity (VIF; variance inflation factor). Finally, a regression model with group (1) as independent variables (namely 'social trust', 'satisfaction of walking condition', 'happiness index', 'preparation against old age', 'satisfaction of traffic condition' and 'hours for internet') shows that only the 'social trust' variable has a statistically significant and substantially strong effect on 'satisfaction of cultural condition.' This finding should be read with the following understanding: (1) urban culture has a collective attribute formed between people and society; (2) culture is to some extent telling and hearing stories, and confidence between tellers and hearers is essential to the mutual response; and (3) stimulus is received through relationships, together with sense, emotion, thinking and action. Despite its limited external validity, this finding can be used as a direction for promoting culture and a basis for related policy choices in cities.