• Title/Abstract/Keywords: Linear Regression Fit

138 search results, processing time 0.034 s

On relationship among h value, membership function, and spread in fuzzy linear regression using shape-preserving operations

  • Hong, Dug-Hun
    • Korean Institute of Intelligent Systems: Conference Proceedings
    • /
    • Proceedings of the Korean Institute of Intelligent Systems 2008 Spring Conference
    • /
    • pp.306-310
    • /
    • 2008
  • Fuzzy regression, a nonparametric method, can be quite useful in estimating the relationships among variables where the available data are very limited and imprecise. It can also serve as a sound methodology for a variety of management and engineering problems in which variables interact in an uncertain, qualitative, and fuzzy way. A close examination of the fuzzy regression algorithm reveals that the resulting possibility distribution of the fuzzy parameters, which makes this technique attractive in a fuzzy environment, depends on an h parameter value. The h value, which lies between 0 and 1, is referred to as the degree of fit of the estimated fuzzy linear model to the given data, and is subjectively selected by a decision maker (DM) as an input to the model. Selecting a proper value of h is important in fuzzy regression because it determines the range of the possibility distributions of the fuzzy parameters. In this paper, we discuss the interdependent relationship among the h value, the membership function shape, and the spreads of fuzzy parameters in fuzzy linear regression with fuzzy input-output using shape-preserving operations.

Relationship Among h Value, Membership Function, and Spread in Fuzzy Linear Regression using Shape-preserving Operations

  • Hong, Dug-Hun
    • International Journal of Fuzzy Logic and Intelligent Systems
    • /
    • Vol. 8, No. 4
    • /
    • pp.306-311
    • /
    • 2008
  • Fuzzy regression, a nonparametric method, can be quite useful in estimating the relationships among variables where the available data are very limited and imprecise. It can also serve as a sound methodology for a variety of management and engineering problems in which variables interact in an uncertain, qualitative, and fuzzy way. A close examination of the fuzzy regression algorithm reveals that the resulting possibility distribution of the fuzzy parameters, which makes this technique attractive in a fuzzy environment, depends on an h parameter value. The h value, which lies between 0 and 1, is referred to as the degree of fit of the estimated fuzzy linear model to the given data, and is subjectively selected by a decision maker (DM) as an input to the model. Selecting a proper value of h is important in fuzzy regression because it determines the range of the possibility distributions of the fuzzy parameters. In this paper, we discuss the interdependent relationship among the h value, the membership function shape, and the spreads of fuzzy parameters in fuzzy linear regression with fuzzy input-output using shape-preserving operations.
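
As a point of reference, the way h controls the spreads can be seen in the classical Tanaka-style possibilistic regression with crisp inputs and symmetric triangular fuzzy coefficients. This is a simplified setting assumed here for illustration, not the fuzzy input-output model with shape-preserving operations studied in the paper. The symbols $\alpha_j$ (centers), $c_j$ (spreads), $x_{ij}$ (crisp inputs), and $y_i$ (crisp outputs) are notation introduced for this sketch.

$$
\min_{\alpha,\; c \ge 0} \; \sum_{i=1}^{n} \sum_{j=0}^{p} c_j \lvert x_{ij} \rvert
\quad \text{subject to} \quad
\begin{cases}
\sum_{j} \alpha_j x_{ij} + (1-h) \sum_{j} c_j \lvert x_{ij} \rvert \ge y_i, \\
\sum_{j} \alpha_j x_{ij} - (1-h) \sum_{j} c_j \lvert x_{ij} \rvert \le y_i,
\end{cases}
\qquad i = 1, \dots, n.
$$

In this simplified problem the spreads enter the constraints only through the product $(1-h)\,c_j$, so substituting $c_j' = (1-h)\,c_j$ gives the same linear program for every $0 \le h < 1$ up to a positive scaling of the objective. Under these assumptions the optimal centers do not change with h, and the optimal spreads scale as $c_j^{*}(h) = c_j^{*}(0) / (1-h)$: a larger h widens the possibility distributions.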

Bayesian Curve-Fitting in Semiparametric Small Area Models with Measurement Errors

  • Hwang, Jinseub;Kim, Dal Ho
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 22, No. 4
    • /
    • pp.349-359
    • /
    • 2015
  • We study a semiparametric Bayesian approach to small area estimation under a nested error linear regression model with an area-level covariate subject to measurement error. Radial basis functions are used for the regression spline, with knots on a grid of equally spaced sample quantiles of the covariate measured with error, within the nested error linear regression setup. We construct a hierarchical Bayesian structural measurement error model for small areas and prove the propriety of the joint posterior under the given hierarchical Bayesian framework, since some priors are defined as non-informative improper priors; the model is fit using Markov chain Monte Carlo methods. Our methodology is illustrated with numerical examples that compare candidate models on model adequacy criteria; in addition, an analysis of real data is conducted.

A Comparison Study on Compression Index of Marine Clay with High-Plasticity

  • 정길수;박병수;홍영길;유남재
    • 산업기술연구
    • /
    • Vol. 25, No. A
    • /
    • pp.57-65
    • /
    • 2005
  • In this paper, for the highly plastic soft marine clay distributed along the western and southern coasts of the Korean peninsula, in the Kwangyang and Busan New Port areas, correlations between the compression index and other indices representing geotechnical engineering properties, such as liquid limit, void ratio, and natural water content, were analyzed. Empirical equations suitable for estimating the compressibility of clays in these specific areas were proposed and compared with existing empirical equations. The analyses used data for marine clays from the South Container Port of the Busan New Port, and from the East Breakwater, Passenger Quay, Jungma Reclamation, and Reclamation Containment in the 3rd stage in Kwangyang. To find the best regression model, simple linear regression analyses were run in the commercially available software MS EXCEL 2000, using liquid limit, initial void ratio, and natural water content as independent variables, and the results were compared with the existing empirical equations. Multiple linear regression was also performed to find the best-fit regression curves for the compression index and other soil properties by combining those independent variables. In addition, SPSS was used for non-linear regression to analyze the correlations between the compression index and other soil properties.

Study on the Statistical Optimum Model of Simple Linear Regression to Estimate the Purchasing Price of Diamond

  • 이영욱
    • 정보학연구
    • /
    • Vol. 3, No. 1
    • /
    • pp.37-44
    • /
    • 2000
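  • The predicted purchase price of a diamond is influenced by six factors: carat, color, clarity, quality grade, cut, and price in $ per carat. The purpose of this study is to derive a linear regression model for predicting this purchase price and to verify it by statistical methods. The optimal model is the simple regression model $\hat{y} = 10^{2} / (-1.5575 + 0.3099 \log x) + \varepsilon$, which is statistically adequate according to the lack-of-fit test and has the properties of normality, homoscedasticity, and symmetry.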

An Optimal Model Prediction for Fruits Diseases with Weather Conditions

  • Ragu, Vasanth;Lee, Myeongbae;Sivamani, Saraswathi;Cho, Yongyun;Park, Jangwoo;Cho, Kyungryong;Cho, Sungeon;Hong, Kijeong;Oh, Soo Lyul;Shin, Changsun
    • Smart Media Journal
    • /
    • Vol. 8, No. 1
    • /
    • pp.82-91
    • /
    • 2019
  • This study analyzes and predicts fruit diseases in relation to weather conditions (temperature, wind speed, solar power, rainfall, and humidity) using a Linear Model and Poisson Regression. The main goal of the research is to control fruit diseases and to prevent them while using fewer agricultural pesticides, which requires predicting fruit diseases from weather data. Initially, fruit data are used to detect fruit diseases. If diseases are found, we move to the next process and verify the condition of the fruits, including their size. We identify the growth of the fruit and evidence of diseases with the Linear Model. Poisson Regression is then used to fit a model of fruit diseases, with weather conditions as input and predicted diseases as output. Finally, the residuals plot, Q-Q plot, and other plots help validate the fit of the Linear Model and show the correlation between the actual and predicted diseases obtained in the experiment.

Semiparametric and Nonparametric Modeling for Matched Studies

  • Kim, In-Young;Cohen, Noah
    • Korean Statistical Society: Conference Proceedings
    • /
    • Proceedings of the Korean Statistical Society 2003 Autumn Conference
    • /
    • pp.179-182
    • /
    • 2003
  • This study describes a new graphical method for assessing and characterizing effect modification by a matching covariate in matched case-control studies. This method to understand effect modification is based on a semiparametric model using a varying coefficient model. The method allows for nonparametric relationships between effect modification and other covariates, or can be useful in suggesting parametric models. This method can be applied to examining effect modification by any ordered categorical or continuous covariates for which cases have been matched with controls. The method applies to effect modification when causality might be reasonably assumed. An example from veterinary medicine is used to demonstrate our approach. The simulation results show that this method, when based on linear, quadratic and nonparametric effect modification, can be more powerful than both a parametric multiplicative model fit and a fully nonparametric generalized additive model fit.

A Multivariate Analysis of Korean Professional Players Salary

  • 송종우
    • The Korean Journal of Applied Statistics
    • /
    • Vol. 21, No. 3
    • /
    • pp.441-453
    • /
    • 2008
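  • Assuming that professional athletes' salaries are determined by individual performance and contribution to the team, we predicted the following year's salaries of professional basketball and baseball players from their previous-year performance. Data visualization techniques were used to examine relationships between variables, detect outliers, and diagnose models. We analyzed the data with multiple linear regression and regression tree models, compared the models, and selected the best model using cross-validation. In particular, rather than applying automatic variable selection by stepwise regression directly, we first examined the relationships among the explanatory variables and between the explanatory variables and the response variable, and then applied stepwise regression and regression trees to the variables chosen in that way to select appropriate variables and the final model. The results show that, for professional basketball, points per game, assists, free throws made, and experience were important variables; for professional baseball pitchers, experience, strikeouts per nine innings, earned run average, and home runs allowed were important; and for professional baseball batters, experience, number of hits, and free agent (FA) status were important.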
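
The model comparison described in this abstract (a linear regression versus a regression tree, chosen by cross-validation) can be sketched roughly as follows. This is a minimal illustration under strong assumptions, not the paper's analysis: it uses a single predictor, a depth-1 tree (a "stump"), interleaved folds, and synthetic data, and the names `fit_linear`, `fit_stump`, and `cv_mse` are hypothetical helpers introduced here.

```python
def fit_linear(xs, ys):
    """Least-squares line y = a + b*x; returns a predictor function."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    a = my - b * mx
    return lambda x: a + b * x


def fit_stump(xs, ys):
    """Depth-1 regression tree: the single split with the lowest squared error."""
    pairs = sorted(zip(xs, ys))
    mean_all = sum(ys) / len(ys)
    best = (float("inf"), None, mean_all, mean_all)
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate equal x values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if sse < best[0]:
            best = (sse, (pairs[i - 1][0] + pairs[i][0]) / 2, ml, mr)
    _, thr, ml, mr = best
    if thr is None:  # all x equal: fall back to the overall mean
        return lambda x: mean_all
    return lambda x: ml if x <= thr else mr


def cv_mse(fit, xs, ys, k=5):
    """Mean squared prediction error over k interleaved folds."""
    total = 0.0
    for fold in range(k):
        train = [i for i in range(len(xs)) if i % k != fold]
        test = [i for i in range(len(xs)) if i % k == fold]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        total += sum((ys[i] - model(xs[i])) ** 2 for i in test)
    return total / len(xs)


# Synthetic, exactly linear data (an assumption for illustration only).
xs = [float(x) for x in range(20)]
ys = [2.0 * x + 1.0 for x in xs]
scores = {"linear": cv_mse(fit_linear, xs, ys), "stump": cv_mse(fit_stump, xs, ys)}
best_model = min(scores, key=scores.get)  # model with the lowest CV error
```

On real salary data the folds would normally be shuffled and both models would use several predictors; the point of the sketch is only that the model with the lower cross-validated error is the one selected.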

A Study on Stochastic Estimation of Monthly Runoff by Multiple Regression Analysis

  • 김태철;정하우
    • 한국농공학회지
    • /
    • Vol. 22, No. 3
    • /
    • pp.75-87
    • /
    • 1980
  • Most hydrologic phenomena are the complex and organic products of multiple causes, such as climatic and hydro-geological factors. A significant correlation with run-off in a river basin can be expected in advance, and the effect of each of these causal and associated factors (independent variables: present-month rainfall, previous-month run-off, evapotranspiration, relative humidity, etc.) on present-month run-off (the dependent variable) may be determined by multiple regression analysis. Functions between the independent and dependent variables should be fitted repeatedly until a satisfactory and optimal combination of independent variables is obtained. The reliability of the estimated function should be tested against statistical criteria such as analysis of variance, the coefficient of determination, and significance tests of the regression coefficients before the first estimated multiple regression model in historical sequence is determined. Some error between observed and estimated run-off nevertheless remains. The error arises because the model used is an inadequate description of the system and because the data constituting the record represent only a sample from a population of monthly discharge observations, so that estimates of the model parameters are subject to sampling errors. Since this error, which is a deviation from the multiple regression plane, cannot be explained by the first estimated multiple regression equation, it can be considered a random error governed by the law of chance in nature. The variance left unexplained by the multiple regression equation can be handled by a stochastic approach: the random error can be stochastically simulated by multiplying a random normal variate by the standard error of estimate. Finally, a hybrid model for estimating monthly run-off in nonhistorical sequence can be determined by combining the deterministic component of the multiple regression equation and the stochastic component of random errors.
Monthly run-off at Naju station in the Yong-San river basin is estimated by the multiple regression model and the hybrid model. Comparisons are made between observed and estimated run-off, and between the multiple regression model and existing estimation methods such as the Gajiyama formula, the tank model, and the Thomas-Fiering model. The results are as follows. (1) The optimal function for estimating monthly run-off in historical sequence is the multiple linear regression equation in overall-month unit, that is, $Q_n = 0.788P_n + 0.130Q_{n-1} - 0.273E_n - 0.10$. About 85% of the total variance of monthly runoff can be explained by the multiple linear regression equation, and its coefficient of determination ($R^2$) is 0.843. This means that monthly runoff in historical sequence can be estimated with high significance from a short observation record using the above equation. (2) The optimal function for estimating monthly runoff in nonhistorical sequence is the hybrid model combining the multiple linear regression equation in overall-month unit with a stochastic component, that is, $Q_n = 0.788P_n + 0.130Q_{n-1} - 0.273E_n - 0.10 + S_y \cdot t$. The remaining 15% of unexplained variance of monthly runoff can be accounted for by adding the stochastic process, which yields somewhat more reliable statistical characteristics of monthly runoff in non-historical sequence. The estimated monthly runoff in non-historical sequence exhibits, through its random component, extreme values (maximum and minimum) that do not appear in the observed runoff. (3) The "frequency best fit coefficient" ($R^2_f$) of the multiple linear regression equation is 0.847, the same value as that of the Gajiyama formula. This implies that the multiple linear regression equation and the Gajiyama formula are theoretically rather reasonable functions.
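
The hybrid model in this abstract, the regression equation $Q_n = 0.788P_n + 0.130Q_{n-1} - 0.273E_n - 0.10$ plus a random term equal to the standard error of estimate $S_y$ times a standard normal variate $t$, could be simulated along the following lines. This is a sketch under assumptions: the abstract gives neither units nor a value for $S_y$, so the inputs in the usage example are made-up numbers, and the function names are hypothetical.

```python
import random

# Coefficients quoted in the abstract for the overall-month regression.
A_RAIN, A_PREV, A_EVAP, INTERCEPT = 0.788, 0.130, -0.273, -0.10


def deterministic_runoff(p_n, q_prev, e_n):
    """Regression part: Q_n = 0.788 P_n + 0.130 Q_{n-1} - 0.273 E_n - 0.10."""
    return A_RAIN * p_n + A_PREV * q_prev + A_EVAP * e_n + INTERCEPT


def hybrid_runoff(p_n, q_prev, e_n, s_y, rng):
    """Hybrid model: regression part plus S_y * t, with t drawn from N(0, 1)."""
    t = rng.gauss(0.0, 1.0)
    return deterministic_runoff(p_n, q_prev, e_n) + s_y * t


def simulate(rain, evap, q0, s_y, seed=0):
    """Generate a run-off series, feeding each month's value into the next."""
    rng = random.Random(seed)  # seeded so a run can be reproduced
    q_prev, series = q0, []
    for p_n, e_n in zip(rain, evap):
        q_n = hybrid_runoff(p_n, q_prev, e_n, s_y, rng)
        series.append(q_n)  # note: nothing here prevents negative values
        q_prev = q_n
    return series


# Made-up inputs (assumed units and magnitudes), for illustration only.
example = simulate(rain=[100.0, 40.0, 10.0], evap=[20.0, 30.0, 35.0], q0=50.0, s_y=5.0)
```

Setting `s_y=0.0` recovers the purely deterministic regression, which is a convenient way to compare the historical-sequence and nonhistorical-sequence variants described above.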
