• Title/Summary/Keyword: Linear regression fit


On relationship among h value, membership function, and spread in fuzzy linear regression using shape-preserving operations

  • Hong, Dug-Hun
    • Proceedings of the Korean Institute of Intelligent Systems Conference
    • /
    • 2008.04a
    • /
    • pp.306-310
    • /
    • 2008
  • Fuzzy regression, a nonparametric method, can be quite useful in estimating the relationships among variables where the available data are very limited and imprecise. It can also serve as a sound methodology for a variety of management and engineering problems where variables interact in an uncertain, qualitative, and fuzzy way. A close examination of the fuzzy regression algorithm reveals that the resulting possibility distribution of fuzzy parameters, which makes this technique attractive in a fuzzy environment, depends on the value of an h parameter. The h value, which lies between 0 and 1, is referred to as the degree of fit of the estimated fuzzy linear model to the given data, and is subjectively selected by a decision maker (DM) as an input to the model. Selecting a proper value of h is important in fuzzy regression because it determines the range of the possibility distributions of the fuzzy parameters. In this paper, we discuss the interdependent relationship among the h value, membership function shape, and the spreads of fuzzy parameters in fuzzy linear regression with fuzzy input-output using shape-preserving operations.


Relationship Among h Value, Membership Function, and Spread in Fuzzy Linear Regression using Shape-preserving Operations

  • Hong, Dug-Hun
    • International Journal of Fuzzy Logic and Intelligent Systems
    • /
    • v.8 no.4
    • /
    • pp.306-311
    • /
    • 2008
  • Fuzzy regression, a nonparametric method, can be quite useful in estimating the relationships among variables where the available data are very limited and imprecise. It can also serve as a sound methodology for a variety of management and engineering problems where variables interact in an uncertain, qualitative, and fuzzy way. A close examination of the fuzzy regression algorithm reveals that the resulting possibility distribution of fuzzy parameters, which makes this technique attractive in a fuzzy environment, depends on the value of an h parameter. The h value, which lies between 0 and 1, is referred to as the degree of fit of the estimated fuzzy linear model to the given data, and is subjectively selected by a decision maker (DM) as an input to the model. Selecting a proper value of h is important in fuzzy regression because it determines the range of the possibility distributions of the fuzzy parameters. In this paper, we discuss the interdependent relationship among the h value, membership function shape, and the spreads of fuzzy parameters in fuzzy linear regression with fuzzy input-output using shape-preserving operations.
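
As a rough illustration of how h controls the spreads, consider the classical possibilistic regression with crisp data and symmetric triangular fuzzy coefficients. This simpler setting is an assumption made here for illustration only; the paper itself treats fuzzy input-output data and other membership shapes. In that setting the inclusion constraints depend on h only through the factor (1 - h) that multiplies the spreads, so the spreads of an optimal solution at level h are those at h = 0 divided by (1 - h). The helper name `spreads_at_level` and the sample spreads are hypothetical.

```python
def spreads_at_level(spreads_h0, h):
    """Rescale spreads fitted at h = 0 to degree of fit h (0 <= h < 1).

    Assumes crisp data and symmetric triangular fuzzy coefficients,
    which is a simplification of the setting studied in the paper.
    """
    if not 0.0 <= h < 1.0:
        raise ValueError("h must satisfy 0 <= h < 1")
    return [c / (1.0 - h) for c in spreads_h0]


# Hypothetical spreads fitted at h = 0; raising h widens them.
for h in (0.0, 0.5, 0.9):
    print(h, spreads_at_level([1.0, 0.5], h))
```

Under these assumptions a decision maker who raises h trades a tighter claimed degree of fit for wider, less informative fuzzy parameters.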

A Study on the Aperiod Bearing Only TMA (비주기 Bearing 표본입력에 대한 BOTMA 연구)

  • 이동훈
    • Journal of the Korea Institute of Military Science and Technology
    • /
    • v.4 no.1
    • /
    • pp.30-40
    • /
    • 2001
  • This paper presents the design and simulation of bearing-only target motion analysis (TMA) intended to enhance TMA capability using sonar in an underwater environment. A bearing-only TMA algorithm that takes aperiodic bearing input signals has been developed and simulated in MATLAB.


Bayesian Curve-Fitting in Semiparametric Small Area Models with Measurement Errors

  • Hwang, Jinseub;Kim, Dal Ho
    • Communications for Statistical Applications and Methods
    • /
    • v.22 no.4
    • /
    • pp.349-359
    • /
    • 2015
  • We study a semiparametric Bayesian approach to small area estimation under a nested error linear regression model with an area-level covariate subject to measurement error. The regression spline uses radial basis functions, with knots placed on a grid of equally spaced sample quantiles of the covariate measured with error. We formulate a hierarchical Bayesian structural measurement error model for small areas and, because some of the priors are noninformative improper priors, prove the propriety of the joint posterior under the given hierarchical Bayesian framework; the model is fitted using Markov chain Monte Carlo methods. The methodology is illustrated with numerical examples that compare candidate models using model adequacy criteria, and with an analysis of real data.

A Comparison Study on Compression Index of Marine Clay with High-Plasticity (고소성 해성점토지반의 압축지수에 대한 비교 연구)

  • Jung, Gil-Soo;Park, Byung-Soo;Hong, Young-Kil;Yoo, Nam-Jae
    • Journal of Industrial Technology
    • /
    • v.25 no.A
    • /
    • pp.57-65
    • /
    • 2005
  • For the highly plastic soft marine clay distributed along the west and south coasts of the Korean peninsula, in the Kwangyang and Busan New Port areas, this paper analyzes the correlation between the compression index and other indices of geotechnical properties such as liquid limit, void ratio, and natural water content. Empirical equations for estimating the compressibility of clays in these specific areas are proposed and compared with existing empirical equations. The analyses use marine clay data from the South Container Port of the Busan New Port, East Breakwater, Passenger Quay, Jungma Reclamation, and the 3rd-stage Reclamation Containment in Kwangyang. To find the best regression model, simple linear regressions were run in MS Excel 2000 with liquid limit, initial void ratio, and natural water content as independent variables, and the results were compared with the existing empirical equations. Multiple linear regression combining these independent variables was also performed to find the best-fit regression curves for the compression index. In addition, SPSS was used for nonlinear regression to analyze the correlations between the compression index and the other soil properties.


Study on the Statistical Optimum Model of Simple Linear Regression to Estimate the Purchasing Price of Diamond (다이아몬드 구매가격 예측을 위한 통계적 단순 선형회기 최적화 모형에 관한 연구)

  • 이영욱
    • The Journal of Information Technology
    • /
    • v.3 no.1
    • /
    • pp.37-44
    • /
    • 2000
  • The estimated purchase price of a diamond is affected by carat, color, clarity, certificate, and cut, with price measured in dollars per carat. The objective of this study is to obtain a linear regression model for the estimated purchase price and to test it statistically. The optimum model is the simple regression model $\hat{y} = 10^{2} / (-1.5575 + 0.3099 \log x) + \varepsilon$, which satisfies the lack-of-fit test and exhibits normality, constant variance, and symmetry.


An Optimal Model Prediction for Fruits Diseases with Weather Conditions

  • Ragu, Vasanth;Lee, Myeongbae;Sivamani, Saraswathi;Cho, Yongyun;Park, Jangwoo;Cho, Kyungryong;Cho, Sungeon;Hong, Kijeong;Oh, Soo Lyul;Shin, Changsun
    • Smart Media Journal
    • /
    • v.8 no.1
    • /
    • pp.82-91
    • /
    • 2019
  • This study analyzes and predicts fruit diseases in relation to weather conditions (temperature, wind speed, solar power, rainfall, and humidity) using a linear model and Poisson regression. The main goal of the research is to control fruit diseases and to prevent them while using fewer agricultural pesticides, which requires predicting fruit diseases from weather data. Initially, fruit data are used to detect fruit diseases. If diseases are found, the next step verifies the condition of the fruits, including their size. A linear model is used to identify fruit growth and evidence of disease. Poisson regression is then used to fit a model of fruit diseases that takes weather conditions as input and produces predicted diseases as output. Finally, the residual plot, Q-Q plot, and other plots help validate the fit of the linear model and show the correlation between the actual and predicted diseases obtained in the experiment.

Semiparametric and Nonparametric Modeling for Matched Studies

  • Kim, In-Young;Cohen, Noah
    • Proceedings of the Korean Statistical Society Conference
    • /
    • 2003.10a
    • /
    • pp.179-182
    • /
    • 2003
  • This study describes a new graphical method for assessing and characterizing effect modification by a matching covariate in matched case-control studies. This method to understand effect modification is based on a semiparametric model using a varying coefficient model. The method allows for nonparametric relationships between effect modification and other covariates, or can be useful in suggesting parametric models. This method can be applied to examining effect modification by any ordered categorical or continuous covariates for which cases have been matched with controls. The method applies to effect modification when causality might be reasonably assumed. An example from veterinary medicine is used to demonstrate our approach. The simulation results show that this method, when based on linear, quadratic and nonparametric effect modification, can be more powerful than both a parametric multiplicative model fit and a fully nonparametric generalized additive model fit.


A Multivariate Analysis of Korean Professional Players Salary (한국 프로스포츠 선수들의 연봉에 대한 다변량적 분석)

  • Song, Jong-Woo
    • The Korean Journal of Applied Statistics
    • /
    • v.21 no.3
    • /
    • pp.441-453
    • /
    • 2008
  • We analyzed the salaries of Korean professional basketball and baseball players under the assumption that salary depends on personal records and contribution to the team in the previous year. We made extensive use of data visualization tools to check the relationships among the variables, find outliers, and perform model diagnostics. We used multiple linear regression and regression trees to fit the models and cross-validation to find an optimal model. We checked the relationships between variables carefully and chose a set of variables for stepwise regression instead of using all variables. We found that points per game, number of assists, number of successful free throws, and career are important variables for basketball players. For baseball pitchers, career, number of strikeouts per 9 innings, ERA, and number of home runs are important variables. For baseball hitters, career, number of hits, and FA are important variables.

A Study on Stochastic Estimation of Monthly Runoff by Multiple Regression Analysis (다중회귀분석에 의한 하천 월 유출량의 추계학적 추정에 관한 연구)

  • 김태철;정하우
    • Magazine of the Korean Society of Agricultural Engineers
    • /
    • v.22 no.3
    • /
    • pp.75-87
    • /
    • 1980
  • Most hydrologic phenomena are complex, organic products of multiple causes such as climatic and hydro-geological factors. A significant correlation with run-off in a river basin can be expected in advance, and the effect of each causal and associated factor (independent variables: present-month rainfall, previous-month run-off, evapotranspiration, relative humidity, etc.) on present-month run-off (dependent variable) may be determined by multiple regression analysis. Functions relating the independent and dependent variables should be fitted repeatedly until a satisfactory, optimal combination of independent variables is obtained. The reliability of the estimated function should be tested with statistical criteria such as analysis of variance, the coefficient of determination, and significance tests of the regression coefficients before the first estimated multiple regression model in historical sequence is determined. Some error between observed and estimated run-off nevertheless remains. The error arises because the model is an inadequate description of the system and because the data constituting the record are only a sample from a population of monthly discharge observations, so estimates of the model parameters are subject to sampling error. Since this error, a deviation from the multiple regression plane, cannot be explained by the first estimated multiple regression equation, it can be regarded as a random error governed by the laws of chance. The variance left unexplained by the multiple regression equation can be handled by a stochastic approach: the random error can be stochastically simulated by multiplying a random normal variate by the standard error of estimate. Finally, a hybrid model for estimating monthly run-off in nonhistorical sequence can be determined by combining the deterministic component of the multiple regression equation with the stochastic component of the random errors. 
Monthly run-off at Naju station in the Yong-San river basin is estimated by the multiple regression model and the hybrid model. Observed and estimated run-off are compared, as are the multiple regression model and existing estimation methods such as the Gajiyama formula, the tank model, and the Thomas-Fiering model. The results are as follows. (1) The optimal function for estimating monthly run-off in historical sequence is the multiple linear regression equation in overall-month units, $Q_n = 0.788P_n + 0.130Q_{n-1} - 0.273E_n - 0.10$. About 85% of the total variance of monthly run-off is explained by this equation, and its coefficient of determination ($R^2$) is 0.843. This means that monthly run-off in historical sequence can be estimated with high significance from a short observation record using the above equation. (2) The optimal function for estimating monthly run-off in nonhistorical sequence is the hybrid model that combines the multiple linear regression equation in overall-month units with a stochastic component, $Q_n = 0.788P_n + 0.130Q_{n-1} - 0.273E_n - 0.10 + S_y \cdot t$. The remaining 15% of unexplained variance can be accounted for by adding the stochastic process, which yields somewhat more reliable statistical characteristics of monthly run-off in nonhistorical sequence. The estimated monthly run-off in nonhistorical sequence exhibits extreme values (maximum and minimum), arising from the random component, that do not appear in the observed run-off. (3) The "frequency best fit coefficient" ($R^2_f$) of the multiple linear regression equation is 0.847, the same value as that of the Gajiyama formula. This implies that the multiple linear regression equation and the Gajiyama formula are theoretically rather reasonable functions.
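
The hybrid model in result (2) can be sketched in a few lines of Python. The coefficients come from the abstract. The reading of the symbols is an assumption based on its variable list: P_n as present-month rainfall, Q_{n-1} as previous-month run-off, E_n as evapotranspiration, S_y as the standard error of estimate, and t as a standard normal variate. The function names, the sample inputs, the value of S_y, and the choice to feed each simulated value back in as the next Q_{n-1} are hypothetical; the abstract does not state units.

```python
import random


def regression_runoff(p_n, q_prev, e_n):
    """Deterministic part: Q_n = 0.788*P_n + 0.130*Q_{n-1} - 0.273*E_n - 0.10."""
    return 0.788 * p_n + 0.130 * q_prev - 0.273 * e_n - 0.10


def hybrid_runoff(p_n, q_prev, e_n, s_y, rng):
    """Deterministic part plus the stochastic term S_y * t, with t ~ N(0, 1)."""
    t = rng.gauss(0.0, 1.0)  # random normal variate
    return regression_runoff(p_n, q_prev, e_n) + s_y * t


def simulate(rainfall, evapotranspiration, q_start, s_y, seed=0):
    """Generate a run-off sequence, feeding each Q_n back in as Q_{n-1}.

    The feedback step is an assumption of this sketch, not a detail
    taken from the paper.
    """
    rng = random.Random(seed)
    q_prev, series = q_start, []
    for p_n, e_n in zip(rainfall, evapotranspiration):
        q_prev = hybrid_runoff(p_n, q_prev, e_n, s_y, rng)
        series.append(q_prev)
    return series


# Hypothetical inputs; values and units are placeholders, not data from the paper.
print(simulate([100.0, 60.0, 20.0], [20.0, 30.0, 25.0], q_start=50.0, s_y=5.0))
```

Setting `s_y` to zero recovers the purely deterministic regression estimate, which makes the contribution of the random component easy to isolate.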
