• Title/Summary/Keyword: Normal linear regression model


The skew-t censored regression model: parameter estimation via an EM-type algorithm

  • Lachos, Victor H.;Bazan, Jorge L.;Castro, Luis M.;Park, Jiwon
    • Communications for Statistical Applications and Methods / v.29 no.3 / pp.333-351 / 2022
  • The skew-t distribution is an attractive family of asymmetrical heavy-tailed densities that includes the normal, skew-normal and Student's-t distributions as special cases. In this work, we propose an EM-type algorithm for computing the maximum likelihood estimates for skew-t linear regression models with censored response. In contrast with previous proposals, this algorithm uses analytical expressions at the E-step, as opposed to Monte Carlo simulations. These expressions rely on formulas for the mean and variance of a truncated skew-t distribution, and can be computed using the R library MomTrunc. The standard errors, the prediction of unobserved values of the response and the log-likelihood function are obtained as a by-product. The proposed methodology is illustrated through the analysis of simulated data and a real-data application to the Letter-Name Fluency test in Peruvian students.

Performance Improvement of Classification Between Pathological and Normal Voice Using HOS Parameter (HOS 특징 벡터를 이용한 장애 음성 분류 성능의 향상)

  • Lee, Ji-Yeoun;Jeong, Sang-Bae;Choi, Hong-Shik;Hahn, Min-Soo
    • MALSORI / no.66 / pp.61-72 / 2008
  • This paper proposes a method to improve pathological and normal voice classification performance by combining multiple features such as auditory-based and higher-order features. Their performance is measured with Gaussian mixture models (GMMs) and linear discriminant analysis (LDA). The combination of multiple features proposed by the frame-based LDA method is shown to be an effective method for pathological and normal voice classification, with an 87.0% classification rate. This is a noticeable improvement of 17.72% compared to the MFCC-based GMM algorithm in terms of error reduction.


Log-density Ratio with Two Predictors in a Logistic Regression Model (로지스틱 회귀모형에서 이변량 정규분포에 근거한 로그-밀도비)

  • Kahng, Myung Wook;Yoon, Jae Eun
    • The Korean Journal of Applied Statistics / v.26 no.1 / pp.141-149 / 2013
  • We present methods for studying the log-density ratio that enables the selection of the predictors and the form to be included in the logistic regression model. Under bivariate normal distributional assumptions, we investigate the form of the log-density ratio as a function of two predictors. If two covariance matrices are equal, then the crossproduct and quadratic terms are not needed. If the variables are uncorrelated, we do not need the crossproduct terms, but we still need the linear and quadratic terms. We also explore other conditions in which the crossproduct and quadratic terms are not needed in the logistic regression model.
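The conditions stated in this abstract can be read off the general form of the log-density ratio for two bivariate normal populations. The following is a sketch only; the notation ($f_1$, $f_0$ for the two group densities, $\mu_1$, $\mu_0$ for the means, $\Sigma_1$, $\Sigma_0$ for the covariance matrices, and $c$ for a constant) is assumed here for illustration and is not taken from the paper.

```latex
\begin{aligned}
\log\frac{f_1(x)}{f_0(x)}
 &= -\tfrac12 \log\frac{|\Sigma_1|}{|\Sigma_0|}
    -\tfrac12 (x-\mu_1)^{\top}\Sigma_1^{-1}(x-\mu_1)
    +\tfrac12 (x-\mu_0)^{\top}\Sigma_0^{-1}(x-\mu_0) \\
 &= c
    + \left(\Sigma_1^{-1}\mu_1 - \Sigma_0^{-1}\mu_0\right)^{\top} x
    - \tfrac12\, x^{\top}\left(\Sigma_1^{-1} - \Sigma_0^{-1}\right) x ,
 \qquad x = (x_1, x_2)^{\top}.
\end{aligned}
```

When $\Sigma_1 = \Sigma_0$ the last term vanishes, so only linear terms in $x_1$ and $x_2$ remain. When both covariance matrices are diagonal (uncorrelated predictors), $\Sigma_1^{-1} - \Sigma_0^{-1}$ is diagonal, so the $x_1 x_2$ cross-product drops out while the $x_1^2$ and $x_2^2$ terms generally remain.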

Study on Estimating the Optimal Number-right Score in Two Equivalent Mathematics-test by Linear Score Equating (수학교과의 동형고사 문항에서 양호도 향상에 유효한 최적정답율 산정에 관한 연구)

  • 홍석강
    • The Mathematical Education / v.37 no.1 / pp.1-13 / 1998
  • In this paper, we present an efficient way to compute the optimal number-right scores in order to adjust item difficulty and improve item discrimination. To estimate the optimal number-right scores in two equivalent math tests by linear score equating, a measurement error model was applied to the true scores from a pair of equivalent math tests assumed to measure the same trait. The model specification for the true scores, represented by the bivariate model, is a simple regression model used to infer the optimal number-right scores, and we further assume that the two simple regression lines of raw scores and true scores have mutually independent error models. We computed the difference between the mean value of $\chi^{*}$ and $\mu_{\chi}$ and the difference between the mean value of $y^{*}$ and $a + b\mu_{\chi}$ by inferring the estimates from the two-error-variable regression model. Furthermore, to distinguish them from the original score points, the estimated number-right scores $y'^{*}$, taken as the estimated regression values of the true scores with the same coordinates, were shifted to center points determined by these differences, following the parallel score-shifting procedure described above. We obtained the asymptotically normal distribution shown in Figure 5, which represents the optimal distribution of the optimal number-right scores, so that we could decide the optimal proportion of number-right scores for each item. Also, under the assumption that the equivalence of the two tests is closely connected to the unidimensionality of a student's ability, we introduce a new definition of trait score to evaluate such ability on each item. This study has many limitations in obtaining the real true scores and in analyzing data from the bivariate error model. However, even with these limitations, we believe that this study indicates that optimal number-right scores can be easily estimated using this enumeration procedure.


Statistical Estimated Model of Chronological Change in Physical Growth and Development in Korean Youth(17 Years Old) - From 1983 To 1993 - (한국 청소년(만 17세) 체격의 시대적 변천에 대한 통계적 모형 추정 -1983년부터 1993년까지-)

  • 성웅현;윤석옥;윤태영;최중명;박순영
    • Korean Journal of Health Education and Promotion / v.12 no.2 / pp.36-47 / 1995
  • This study analyzed how the physiques of male and female third-year high school students developed over the eleven years from 1983 to 1993. By the physique and nutritional indices of physical growth and development, Relative Body Weight of 36.62 exceeded the standard, whereas females showed lower values than the standard. Relative Chest Girth Index was of the normal type for both males and females; comparing the records of 1983 and 1993, males increased by 0.29 on average and females by 0.55 on average. Relative Chest Girth Index of females was greater than that of females. By the results of Relative Sitting Height Index, growth of the lower body for males and females was greater than that of males. For the Vervaeck Index, males increased by 2.04 on average, whereas females increased by 1.20 on average, less than males. These phenomena provided evidence of deficient nutrition in females. Among the regression models of body height and body weight over the period, the model type that best described the chronological average changes in body height and body weight was a 3rd Order Polynomial Regression Model rather than a linear regression model. In females, the model types best suited to the chronological average changes in body height and body weight were 4th and 2nd Order Polynomial Regression Models, respectively. The 1995 predictions from the estimated polynomial regression models anticipated that the body height of male third-year high school students would continue to increase from its 1993 value of 170.87 cm to 171.79 cm, an average growth of 0.92 cm, and that of females from 158.99 cm to 160.79 cm, an average growth of 1.80 cm. In addition, the body weight of males was predicted to increase from 62.58 kg to 64.52 kg, an average growth of 1.94 kg, and that of females from 54.05 kg to 54.19 kg, an average growth of 0.14 kg. A Linear Regression Model was suitable for the regression of body weight on body height. It predicted that, for each 1 cm of growth in body height, body weight increased by 1.41 kg on average in males and by 0.86 kg in females. We therefore concluded that the body weight increase per 1 cm of body height was greater in males than in females on average.


Performance Comparison of Mahalanobis-Taguchi System and Logistic Regression : A Case Study (마할라노비스-다구치 시스템과 로지스틱 회귀의 성능비교 : 사례연구)

  • Lee, Seung-Hoon;Lim, Geun
    • Journal of Korean Institute of Industrial Engineers / v.39 no.5 / pp.393-402 / 2013
  • The Mahalanobis-Taguchi System (MTS) is a diagnostic and predictive method for multivariate data. In the MTS, the Mahalanobis space (MS) of the reference group is obtained using the standardized variables of normal data. The Mahalanobis space can be used for multi-class classification. Once this MS is established, the useful set of variables is identified using orthogonal arrays and signal-to-noise ratios to assist in model analysis or diagnosis. Several other techniques have already been used for classification, such as linear discriminant analysis, logistic regression, decision trees, and neural networks. The goal of this case study is to compare the performance of the Mahalanobis-Taguchi System and logistic regression on a data set.
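To make the idea of a Mahalanobis space built from standardized normal data concrete, here is a minimal Python sketch for two variables. It is not the authors' implementation: the data are synthetic, the function names `fit_reference` and `scaled_md` are hypothetical, and the distance is the squared Mahalanobis distance divided by the number of variables, a scaling commonly used in MTS so that the reference group averages close to 1.

```python
import random
import statistics


def fit_reference(group):
    """Mean and sample std of each variable, plus their correlation, for the reference group."""
    xs, ys = zip(*group)
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sx, sy = statistics.stdev(xs), statistics.stdev(ys)
    n = len(group)
    r = sum((x - mx) * (y - my) for x, y in group) / ((n - 1) * sx * sy)
    return mx, my, sx, sy, r


def scaled_md(point, ref):
    """Scaled squared Mahalanobis distance z' R^-1 z / k for k = 2 standardized variables."""
    mx, my, sx, sy, r = ref
    z1 = (point[0] - mx) / sx
    z2 = (point[1] - my) / sy
    # Closed-form inverse of the 2x2 correlation matrix [[1, r], [r, 1]].
    return (z1 * z1 - 2 * r * z1 * z2 + z2 * z2) / ((1 - r * r) * 2)


# Synthetic data (assumption): a "normal" reference group and a shifted "abnormal" group.
random.seed(0)
normal_group = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(200)]
abnormal_group = [(random.gauss(4, 1), random.gauss(4, 1)) for _ in range(20)]

ref = fit_reference(normal_group)
md_normal = [scaled_md(p, ref) for p in normal_group]
md_abnormal = [scaled_md(p, ref) for p in abnormal_group]
print(statistics.fmean(md_normal), statistics.fmean(md_abnormal))
```

Observations whose scaled distance is far above the reference-group average would be flagged as abnormal; the variable screening with orthogonal arrays and signal-to-noise ratios mentioned in the abstract is not covered by this sketch.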

Electricity Price Forecasting in Ontario Electricity Market Using Wavelet Transform in Artificial Neural Network Based Model

  • Aggarwal, Sanjeev Kumar;Saini, Lalit Mohan;Kumar, Ashwani
    • International Journal of Control, Automation, and Systems / v.6 no.5 / pp.639-650 / 2008
  • Electricity price forecasting has become an integral part of power system operation and control. In this paper, a wavelet transform (WT) based neural network (NN) model to forecast the price profile in a deregulated electricity market is presented. The historical price data are decomposed into wavelet-domain constitutive sub-series using WT and then combined with the other time-domain variables to form the set of input variables for the proposed forecasting model. The behavior of the wavelet-domain constitutive series is studied based on statistical analysis. It is observed that forecasting accuracy can be improved by the use of WT in a forecasting model. Multi-scale analysis from one to seven levels of decomposition has been performed, and the empirical evidence suggests that the accuracy improvement is highest at the third level of decomposition. The forecasting performance of the proposed model has been compared with (i) a heuristic technique, (ii) a simulation model used by Ontario's Independent Electricity System Operator (IESO), (iii) a Multiple Linear Regression (MLR) model, (iv) an NN model, (v) an Auto Regressive Integrated Moving Average (ARIMA) model, (vi) a Dynamic Regression (DR) model, and (vii) a Transfer Function (TF) model. Forecasting results show that the performance of the proposed WT based NN model is satisfactory and that it can be used by market participants to respond properly, as it predicts the price before the window for submission of initial bids closes.

Regional Drought Frequency Analysis with Estimated Monthly Runoff Series in the Nakdong River Basin (낙동강 유역의 유역 유출량 산정에 따른 지역별 가뭄 빈도분석)

  • 김성원
    • Magazine of the Korean Society of Agricultural Engineers / v.41 no.5 / pp.53-67 / 1999
  • In this study, regional frequency analysis is used to determine the drought frequency of each subbasin from watershed runoff calculated with the Tank Model in the Nakdong river basin. The L-moments method, which is almost unbiased and has a nearly normal distribution, is applied to estimate the parameters for the drought frequency analysis of the monthly runoff time series. The '76-77 water year was the most severe drought year in this study. To determine the drought frequency of each subbasin from that of the main basin, runoff is interpolated from the frequency-drought runoff relationship, and a linear regression analysis is performed between the drought frequency of the main basin and that of each subbasin. Using the results of the linear regression analysis, the drought runoff of each subbasin is calculated for drought frequencies of 10, 20, and 30 years in the Nakdong river basin, considering safety standards for the design of impounding facilities. The proposed methodology and procedure can be applied to water budget analysis considering safety standards for the design of impounding facilities in large-scale river basins. For this purpose, above all, it is recommended that reliable observed runoff data be expanded and used instead of runoff calculated by a conceptual rainfall-runoff model.


Development of Railway Vibration Evaluation System Using Actual Railway Vibration Database (실측 철도 진동 데이터베이스를 이용한 철도진동 평가 시스템 개발)

  • Lee, Hyunjun;Seo, Eun Seong;Hwang, Young Sup
    • KIPS Transactions on Software and Data Engineering / v.8 no.4 / pp.153-162 / 2019
  • Technology for quantitatively evaluating railway vibration has recently become necessary to prevent civil complaints about trackside structures caused by railway noise and to ensure the normal operation of ultra-precise equipment in trackside industrial complexes. The existing analytical method requires a very complicated dynamic response model, and it is difficult to secure the reliability of the result due to the inaccuracy of the demand model. Therefore, in this paper, we propose a railway vibration evaluation algorithm and system that estimates the vibration value generated by railway operation using Linear Regression and Gradient Descent, based on a database of actual railway vibration measurements that classifies the factors affecting railway vibration. The prediction results obtained by the proposed algorithm show higher efficiency and accuracy than the existing analytical methods.
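As an illustration of the technique named in this abstract, the sketch below fits a one-variable linear regression by batch gradient descent on mean squared error. It is a generic example under stated assumptions, not the paper's system: the single predictor, the synthetic data, the learning rate, and the function name `fit_linear_gd` are all hypothetical, whereas the paper works from a measured vibration database with several influencing factors.

```python
import random


def fit_linear_gd(xs, ys, lr=0.05, epochs=2000):
    """Fit y ~ w * x + b by batch gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of (1/n) * sum((w*x + b - y)^2) with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Synthetic data (assumption): one hypothetical factor x and a response generated
# from a known line plus small Gaussian noise, so the fit can be checked.
random.seed(0)
xs = [i / 10 for i in range(50)]
ys = [3.0 * x + 1.0 + random.gauss(0, 0.1) for x in xs]

w, b = fit_linear_gd(xs, ys)
print(f"slope={w:.3f}, intercept={b:.3f}")
```

The learning rate is chosen small enough for this data range to converge; with several predictors on different scales, standardizing the inputs first is usually needed for gradient descent to behave well.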

Curve Estimation among Citation and Centrality Measures in Article-level Citation Networks (문헌 단위 인용 네트워크 내 인용과 중심성 지수 간 관계 추정에 관한 연구)

  • Yu, So-Young
    • Journal of the Korean Society for information Management / v.29 no.2 / pp.193-204 / 2012
  • The characteristics of citation and centrality measures in citation networks can be identified using multiple linear regression analyses. In this study, we examine the relationships between bibliometric indices and centrality measures in an article-level co-citation network to determine whether the linear model is the best fitting model and to suggest the necessity of data transformation in the analysis. 703 highly cited articles in Physics published in 2004 were sampled, and four indicators were developed as variables in this study: citation counts, degree centrality, closeness centrality, and betweenness centrality in the co-citation network. As a result, the relationship pattern between citation counts and degree centrality in a co-citation network fits a non-linear rather than linear model. Also, the relationship between degree and closeness centrality measures, or that between degree and betweenness centrality measures, can be better explained by non-linear models than by a linear model. It may be controversial, however, to choose non-linear models as the best-fitting for the relationship between closeness and betweenness centrality measures, as this result implies that data transformation may be a necessary step for inferential statistics.
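The kind of curve-estimation comparison described in this abstract can be sketched as follows: fit a straight line and a power-law curve (a line on log-log transformed data) to the same points and compare goodness of fit on the original scale. This is only an illustration under assumptions. The data are synthetic and generated from a power law, the variable names `degree` and `citations` are placeholders, and the power-law form is one possible non-linear candidate rather than the model reported in the paper.

```python
import math
import random


def ols(xs, ys):
    """Closed-form simple least squares; returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def r_squared(ys, preds):
    """Coefficient of determination, 1 - SSE/SST, on the scale of ys."""
    my = sum(ys) / len(ys)
    sse = sum((y - p) ** 2 for y, p in zip(ys, preds))
    sst = sum((y - my) ** 2 for y in ys)
    return 1 - sse / sst


# Synthetic data (assumption): y = 2 * x^2 with multiplicative noise.
random.seed(0)
degree = [float(d) for d in range(1, 101)]
citations = [2.0 * d ** 2.0 * math.exp(random.gauss(0, 0.05)) for d in degree]

# Linear model fitted on the raw scale.
a, b = ols(degree, citations)
r2_linear = r_squared(citations, [a + b * d for d in degree])

# Power-law model fitted on the log-log scale, then evaluated on the raw scale.
log_a, log_b = ols([math.log(d) for d in degree], [math.log(c) for c in citations])
r2_power = r_squared(citations, [math.exp(log_a + log_b * math.log(d)) for d in degree])

print(f"R^2 linear={r2_linear:.3f}, R^2 power={r2_power:.3f}, exponent={log_b:.3f}")
```

Evaluating both models on the original scale keeps the two R² values comparable; R² computed on the log scale is not directly comparable with R² computed on the raw scale.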