• Title/Summary/Keyword: Regressions Model

A note on standardization in penalized regressions

  • Lee, Sangin
    • Journal of the Korean Data and Information Science Society
    • /
    • v.26 no.2
    • /
    • pp.505-516
    • /
    • 2015
  • We consider sparse high-dimensional linear regression models. Penalized regressions have been used as effective methods for variable selection and estimation in high-dimensional models. In penalized regressions, it is common practice to standardize the variables, fit the penalized model to the standardized variables, and finally transform the estimated coefficients back to the scale of the original variables. However, this procedure produces a slightly different solution from that of the corresponding original penalized problem. In this paper, we investigate issues in the standardization of variables in penalized regressions and formulate a definition of the standardized penalized estimator. In addition, we compare the original penalized estimator with the standardized penalized estimator through simulation studies and real data analysis.
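
The gap between the two estimators described in this abstract can be seen in a toy case. The sketch below is not taken from the paper: it assumes a ridge penalty, a single predictor, an unpenalized intercept, and standardization by the population standard deviation, and the function names are hypothetical.

```python
from statistics import fmean


def ridge_slope_original(x, y, lam):
    """Slope minimizing sum((y - b0 - b*x)**2) + lam * b**2 (intercept unpenalized)."""
    xbar, ybar = fmean(x), fmean(y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return sxy / (sxx + lam)


def ridge_slope_standardized(x, y, lam):
    """Standardize x, solve the same ridge problem on z, then rescale the slope."""
    n = len(x)
    xbar = fmean(x)
    # Assumption: population standard deviation (divide by n, not n - 1).
    s = (sum((xi - xbar) ** 2 for xi in x) / n) ** 0.5
    z = [(xi - xbar) / s for xi in x]
    gamma = ridge_slope_original(z, y, lam)  # slope on the standardized scale
    return gamma / s                         # back to the original scale of x


# Made-up illustrative data, not from the paper.
x = [10.0, 20.0, 30.0, 40.0, 50.0]
y = [1.0, 2.1, 2.9, 4.2, 5.0]
lam = 5.0

b_orig = ridge_slope_original(x, y, lam)
b_std = ridge_slope_standardized(x, y, lam)
print(b_orig, b_std)
```

Under these assumptions the original problem shrinks the slope through `lam`, while the standardize-and-rescale route shrinks it through `lam * s**2`, so the two coincide only when the predictor already has unit standard deviation. How large the gap is in practice depends on the penalty, the tuning parameter, and the scales of the variables.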

ALL POSSIBLE HIERARCHICAL QUADRATIC REGRESSIONS FOR RESPONSE SURFACES

  • KIM SUNG-SOO;KWON SOON-SUN;PARK SUNG-HYUN
    • Journal of the Korean Statistical Society
    • /
    • v.34 no.3
    • /
    • pp.209-218
    • /
    • 2005
  • In response surface analysis, we often proceed by supposing that, over a limited region of the factor space, a polynomial of only first or second degree might adequately approximate the true function. To find the best subset model, fitting all possible quadratic regressions for response surfaces can be very valuable for obtaining optimum solutions under reasonable experimental conditions. However, computing all possible quadratic regressions imposes a very heavy computational burden. In practice, it is sufficient to consider only hierarchical models. In this paper, we propose an algorithm to obtain all possible hierarchical quadratic regressions for fitting response surfaces.
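
To illustrate how much the hierarchy restriction in this abstract shrinks the model space, here is a brute-force enumeration. It is a sketch, not the authors' algorithm, and it assumes one common reading of "hierarchical": a squared term x_i^2 requires x_i, and a cross term x_i*x_j requires both x_i and x_j. The term encoding and function names are hypothetical.

```python
from itertools import chain, combinations


def powerset(items):
    """All subsets of items, as tuples, from the empty set upward."""
    items = list(items)
    return chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )


def hierarchical_quadratic_models(k):
    """Yield every term set for k factors that obeys the assumed hierarchy rule.

    Terms are encoded as tuples: (i,) is x_i, (i, i) is x_i^2, (i, j) is x_i*x_j.
    The intercept is implicit in every model.
    """
    factors = range(1, k + 1)
    for linear in powerset(factors):
        # Second-order terms are allowed only when their parent linear terms are in.
        squares = [(i, i) for i in linear]
        cross = list(combinations(linear, 2))
        for second in powerset(squares + cross):
            yield tuple((i,) for i in linear) + second


models = list(hierarchical_quadratic_models(2))
print(len(models), "hierarchical models out of", 2 ** 5, "possible subsets")
```

With two factors there are 5 candidate terms and 32 subsets, of which 13 satisfy the assumed rule (counting the intercept-only model); with three factors the count is 95 out of 512.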

A Study on the Relations among Stock Return, Risk, and Book-to-Market Ratio (주식수익률, 위험, 장부가치 / 시장가치 비율의 관계에 관한 연구)

  • Kam, Hyung-Kyu;Shin, Yong-Jae
    • Journal of Industrial Convergence
    • /
    • v.2 no.2
    • /
    • pp.127-147
    • /
    • 2004
  • This paper examines the time-series relations among expected return, risk, and book-to-market (B/M) at the portfolio level. The time-series analysis is a natural alternative to cross-sectional regressions. An alternative feature of the time-series regressions is that they focus on changes in expected returns, not on average returns. Using the time-series analysis, we can directly test whether the three-factor model explains time-varying expected returns better than the characteristic-based model. These results should help distinguish between the risk and mispricing stories. We find that B/M is strongly associated with changes in risk, as measured by the Fama and French (1993) three-factor model. After controlling for changes in risk, B/M contains little additional information about expected returns. The evidence suggests that the three-factor model explains time-varying expected returns better than the characteristic-based model.

Dirichlet Process Mixtures of Linear Mixed Regressions

  • Kyung, Minjung
    • Communications for Statistical Applications and Methods
    • /
    • v.22 no.6
    • /
    • pp.625-637
    • /
    • 2015
  • We develop a Bayesian clustering procedure based on a Dirichlet process prior with cluster specific random effects. Gibbs sampling of a normal mixture of linear mixed regressions with a Dirichlet process was implemented to calculate posterior probabilities when the number of clusters was unknown. Our approach (unlike its counterparts) provides simultaneous partitioning and parameter estimation with the computation of the classification probabilities. A Monte Carlo study of curve estimation results showed that the model was useful for function estimation. We find that the proposed Dirichlet process mixture model with cluster specific random effects detects clusters sensitively by combining vague edges into different clusters. Examples are given to show how these models perform on real data.

CHANGE-POINT ESTIMATION WITH SAMPLE FOURIER COEFFICIENTS

  • Kim, Jae-Hee
    • Proceedings of the Korean Statistical Society Conference
    • /
    • 2002.11a
    • /
    • pp.109-114
    • /
    • 2002
  • In this paper we propose a change-point estimator with left and right regressions using the sample Fourier coefficients on the orthonormal bases. The asymptotic properties of the proposed change-point estimator are established. The limiting distribution and the consistency of the estimator are derived.

Statistical Models of Air Temperatures in Seoul (서울시 도시기온 변화에 관한 모델 연구)

  • 김학열;김운수
    • Journal of the Korean Institute of Landscape Architecture
    • /
    • v.31 no.3
    • /
    • pp.74-82
    • /
    • 2003
  • Under the assumption that the temperature at a location is closely related to the land use characteristics around it, this study assesses the impact of urban land use patterns on air temperature. To investigate the relationship, GIS techniques and statistical analyses are used after spatially linking urban land use data for the Seoul Metropolitan Area with atmospheric data observed at Automatic Weather Stations (AWS). The research method is as follows: (1) to identify the land use factors that matter for temperature, simple linear regressions on urban land use characteristics are conducted for a specific time period (pilot study); (2) to build a final model, multiple regressions are carried out with those factors; and (3) to verify that the final model can be applied to explain temperature variations beyond that period, the model is applied to 5 different time periods: 1999 as a whole; summer 1999; 1998 as a whole; summer 1998; and August 1998. The simple linear regression models in the pilot study show that transportation facilities area and open space area strongly influence urban air temperature variations, explaining 66 and 61 percent of the variation, respectively. However, the other land use variables (residential, commercial, and mixed land use) are found to have a weak or insignificant relationship with air temperature. A multiple linear regression with the two important variables from the pilot study explains 75 percent of the variability in air temperature, with the correct signs on the regression coefficients. Thus, it is empirically shown that an increase in open space and a decrease in transportation facilities area can lead to a decrease in air temperature. When the final model is applied to the 5 different time periods, the estimated models explain 68 to 75 percent of the variation in temperature, with significant regression coefficients for all explanatory variables. This result suggests that an air temperature model for a specific time period could be a good model for other nearby time periods. The important implications of this result for lessening high air temperatures are: (1) to expand and conserve open space and (2) to control transportation-related factors such as transportation facilities area, road pavement, and traffic congestion.

An Analysis for the Structural Variation in the Unemployment Rate and the Test for the Turning Point (실업률 변동구조의 분석과 전환점 진단)

  • Kim, Tae-Ho;Hwang, Sung-Hye;Lee, Young-Hoon
    • The Korean Journal of Applied Statistics
    • /
    • v.18 no.2
    • /
    • pp.253-269
    • /
    • 2005
  • One of the basic assumptions of regression models is that the parameter vector does not vary across sample observations. If the parameter vector is not constant for all observations in the sample, the statistical model changes and the usual least squares estimators do not yield unbiased, consistent, and efficient estimates. This study investigates a regression model in which some or all parameters vary across partitions of the whole sample, so that the model permits different response coefficients during unusual time periods. Since the usual test for overall homogeneity of regressions across partitions of the sample does not explicitly identify the break points between the partitions, the test of equality between subsets of coefficients in two or more linear regressions is generalized and combined with a procedure that searches for the break point. The method is applied to assess the possibility of, and locate the turning point of, a structural change in the long-run unemployment rate in the usual static framework using the regression model. The relationships between the variables included in the model are then reexamined in a dynamic framework using vector autoregression.
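
The idea of combining a coefficient-equality test with a break-point search can be sketched as follows. This is an illustration under simplifying assumptions, not the authors' generalized procedure: it uses the classic Chow F statistic for all coefficients of a one-regressor model and a plain grid search over candidate break points. The simulated data and function names are hypothetical.

```python
import random


def rss_simple(x, y):
    """Residual sum of squares of the least-squares line y = a + b*x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    syy = sum((yi - ybar) ** 2 for yi in y)
    return syy - sxy ** 2 / sxx


def chow_f(x, y, t, k=2):
    """Chow F statistic for a break before observation t (k parameters per regime)."""
    pooled = rss_simple(x, y)
    split = rss_simple(x[:t], y[:t]) + rss_simple(x[t:], y[t:])
    return ((pooled - split) / k) / (split / (len(x) - 2 * k))


def search_break(x, y, min_seg=5):
    """Return the candidate break point with the largest Chow F statistic."""
    candidates = range(min_seg, len(x) - min_seg + 1)
    return max(candidates, key=lambda t: chow_f(x, y, t))


# Simulated series with a known break at observation 50 (illustrative only).
random.seed(0)
x = [t / 10 for t in range(100)]
y = [(1 + 2 * xi if t < 50 else 8 - xi) + random.gauss(0, 0.1)
     for t, xi in enumerate(x)]

break_at = search_break(x, y)
print(break_at, chow_f(x, y, break_at))
```

Because the break point is chosen to maximize the statistic, the maximum does not follow the ordinary F distribution that applies to a break date fixed in advance, so its critical values have to be handled separately.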

Comparison of Linear and Nonlinear Regressions and Elements Analysis for Wind Speed Prediction (풍속 예측을 위한 선형회귀분석과 비선형회귀분석 기법의 비교 및 인자분석)

  • Kim, Dongyeon;Seo, Kisung
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.25 no.5
    • /
    • pp.477-482
    • /
    • 2015
  • Linear regression and evolutionary nonlinear regression based compensation techniques for the short-range prediction of wind speed are investigated. An efficient MOS (Model Output Statistics) is necessary to correct the systematic errors of the model, but a linear regression based MOS has difficulty handling the irregular nature of weather prediction. To solve this problem, a nonlinear symbolic regression method using GP (Genetic Programming) is proposed for developing an MOS for wind speed prediction. The proposed method is compared with various linear regression methods for wind speed prediction. In addition, a statistical analysis of the distribution of UM elements is carried out for each method. Experiments are performed on KLAPS (Korea Local Analysis and Prediction System) re-analysis data from 2007 to 2013 for the Jeju Island and Busan areas in South Korea.

Development of Han River Multi-Reservoir Operation Rules by Linear Tracking (선형추적에 의한 한강수계 복합 저수지 계통의 이수 조작기준 작성)

  • Yu, Ju-Hwan
    • Journal of Korea Water Resources Association
    • /
    • v.33 no.6
    • /
    • pp.733-744
    • /
    • 2000
  • Due to the randomness of reservoir inflow and supply demand, it is not easy to establish an optimal reservoir operation rule. However, an operation rule can be derived by the implicit stochastic optimization approach using synthetic inflow data with a given demand satisfied. In this study, the optimal reservoir operation, formulated as a Linear Tracking model for maximizing the hydro-energy of the seven-reservoir system in the Han River, was computed using optimal control theory. The operation model, set up to satisfy the year-2001 demand in the capital area, took as input synthetic inflow data generated by a multi-site Markov model. Based on regressions and statistical analyses of the optimal operation results, monthly reservoir operation rules were developed with the seasonal probabilities of the reservoir stages. The comparatively larger dams with more controllability, such as Hwacheon, Soyanggang, and Chungju, showed better regressions between storages and outflows. The effectiveness of the rules was verified by simulation over the actual operating period.

AN INVESTIGATION OF THE KOREAN GENERAL INSURANCE INDUSTRY: EVIDENCE OF STRUCTURAL CHANGES AND IMPACT OF MACRO-ECONOMIC FACTORS ON LOSS RATIOS

  • Thompson, Ephraim Kwashie;Kim, So-Yeun
    • East Asian mathematical journal
    • /
    • v.38 no.5
    • /
    • pp.617-641
    • /
    • 2022
  • In this study, we first present a brief overview of the Korean general insurance market. We then explore the characteristics of the loss ratios of the Korean general insurance industry and apply Markov regime-switching methodology to model the loss ratios of these insurance companies by line of business based on changes in economic regimes. This study applies a number of confirmatory tests, such as the Zivot-Andrews test (2002), the Chow (1960) test, and the Bai and Perron (1998) test, to confirm the presence of structural breaks in the time series of the loss ratios by line of business. Then, we employ Markov regime-switching methodology to model these loss ratios. We find empirical evidence that the loss ratios reported by insurance companies in Korea are characterized by two distinct regimes, a regime with high volatility and a regime with low volatility, except for vehicle insurance. Our analyses suggest that macro-economic conditions have a significant explanatory effect on loss ratios, but the direction of the effect differs by line of business and regime. Unlike previous studies that have applied linear regressions, or divided the samples into different periods and then applied linear regressions to model loss ratios, we argue for the application of Markov regime-switching methodology, which is able to automatically distinguish the different regimes that may be associated with the movements of loss ratios under differing economic conditions and regulatory upheavals. This study provides a more in-depth understanding of loss ratios in the general insurance industry and will be of value to insurance practitioners in modelling the loss ratios associated with their businesses to aid their decision making. The results may also provide a basis for further studies in markets other than Korea, as well as for shaping policy decisions related to loss ratios.